Kastan's group workspace

Aug-05__14:16

Tags
Aug-05__14:16
BATCH_SIZE=1280
NUM_EPOCHS=60
NUM_MICRO_BATCHES=8
SLURM=513930
TP=16
WORLD_SIZE=32
Author
State
Crashed
Start time
August 5th, 2022 7:18:51 PM
Runtime
27m 37s
Tracked hours
-
Run path
kastan/LLM-Distributed-Quantization/glggu52z
OS
Linux-4.18.0-305.49.1.el8_4.x86_64-x86_64-with-glibc2.28
Python version
3.9.12
Command
/u/kastanday/LLM-Distributed-Quantization/benchmarks/gpt/v2_train.py --config /u/kastanday/LLM-Distributed-Quantization/benchmarks/gpt/configs/q_allBF16_gpt2_8b_2p5d_256.py --host gpub032 --port 29500 --world_size 32 --rank 9
System Hardware
CPU count: 64
GPU count: 4
GPU type: NVIDIA A40
W&B CLI Version
0.13.0
Config

Config parameters are your model's inputs.

  • {} 24 keys
    • 1,280
    • 1
    • "col_ai_quant"
    • "/u/kastanday/LLM-Distributed-Quantization/datasets/small-gpt-dataset.json"
    • 1
    • 0.00015
    • "./gpt2_2.5d_tp16_bs1280_lr0.00015_accum1_clip_grad1.0/"
    • {} 1 key
      • "titans.loss.lm_loss.gpt_lmloss.GPTLMLoss"
    • {} 8 keys
      • {} 4 keys
        • "torch.bfloat16"
        • "torch.bfloat16"
        • "torch.bfloat16"
        • "torch.bfloat16"
      • 60
      • "4"
      • 8
      • {} 2 keys
        • 0.00015
        • 0.01
      • {} 2 keys
        • 2
        • {} 3 keys
          • 1
          • "2.5d"
          • 16
      • "titans.model.quant_gpt.quant_gpt.quant_gpt2_8B"
      • 1,024
      • "2.5d"
      • 16
      • 1,280
      • "32"
      • 50,304
      • 21
      • 0.01
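
The tags and command above give WORLD_SIZE=32, TP=16, BATCH_SIZE=1280 and NUM_MICRO_BATCHES=8, with a "2.5d" tensor mode and 4 GPUs reported on this machine. As a rough sanity check, the sketch below derives the implied group sizes from those numbers. It is not taken from the run's code: the function name `describe_layout` is hypothetical, the pipeline size of 2 and depth of 1 are assumed from unlabeled config values, every node is assumed to have the same 4 GPUs, and the batch is assumed to split evenly into micro-batches.

```python
from math import isqrt


def describe_layout(world_size, tensor_size, pipeline_size, depth,
                    batch_size, num_micro_batches, gpus_per_node):
    """Derive parallel group sizes from a run's headline settings (illustrative only)."""
    model_parallel = tensor_size * pipeline_size
    if world_size % model_parallel != 0:
        raise ValueError("world_size must be a multiple of tensor_size * pipeline_size")
    if tensor_size % depth != 0:
        raise ValueError("tensor_size must be a multiple of depth")

    # 2.5D tensor parallelism arranges the tensor group as depth x q x q ranks.
    q = isqrt(tensor_size // depth)
    if depth * q * q != tensor_size:
        raise ValueError("tensor_size must equal depth * q**2 for an integer q")

    if batch_size % num_micro_batches != 0:
        raise ValueError("batch_size must split evenly into micro-batches")
    if world_size % gpus_per_node != 0:
        raise ValueError("world_size must be a multiple of gpus_per_node")

    return {
        "data_parallel_size": world_size // model_parallel,
        "grid_side": q,
        "micro_batch_size": batch_size // num_micro_batches,
        "num_nodes": world_size // gpus_per_node,
    }


# Values from the tags above; pipeline_size=2 and depth=1 are assumptions.
layout = describe_layout(
    world_size=32, tensor_size=16, pipeline_size=2, depth=1,
    batch_size=1280, num_micro_batches=8, gpus_per_node=4,
)
print(layout)
```

Under these assumptions the 32 ranks are fully consumed by model parallelism (16 tensor ranks times 2 pipeline stages), which leaves a single data-parallel replica spread over 8 nodes.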
    Summary

    Summary metrics are your model's outputs.

    No summary metrics saved for this run.

    Check the summary metrics documentation for more information.

    Artifact Outputs

    This run produced these artifacts as outputs. Total: 1.