Kastan's group workspace
Aug-05__14:10
Tags
gpt2_8b_2p5d_256
Notes
Tags
Aug-05__14:10
BATCH_SIZE1280
NUM_EPOCHS=60
NUM_MICRO_BATCHES=8
SLURM=513927
TP=16
WORLD_SIZE=32
Author
State
Crashed
Start time
August 5th, 2022 7:11:48 PM
Runtime
29m
Tracked hours
-
Run path
kastan/LLM-Distributed-Quantization/3ho4lznl
OS
Linux-4.18.0-305.49.1.el8_4.x86_64-x86_64-with-glibc2.28
Python version
3.9.12
Command
/u/kastanday/LLM-Distributed-Quantization/benchmarks/gpt/v2_train.py --config /u/kastanday/LLM-Distributed-Quantization/benchmarks/gpt/configs/gpt2_8b_2p5d_256.py --host gpub003 --port 29500 --world_size 32 --rank 7
System Hardware
| CPU count | 64 |
| GPU count | 4 |
| GPU type | NVIDIA A40 |
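The tags, command line, and hardware table above together imply a device layout. The following is a minimal sketch of that arithmetic, not part of the original run. It assumes that every node has the same 4 GPUs reported for this host and that ranks are assigned contiguously per node; the run page states neither. The function name `layout` and its dictionary keys are hypothetical.

```python
# Values copied from the run page; everything derived from them is illustrative.
WORLD_SIZE = 32     # --world_size 32 / tag WORLD_SIZE=32
TP = 16             # tag TP=16 (tensor-parallel size)
GPUS_PER_NODE = 4   # "GPU count" reported for this host (assumed equal on all nodes)
RANK = 7            # --rank 7


def layout(world_size, tp, gpus_per_node, rank):
    """Derive node count, rank placement, and the non-tensor parallel factor.

    Assumes homogeneous nodes and contiguous rank assignment per node.
    """
    if world_size % gpus_per_node != 0:
        raise ValueError("world size must be a multiple of GPUs per node")
    if world_size % tp != 0:
        raise ValueError("world size must be a multiple of the tensor-parallel size")
    return {
        "nodes": world_size // gpus_per_node,
        "node_index": rank // gpus_per_node,
        "local_rank": rank % gpus_per_node,
        # Ceiling division: how many nodes one tensor-parallel group spans.
        "nodes_per_tp_group": -(-tp // gpus_per_node),
        # What is left for data and/or pipeline parallelism.
        "non_tensor_factor": world_size // tp,
    }


info = layout(WORLD_SIZE, TP, GPUS_PER_NODE, RANK)
for key, value in info.items():
    print(f"{key}: {value}")
```

Under these assumptions the job spans 8 nodes, rank 7 is the last GPU on the second node, each tensor-parallel group of 16 crosses 4 nodes, and a factor of 2 remains for data or pipeline parallelism.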
W&B CLI Version
0.13.0
Group
Aug-05__14:10
Config
Config parameters are your model's inputs.
- {} 25 keys▶
- 1,280
- 1
- "col_ai_quant"
- "/u/kastanday/LLM-Distributed-Quantization/datasets/small-gpt-dataset.json"
- {} 1 key▶
- "AMP_TYPE.NAIVE"
- "titans.model.gpt.gpt.gpt2_8B"
- "titans.model.gpt.gpt.gpt2_xl"
- 1
- 0.00015
- "./gpt2_2.5d_tp16_bs1280_lr0.00015_accum1_clip_grad1.0/"
- {} 1 key▶
- "titans.loss.lm_loss.gpt_lmloss.GPTLMLoss"
- {} 5 keys▶
- 60
- "4"
- 8
- {} 2 keys▶
- 0.00015
- 0.01
- {} 2 keys▶
- 2
- {} 3 keys▶
- 1
- "2.5d"
- 16
- 1,024
- "2.5d"
- 16
- 1,280
- "32"
- 50,304
- 21
- 0.01
Summary
Summary metrics are your model's outputs.
No summary metrics saved for this run.
Check the summary metrics documentation for more information.
Artifact Outputs
This run produced these artifacts as outputs. Total: 1.