Chilli's group workspace
K3ZsCJD4ynEa5rxJF75NcygEVFMJLKVRSTW4JQPsJa56
Tags
neox-visual-grounding-0-0
Notes
Author
State
Crashed
Start time
April 27th, 2021 6:09:41 PM
Runtime
8m
Tracked hours
7m 48s
Run path
eleutherai/neox/1mfa8421
OS
Linux-5.4.0-54-generic-x86_64-with-glibc2.29
Python version
3.8.5
Git repository
git clone https://github.com/EleutherAI/gpt-neox.git
Git state
git checkout -b "neox-visual-grounding-0-0" 9992042ab113428022e5e91421c04917577b8e00
Command
pretrain_gpt2.py --local_rank=0 --num_gpus 6 --deepspeed_config "{\"train_batch_size\": 96, \"train_micro_batch_size_per_gpu\": 16, \"optimizer\": {\"type\": \"Adam\", \"params\": {\"lr\": 0.00025, \"betas\": [0.9, 0.999], \"eps\": 1e-08}}, \"fp16\": {\"fp16\": true, \"enabled\": true, \"loss_scale\": 0, \"loss_scale_window\": 1000, \"hysteresis\": 2, \"min_loss_scale\": 1}, \"gradient_clipping\": 1.0, \"zero_optimization\": {\"stage\": 1, \"allgather_partitions\": true, \"allgather_bucket_size\": 500000000, \"overlap_comm\": true, \"reduce_scatter\": true, \"reduce_bucket_size\": 500000000, \"contiguous_gradients\": true, \"cpu_offload\": false}, \"wall_clock_breakdown\": true}" --megatron_config "{\"num_gpus\": 6, \"train_batch_size\": 96, \"train_micro_batch_size_per_gpu\": 16, \"optimizer\": {\"type\": \"Adam\", \"params\": {\"lr\": 0.00025, \"betas\": [0.9, 0.999], \"eps\": 1e-08}}, \"fp16\": {\"fp16\": true, \"enabled\": true, \"loss_scale\": 0, \"loss_scale_window\": 1000, \"hysteresis\": 2, \"min_loss_scale\": 1}, \"gradient_clipping\": 1.0, \"zero_optimization\": {\"stage\": 1, \"allgather_partitions\": true, \"allgather_bucket_size\": 500000000, \"overlap_comm\": true, \"reduce_scatter\": true, \"reduce_bucket_size\": 500000000, \"contiguous_gradients\": true, \"cpu_offload\": false}, \"wall_clock_breakdown\": true, \"precision\": \"fp16\", \"num_layers\": 24, \"hidden_size\": 1536, \"num_attention_heads\": 16, \"seq_length\": 2048, \"max_position_embeddings\": 2048, \"pos_emb\": \"rotary\", \"no_weight_tying\": true, \"lr_decay_style\": \"cosine\", \"lr_decay_iters\": 320000, \"optimizer_type\": \"Adam\", \"zero_stage\": 1, \"zero_reduce_scatter\": true, \"zero_contiguous_gradients\": true, \"zero_reduce_bucket_size\": 500000000, \"zero_allgather_bucket_size\": 500000000, \"lr\": 0.00025, \"data_path\": \"data/enwik8/enwik8_text_document\", \"data_impl\": \"mmap\", \"save\": \"checkpoints/\", \"load\": \"checkpoints/\", \"save_interval\": 10000, 
\"batch_size\": 16, \"train_iters\": 320000, \"eval_iters\": 10, \"keep_last_n_checkpoints\": 4, \"split\": \"900,99,1\", \"vocab_file\": \"data/gpt2-vocab.json\", \"merge_file\": \"data/gpt2-merges.txt\", \"attention_dropout\": 0, \"hidden_dropout\": 0, \"weight_decay\": 0, \"checkpoint_activations\": true, \"synchronize_each_layer\": true, \"partition_activations\": true, \"gas\": 1, \"clip_grad\": 1.0, \"dynamic_loss_scale\": true, \"pipe_parallel_size\": 1, \"world_size\": 1, \"wandb_group\": \"K3ZsCJD4ynEa5rxJF75Ncy\", \"log_dir\": \"logs/\", \"tensorboard_dir\": \"/mnt/ssd-cluster/tensorboard\", \"log_interval\": 100, \"local_rank\": 0, \"rank\": 0, \"user_script\": \"pretrain_gpt2.py\"}"
System Hardware
CPU count | 112 |
GPU count | 6 |
GPU type | A100-PCIE-40GB |
W&B CLI Version
0.10.25
Config
Config parameters are your model's inputs.
162 config parameters (key names collapsed in this export; the full configuration is reproduced verbatim in the Command section above).
Summary
Summary metrics are your model's outputs.
3 summary metrics (key names collapsed in this export):
- 0.000002578125
- 9.885313987731934
- 65,536
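The 65,536 value is consistent with an fp16 dynamic loss scale of 2^16: the command configures `"loss_scale": 0` (i.e. dynamic scaling) with `"loss_scale_window": 1000`, `"hysteresis": 2`, and `"min_loss_scale": 1`. A hedged sketch of how such a scaler typically behaves; the class is illustrative, not the actual DeepSpeed implementation, and the exact hysteresis semantics there may differ:

```python
# Illustrative dynamic fp16 loss scaler matching the run's settings:
# halve the scale after `hysteresis` consecutive overflows, double it
# after `window` consecutive overflow-free steps, never drop below min_scale.

class DynamicLossScaler:
    def __init__(self, init_scale=2**16, window=1000, hysteresis=2, min_scale=1):
        self.scale = float(init_scale)
        self.window = window
        self.hysteresis = hysteresis
        self.min_scale = min_scale
        self._overflows_left = hysteresis
        self._good_steps = 0

    def update(self, overflow):
        """Adjust the scale after one optimizer step."""
        if overflow:
            self._good_steps = 0
            self._overflows_left -= 1
            if self._overflows_left <= 0:
                self.scale = max(self.scale / 2, self.min_scale)
                self._overflows_left = self.hysteresis
        else:
            self._good_steps += 1
            if self._good_steps % self.window == 0:
                self.scale *= 2
```

For example, starting from 2^16, a window of 1,000 clean steps doubles the scale to 2^17, while two consecutive overflows halve it to 2^15.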
Artifact Outputs
This run produced these artifacts as outputs.