Levmckinney's group workspace

2023-05-05 15:19:47  Using 'GPTNeoXLayer' for transformer_auto_wrap_policy.
2023-05-05 15:20:29  Gradient accumulation steps: 32
2023-05-05 15:20:29  Using 262_144 tokens per training step.
2023-05-05 15:20:29  All processes have completed setup. Starting training.
2023-05-05 20:33:32  Training: 100%|██████████| 8000/8000 [5:13:03<00:00,  2.35s/it]
2023-05-05 20:33:32  Saving lens to /output/EleutherAI/pythia-12b-deduped-1683299166
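
The figures in this log can be cross-checked with a little arithmetic. The sketch below is not part of the training run; it only recomputes derived quantities from the logged values (8000 steps, 262_144 tokens per step, 32 gradient accumulation steps, and the start and end timestamps). The variable names are illustrative, and the per-micro-step token count assumes that each optimizer step's tokens are split evenly across the accumulation steps, which the log does not state explicitly.

```python
from datetime import datetime

# Values copied from the log; nothing here is measured independently.
steps = 8000
tokens_per_step = 262_144
grad_accum_steps = 32
start = datetime.fromisoformat("2023-05-05 15:20:29")  # "Starting training."
end = datetime.fromisoformat("2023-05-05 20:33:32")    # progress bar at 100%

# Wall-clock duration between the two timestamps, in whole seconds.
elapsed_s = int((end - start).total_seconds())

# Derived quantities.
sec_per_it = elapsed_s / steps
total_tokens = steps * tokens_per_step
tokens_per_sec = total_tokens / elapsed_s

# Assumption: tokens are divided evenly across accumulation micro-steps,
# summed over all processes.
tokens_per_micro_step = tokens_per_step // grad_accum_steps

print(f"elapsed:               {elapsed_s} s")
print(f"seconds per iteration: {sec_per_it:.2f}")
print(f"total tokens:          {total_tokens:,}")
print(f"tokens per second:     {tokens_per_sec:,.0f}")
print(f"tokens per micro-step: {tokens_per_micro_step}")
```

The elapsed time works out to 18783 s, which agrees with the 5:13:03 shown by the progress bar, and the per-iteration time rounds to the logged 2.35 s/it.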