2023-05-07 18:30:11  Using 'LlamaDecoderLayer' for transformer_auto_wrap_policy.
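
Context for the line above: it refers to PyTorch FSDP's transformer_auto_wrap_policy, which shards the model at the granularity of individual decoder layers. A minimal sketch of how such a policy is typically wired up; the model variable and the omitted FSDP options are assumptions, not taken from this run:

    import functools

    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
    from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy
    from transformers.models.llama.modeling_llama import LlamaDecoderLayer

    # Wrap each LlamaDecoderLayer in its own FSDP unit.
    auto_wrap_policy = functools.partial(
        transformer_auto_wrap_policy,
        transformer_layer_cls={LlamaDecoderLayer},
    )

    # model is assumed to be a loaded LlamaForCausalLM; sharding
    # strategy, mixed precision, etc. are omitted from this sketch.
    model = FSDP(model, auto_wrap_policy=auto_wrap_policy)
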
2023-05-07 18:30:31  Gradient accumulation steps: 32
2023-05-07 18:30:31  Using 262_144 tokens per training step.
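
The two settings above are consistent with each other: 262,144 tokens per optimizer step divided by 32 gradient accumulation steps gives 8,192 tokens per micro-batch. A quick sanity check; the 2,048-token sequence length is an assumption about this run, not something the log states:

    # Token accounting for one optimizer step (figures from the log above).
    tokens_per_step = 262_144   # logged
    grad_accum_steps = 32       # logged

    tokens_per_micro_batch = tokens_per_step // grad_accum_steps
    assert tokens_per_micro_batch == 8_192

    # Assuming a 2,048-token context window (not shown in this log),
    # each micro-batch holds 4 sequences summed across all devices.
    seq_len = 2_048
    assert tokens_per_micro_batch // seq_len == 4
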
2023-05-07 18:30:31  All processes have completed setup. Starting training.
2023-05-07 21:21:25  Training: 100%|██████████| 8000/8000 [2:50:52<00:00,  1.28s/it]
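
From the finished progress bar, the run's overall scale and throughput follow directly; the arithmetic below only restates the logged numbers:

    # Rough totals derived from the progress bar above.
    steps = 8_000
    tokens_per_step = 262_144
    seconds_per_step = 1.28  # tqdm's reported rate

    total_tokens = steps * tokens_per_step  # 2,097,152,000 (~2.1B tokens)
    tokens_per_second = tokens_per_step / seconds_per_step  # ~204,800

    print(f"{total_tokens:,} tokens at ~{tokens_per_second:,.0f} tokens/s")
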
2023-05-07 21:21:24  Saving lens to /output/huggyllama/llama-7b-1683483775