Kastan's group workspace

2022-07-27 18:34:14
Error when collecting samples_per_sec, Tflops... not enough values to unpack (expected 2, got 1)
2022-07-27 18:34:14
[07/27/22 13:34:13] INFO     colossalai - colossalai - INFO:
2022-07-27 18:34:14
                             /u/kastanday/.conda/envs/nice_base/envs/col_ai_old_v5/lib/python3.9/site-packages/colossalai/trainer/hooks/_log_hook.py:97 after_train_epoch
2022-07-27 18:34:14
                    INFO     colossalai - colossalai - INFO: [Epoch 0 / Train]: Loss = 92.713 | LR = 5.6041e-05 | Throughput = 10.176
2022-07-27 18:34:14
                    INFO     colossalai - colossalai - INFO:
2022-07-27 18:34:14
                             /u/kastanday/.conda/envs/nice_base/envs/col_ai_old_v5/lib/python3.9/site-packages/colossalai/trainer/_trainer.py:341 fit
2022-07-27 18:34:14
                    INFO     colossalai - colossalai - INFO: Max number of steps 200 has been reached, training is stopped automatically
2022-07-27 18:34:14
[07/27/22 13:34:14] INFO     colossalai - colossalai - INFO:
2022-07-27 18:34:14
                             /u/kastanday/new_colossal_ai/ColossalAI/benchmark/gpt/custom_wandb_log_hook.py:74 after_train
2022-07-27 18:34:14
                    INFO     colossalai - colossalai - INFO: training finished
2022-07-27 18:34:14
                    INFO     colossalai - colossalai - INFO: /u/kastanday/new_colossal_ai/ColossalAI/benchmark/gpt/train.py:125 train_gpt
2022-07-27 18:34:14
                    INFO     colossalai - colossalai - INFO: Rank 0: peak memory usage = 28.01 GB.