Kastan's group workspace

2022-06-21 15:19:46
Error when collecting samples_per_sec, Tflops... not enough values to unpack (expected 2, got 1)
2022-06-21 15:19:48
Error when collecting samples_per_sec, Tflops... not enough values to unpack (expected 2, got 1)
2022-06-21 15:20:00
Error when collecting samples_per_sec, Tflops... not enough values to unpack (expected 2, got 1)
2022-06-21 15:20:16
[06/21/22 11:20:15] INFO     colossalai - colossalai - INFO:
2022-06-21 15:20:16
                             /u/kastan/.conda/envs/pytorch_1.11_colossal/lib/python3.8/site-packages/colossalai/trainer/hooks/_log_hook.py:97 after_train_epoch
2022-06-21 15:20:16
                    INFO     colossalai - colossalai - INFO: [Epoch 0 / Train]: Loss = 28.543 | LR = 5e-05 | Throughput = 15.748
2022-06-21 15:20:16
                    INFO     colossalai - colossalai - INFO:
2022-06-21 15:20:16
                             /u/kastan/.conda/envs/pytorch_1.11_colossal/lib/python3.8/site-packages/colossalai/utils/memory.py:91 report_memory_usage
2022-06-21 15:20:16
                    INFO     colossalai - colossalai - INFO: [Epoch 0 / Train]: GPU: allocated 1091.68 MB, max allocated 1320.23 MB, cached: 1668.0 MB, max cached: 1668.0 MB
2022-06-21 15:20:16
                    INFO     colossalai - colossalai - INFO:
2022-06-21 15:20:16
                             /u/kastan/.conda/envs/pytorch_1.11_colossal/lib/python3.8/site-packages/colossalai/trainer/_trainer.py:341 fit
2022-06-21 15:20:16
                    INFO     colossalai - colossalai - INFO: Max number of steps 2000 has been reached, training is stopped automatically
2022-06-21 15:20:16
                    INFO     colossalai - colossalai - INFO: train_gpt.py:221 after_train
2022-06-21 15:20:16
                    INFO     colossalai - colossalai - INFO: training finished
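
Note on the "not enough values to unpack (expected 2, got 1)" messages: this is the text Python's ValueError carries when a two-name tuple unpacking receives a one-element sequence. The log does not show the collector's code, so the actual cause is unknown. The sketch below only illustrates one defensive way to pull metrics out of the epoch summary line shown above; `SUMMARY_RE` and `parse_summary` are hypothetical names, not part of ColossalAI or of the metrics collector.

```python
import re

# Hypothetical pattern matching the epoch summary line as it appears in this log:
#   [Epoch 0 / Train]: Loss = 28.543 | LR = 5e-05 | Throughput = 15.748
SUMMARY_RE = re.compile(
    r"\[Epoch (?P<epoch>\d+) / Train\]: "
    r"Loss = (?P<loss>[\d.]+) \| "
    r"LR = (?P<lr>[\d.eE+-]+) \| "
    r"Throughput = (?P<throughput>[\d.]+)"
)


def parse_summary(line):
    """Return the metrics in an epoch summary line, or None if the line does not match.

    Returning None instead of unpacking the result of a split avoids a
    ValueError on lines that lack the expected fields.
    """
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "epoch": int(match["epoch"]),
        "loss": float(match["loss"]),
        "lr": float(match["lr"]),
        "throughput": float(match["throughput"]),
    }
```

A caller would check for None before reading fields, so lines such as the memory report or the "training finished" message are skipped rather than raising.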