Kastan's group workspace

2022-07-27 20:24:04
[07/27/22 15:24:02] INFO     colossalai - colossalai - INFO: /u/kastanday/LLM-Distributed-Quantization/logging/custom_wandb_log_hook.py:77 before_train
2022-07-27 20:24:04
                    INFO     colossalai - colossalai - INFO: training starts
2022-07-27 20:24:07
ValueError: throughput not available not enough values to unpack (expected 2, got 1)
2022-07-27 20:24:08
Traceback (most recent call last):
2022-07-27 20:24:08
  File "/u/kastanday/LLM-Distributed-Quantization/benchmarks/gpt/train.py", line 133, in <module>
2022-07-27 20:24:08
    train_gpt()
2022-07-27 20:24:08
  File "/u/kastanday/LLM-Distributed-Quantization/benchmarks/gpt/train.py", line 120, in train_gpt
2022-07-27 20:24:08
    trainer.fit(train_dataloader=train_dataloader,
2022-07-27 20:24:08
  File "/u/kastanday/.conda/envs/nice_base/envs/col_ai_old_v5/lib/python3.9/site-packages/colossalai/trainer/_trainer.py", line 321, in fit
2022-07-27 20:24:08
    self._train_epoch(
2022-07-27 20:24:08
  File "/u/kastanday/.conda/envs/nice_base/envs/col_ai_old_v5/lib/python3.9/site-packages/colossalai/trainer/_trainer.py", line 189, in _train_epoch
2022-07-27 20:24:08
    self._call_hooks("after_train_iter", output=(logits, label, loss))
2022-07-27 20:24:08
  File "/u/kastanday/.conda/envs/nice_base/envs/col_ai_old_v5/lib/python3.9/site-packages/colossalai/trainer/_trainer.py", line 149, in _call_hooks
2022-07-27 20:24:08
    getattr(hook, func)(self, *output)
2022-07-27 20:24:08
  File "/u/kastanday/LLM-Distributed-Quantization/logging/custom_wandb_log_hook.py", line 67, in after_train_iter
2022-07-27 20:24:08
    metrics['samples_per_sec'] = float(samples_per_sec)
2022-07-27 20:24:08
TypeError: float() argument must be a string or a number, not 'list'
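
The traceback above ends with `float()` receiving a list in `after_train_iter`, shortly after a "not enough values to unpack" warning about throughput. The log does not show how `samples_per_sec` is produced, so the following is only a minimal sketch of one defensive approach, assuming the value may arrive as a number, a numeric string, or a one-element list. The helper name `to_samples_per_sec` is hypothetical and is not part of colossalai or the hook shown in the log.

```python
from numbers import Real


def to_samples_per_sec(value):
    """Coerce a throughput reading to float, or return None if unusable.

    Hypothetical helper: the accepted input shapes are assumptions,
    since the log does not show where the value comes from.
    """
    # Unwrap a one-element list or tuple, e.g. [12.5] -> 12.5.
    if isinstance(value, (list, tuple)):
        if len(value) != 1:
            return None
        value = value[0]
    # bool is a subclass of int; treat it as not a throughput value.
    if isinstance(value, bool):
        return None
    if isinstance(value, Real):
        return float(value)
    if isinstance(value, str):
        # Take the first whitespace-separated token, e.g. "12.5 samples".
        try:
            return float(value.split()[0])
        except (ValueError, IndexError):
            return None
    return None


# Only record the metric when a usable number was recovered.
metrics = {}
parsed = to_samples_per_sec(["12.5"])
if parsed is not None:
    metrics['samples_per_sec'] = parsed
```

Skipping the metric when nothing usable is found keeps a missing throughput reading from aborting the training loop; whether that is the desired behavior depends on how the hook is meant to report gaps.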