Eleutherai-oslo's group workspace

2022-10-01 03:21:48
[2022-10-01 03:21:46,989] [INFO] [stage1.py:697:step] [deepspeed] fp16 dynamic loss scale overflow! Skipping step. Attempted loss scale: 65536.0, reducing to 32768.0
2022-10-01 03:36:06
[2022-10-01 03:36:05,024] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/multi-lingual-6b/gpt-neox/checkpoints/6B_scratch/zero_to_fp32.py
2022-10-01 03:36:06
[2022-10-01 03:36:05,365] [INFO] [engine.py:1818:_save_zero_checkpoint] zero checkpoint saved /fsx/multi-lingual-6b/gpt-neox/checkpoints/6B_scratch/global_step319000/zero_pp_rank_40_mp_rank_02_optim_states.pt
2022-10-01 04:12:01
[2022-10-01 04:11:59,666] [INFO] [stage1.py:697:step] [deepspeed] fp16 dynamic loss scale overflow! Skipping step. Attempted loss scale: 65536.0, reducing to 65536.0
2022-10-01 04:12:05
[2022-10-01 04:12:05,489] [INFO] [stage1.py:697:step] [deepspeed] fp16 dynamic loss scale overflow! Skipping step. Attempted loss scale: 65536.0, reducing to 32768.0
2022-10-01 04:17:46
[2022-10-01 04:17:44,874] [INFO] [stage1.py:697:step] [deepspeed] fp16 dynamic loss scale overflow! Skipping step. Attempted loss scale: 32768.0, reducing to 16384.0
2022-10-01 04:25:57
[2022-10-01 04:25:56,608] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/multi-lingual-6b/gpt-neox/checkpoints/6B_scratch/zero_to_fp32.py
2022-10-01 04:25:59
[2022-10-01 04:25:56,751] [INFO] [engine.py:1818:_save_zero_checkpoint] zero checkpoint saved /fsx/multi-lingual-6b/gpt-neox/checkpoints/6B_scratch/global_step320000/zero_pp_rank_40_mp_rank_02_optim_states.pt
2022-10-01 04:26:23
[2022-10-01 04:26:22,085] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/multi-lingual-6b/gpt-neox/checkpoints/6B_scratch/zero_to_fp32.py
2022-10-01 04:26:23
[2022-10-01 04:26:22,249] [INFO] [engine.py:1818:_save_zero_checkpoint] zero checkpoint saved /fsx/multi-lingual-6b/gpt-neox/checkpoints/6B_scratch/global_step320000/zero_pp_rank_40_mp_rank_02_optim_states.pt
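
The overflow entries above can be tallied with a short script. This is a minimal sketch, not part of DeepSpeed or GPT-NeoX: the regular expression is written only against the line format shown in this log, and the names `OVERFLOW_RE` and `parse_overflows` are hypothetical helpers introduced for illustration. It makes it easier to spot entries such as the one at 04:11:59, where the reported scale stays at 65536.0 rather than halving.

```python
import re

# Assumption: overflow lines always follow the format seen in this log.
OVERFLOW_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[INFO\] \[stage1\.py:\d+:step\] \[deepspeed\] "
    r"fp16 dynamic loss scale overflow! Skipping step\. "
    r"Attempted loss scale: (?P<attempted>[\d.]+), reducing to (?P<reduced>[\d.]+)"
)


def parse_overflows(lines):
    """Return (timestamp, attempted_scale, new_scale) for each overflow line."""
    events = []
    for line in lines:
        m = OVERFLOW_RE.match(line.strip())
        if m:
            events.append((m["ts"], float(m["attempted"]), float(m["reduced"])))
    return events


# Three lines copied from the log above; the checkpoint line is ignored.
sample = [
    "[2022-10-01 04:11:59,666] [INFO] [stage1.py:697:step] [deepspeed] fp16 dynamic loss scale overflow! Skipping step. Attempted loss scale: 65536.0, reducing to 65536.0",
    "[2022-10-01 04:12:05,489] [INFO] [stage1.py:697:step] [deepspeed] fp16 dynamic loss scale overflow! Skipping step. Attempted loss scale: 65536.0, reducing to 32768.0",
    "[2022-10-01 04:25:56,608] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/multi-lingual-6b/gpt-neox/checkpoints/6B_scratch/zero_to_fp32.py",
]

events = parse_overflows(sample)
for ts, attempted, reduced in events:
    # Flag skipped steps where the scale was reported unchanged.
    note = "unchanged" if attempted == reduced else "reduced"
    print(ts, attempted, reduced, note)
```

Each tuple holds the bracketed timestamp as a string and the two scales as floats, so the events can be sorted or counted per checkpoint interval.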