Chilli's group workspace

2023-08-03 07:46:02
Time to load utils op: 0.5179932117462158 seconds
2023-08-03 07:46:02
[2023-08-03 07:46:00,899] [INFO] [stage1.py:160:__init__] ZeRO Elastic Checkpoint = True
2023-08-03 07:46:02
Time to load utils op: 0.0014071464538574219 seconds
2023-08-03 07:46:04
[2023-08-03 07:46:03,998] [INFO] [engine.py:1551:_load_checkpoint] rank: 40 loading checkpoint: /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38000/mp_rank_00_model_states.pt
2023-08-03 07:48:03
successfully loaded 64 ZeRO state_dicts for rank 40
2023-08-03 07:48:32
loading 64 zero partition checkpoints for rank 40
2023-08-03 07:48:44
WARNING: shuffle index length (162165685) is not equal to sample index length (162165686)
2023-08-03 07:48:46
WARNING: shuffle index length (162165685) is not equal to sample index length (162165686)
2023-08-03 07:48:48
WARNING: shuffle index length (162165685) is not equal to sample index length (162165686)
2023-08-03 07:48:48
> RANK 40 elapsed time for building blendable dataset indices: 0.66 (sec)
2023-08-03 07:48:51
> RANK 40 elapsed time for building blendable dataset indices: 1.01 (sec)
2023-08-03 07:48:51
> RANK 40 elapsed time for building blendable dataset indices: 1.09 (sec)
2023-08-03 07:49:21
[2023-08-03 07:49:20,703] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/lintangsutawika/checkpoints/temp_neox_models/zero_to_fp32.py
2023-08-03 07:49:21
[2023-08-03 07:49:20,714] [INFO] [engine.py:1818:_save_zero_checkpoint] zero checkpoint saved /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38001/zero_pp_rank_40_mp_rank_00_optim_states.pt
2023-08-03 07:49:37
[2023-08-03 07:49:35,778] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/lintangsutawika/checkpoints/temp_neox_models/zero_to_fp32.py
2023-08-03 07:49:37
[2023-08-03 07:49:35,943] [INFO] [engine.py:1818:_save_zero_checkpoint] zero checkpoint saved /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38002/zero_pp_rank_40_mp_rank_00_optim_states.pt