Chilli's group workspace

2023-08-03 05:27:13
Time to load utils op: 0.5037755966186523 seconds
2023-08-03 05:27:13
[2023-08-03 05:27:11,959] [INFO] [stage1.py:160:__init__] ZeRO Elastic Checkpoint = True
2023-08-03 05:27:13
Time to load utils op: 0.0037970542907714844 seconds
2023-08-03 05:27:15
[2023-08-03 05:27:15,066] [INFO] [engine.py:1551:_load_checkpoint] rank: 16 loading checkpoint: /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38002/mp_rank_00_model_states.pt
2023-08-03 05:29:06
successfully loaded 64 ZeRO state_dicts for rank 16
2023-08-03 05:29:38
loading 64 zero partition checkpoints for rank 16
2023-08-03 05:30:22
WARNING: shuffle index length (162165685) is not equal to sample index length (162165686)
2023-08-03 05:30:24
WARNING: shuffle index length (162165685) is not equal to sample index length (162165686)
2023-08-03 05:30:28
WARNING: shuffle index length (162165685) is not equal to sample index length (162165686)
2023-08-03 05:30:28
> RANK 16 elapsed time for building blendable dataset indices: 0.94 (sec)
2023-08-03 05:30:30
> RANK 16 elapsed time for building blendable dataset indices: 1.24 (sec)
2023-08-03 05:30:30
> RANK 16 elapsed time for building blendable dataset indices: 1.27 (sec)
2023-08-03 05:31:01
[2023-08-03 05:31:00,992] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/lintangsutawika/checkpoints/temp_neox_models/zero_to_fp32.py
2023-08-03 05:31:01
[2023-08-03 05:31:01,057] [INFO] [engine.py:1818:_save_zero_checkpoint] zero checkpoint saved /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38003/zero_pp_rank_16_mp_rank_00_optim_states.pt
2023-08-03 05:31:15
[2023-08-03 05:31:15,621] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/lintangsutawika/checkpoints/temp_neox_models/zero_to_fp32.py
2023-08-03 05:31:15
[2023-08-03 05:31:15,626] [INFO] [engine.py:1818:_save_zero_checkpoint] zero checkpoint saved /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38004/zero_pp_rank_16_mp_rank_00_optim_states.pt