2023-08-03 08:21:39
Time to load utils op: 0.515296459197998 seconds
2023-08-03 08:21:39
[2023-08-03 08:21:37,752] [INFO] [stage1.py:160:__init__] ZeRO Elastic Checkpoint = True
2023-08-03 08:21:39
Time to load utils op: 0.0011751651763916016 seconds
2023-08-03 08:21:41
[2023-08-03 08:21:40,804] [INFO] [engine.py:1551:_load_checkpoint] rank: 16 loading checkpoint: /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38000/mp_rank_00_model_states.pt
2023-08-03 08:23:28
successfully loaded 64 ZeRO state_dicts for rank 16
2023-08-03 08:23:56
loading 64 zero partition checkpoints for rank 16
2023-08-03 08:24:10
WARNING: shuffle index length (162165685) is not equal to sample index length (162165686)
2023-08-03 08:24:14
WARNING: shuffle index length (162165685) is not equal to sample index length (162165686)
2023-08-03 08:24:16
WARNING: shuffle index length (162165685) is not equal to sample index length (162165686)
2023-08-03 08:24:16
> RANK 16 elapsed time for building blendable dataset indices: 0.58 (sec)
2023-08-03 08:24:18
> RANK 16 elapsed time for building blendable dataset indices: 0.98 (sec)
2023-08-03 08:24:18
> RANK 16 elapsed time for building blendable dataset indices: 1.06 (sec)
2023-08-03 08:24:49
[2023-08-03 08:24:47,931] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/lintangsutawika/checkpoints/temp_neox_models/zero_to_fp32.py
2023-08-03 08:24:49
[2023-08-03 08:24:48,004] [INFO] [engine.py:1818:_save_zero_checkpoint] zero checkpoint saved /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38001/zero_pp_rank_16_mp_rank_00_optim_states.pt
2023-08-03 08:25:03
[2023-08-03 08:25:02,217] [INFO] [engine.py:1805:_copy_recovery_script] creating recovery script /fsx/lintangsutawika/checkpoints/temp_neox_models/zero_to_fp32.py
2023-08-03 08:25:03
[2023-08-03 08:25:02,234] [INFO] [engine.py:1818:_save_zero_checkpoint] zero checkpoint saved /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38002/zero_pp_rank_16_mp_rank_00_optim_states.pt