Chilli's group workspace

2023-08-03 05:15:53
make: Leaving directory '/fsx/lintangsutawika/01-project-pythia/gpt-neox/megatron/data'
2023-08-03 05:15:59
WARNING: APEX not installed - defaulting to deepspeed's fused adam
2023-08-03 05:15:59
Time to load fused_adam op: 0.40450501441955566 seconds
2023-08-03 05:15:59
[2023-08-03 05:15:57,588] [WARNING] [config.py:77:_sanity_check] DeepSpeedConfig: cpu_offload is deprecated. Please use offload_optimizer.
2023-08-03 05:15:59
Using ./extensions/ as PyTorch extensions root...
2023-08-03 05:15:59
Loading extension module fused_adam...
2023-08-03 05:15:59
/fsx/lintangsutawika/miniconda3/envs/pythia/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2023-08-03 05:15:59
  warnings.warn(
2023-08-03 05:16:01
Using ./extensions/ as PyTorch extensions root...
2023-08-03 05:16:01
Loading extension module utils...
2023-08-03 05:16:01
Time to load utils op: 0.505319356918335 seconds
2023-08-03 05:16:01
[2023-08-03 05:16:00,911] [INFO] [stage1.py:160:__init__] ZeRO Elastic Checkpoint = True
2023-08-03 05:16:03
Time to load utils op: 0.00141143798828125 seconds
2023-08-03 05:16:03
Using ./extensions/ as PyTorch extensions root...
2023-08-03 05:16:03
No modifications detected for re-loaded extension module utils, skipping build step...
2023-08-03 05:16:03
Loading extension module utils...
2023-08-03 05:16:05
[2023-08-03 05:16:04,096] [INFO] [engine.py:1551:_load_checkpoint] rank: 8 loading checkpoint: /fsx/lintangsutawika/checkpoints/temp_neox_models/global_step38005/mp_rank_00_model_states.pt