Eleutherai-oslo's group workspace

2022-11-15 16:06:01 }
2022-11-15 16:06:01 Using /fsx/multi-lingual-6b/torch_extensions/ as PyTorch extensions root...
2022-11-15 16:06:01 No modifications detected for re-loaded extension module utils, skipping build step...
2022-11-15 16:06:01 Loading extension module utils...
2022-11-15 16:06:01 Time to load utils op: 0.0009810924530029297 seconds
2022-11-15 16:06:02 Traceback (most recent call last):
2022-11-15 16:06:02   File "/fsx/multi-lingual-6b/gpt-neox/train.py", line 27, in <module>
2022-11-15 16:06:02     pretrain(neox_args=neox_args)
2022-11-15 16:06:02   File "/fsx/multi-lingual-6b/gpt-neox/megatron/training.py", line 103, in pretrain
2022-11-15 16:06:02     model, optimizer, lr_scheduler = setup_model_and_optimizer(
2022-11-15 16:06:02   File "/fsx/multi-lingual-6b/gpt-neox/megatron/training.py", line 451, in setup_model_and_optimizer
2022-11-15 16:06:02     model, optimizer, _, lr_scheduler = deepspeed.initialize(
2022-11-15 16:06:02   File "/fsx/gpt-neox/conda/envs/improved-t5/lib/python3.9/site-packages/deepspeed/__init__.py", line 128, in initialize
2022-11-15 16:06:02     engine = PipelineEngine(args=args,
2022-11-15 16:06:02   File "/fsx/gpt-neox/conda/envs/improved-t5/lib/python3.9/site-packages/deepspeed/runtime/pipe/engine.py", line 63, in __init__
2022-11-15 16:06:02     assert self.zero_optimization_stage() < 2, "ZeRO-2 and ZeRO-3 are incompatible with pipeline parallelism"
2022-11-15 16:06:02 AssertionError: ZeRO-2 and ZeRO-3 are incompatible with pipeline parallelism
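The traceback shows DeepSpeed's `PipelineEngine` rejecting the run because the config requests ZeRO stage 2 or 3 alongside pipeline parallelism, which it does not support. A minimal sketch of the constraint, assuming a hypothetical DeepSpeed-style config dict (the key values here are illustrative, not taken from this run's actual config):

```python
# Hypothetical minimal DeepSpeed-style config sketch. When pipeline
# parallelism is enabled (e.g. pipe-parallel-size > 1 in gpt-neox),
# zero_optimization.stage must be 0 or 1 to avoid the assertion above.
ds_config = {
    "train_batch_size": 32,  # illustrative value, not from this run
    "zero_optimization": {
        "stage": 1,  # stages 2 and 3 trip the PipelineEngine assertion
    },
}

# Mirror of the check raised in deepspeed/runtime/pipe/engine.py:
stage = ds_config["zero_optimization"]["stage"]
assert stage < 2, "ZeRO-2 and ZeRO-3 are incompatible with pipeline parallelism"
```

Lowering the ZeRO stage to 0 or 1, or disabling pipeline parallelism, would satisfy this check; which trade-off is right depends on the run's memory budget.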