Eleutherai-oslo's group workspace

2023-04-19 16:23:51     36: ParallelTransformerLayerPipe
2023-04-19 16:23:51     37: ParallelTransformerLayerPipe
2023-04-19 16:23:51     38: _post_transformer_block
2023-04-19 16:23:51     39: NormPipe
2023-04-19 16:23:51     40: ParallelLinearPipe
2023-04-19 16:23:51   loss: partial
2023-04-19 16:23:51 Configuring Optimizer type: Adam with params: {'lr': 0.0001, 'betas': [0.9, 0.95], 'eps': 1e-08}
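For reference, the same hyperparameters expressed with stock torch.optim.Adam; this is a minimal sketch only, since the actual run (per the next lines) falls back to DeepSpeed's fused Adam, and the model here is a placeholder:

    import torch

    # Same hyperparameters as the log line above: lr=0.0001,
    # betas=[0.9, 0.95], eps=1e-08. The Linear module is a stand-in
    # for the real pipeline model; the run itself uses DeepSpeed's
    # FusedAdam because APEX is not installed.
    model = torch.nn.Linear(8, 8)  # placeholder module
    optimizer = torch.optim.Adam(
        model.parameters(),
        lr=1e-4,
        betas=(0.9, 0.95),
        eps=1e-8,
    )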
2023-04-19 16:23:51 WARNING: APEX not installed - defaulting to deepspeed's fused adam
2023-04-19 16:24:27 Loading extension module fused_adam...
2023-04-19 16:24:27 /fsx/gpt-neox/conda/envs/improved-t5/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2023-04-19 16:24:27   warnings.warn(
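The UserWarning above calls for a one-line change at the call site; a sketch, assuming `group` is an initialized process group and `group_rank` is a rank within it:

    import torch.distributed as dist

    # Deprecated spelling flagged in the warning:
    #   dist.distributed_c10d._get_global_rank(group, group_rank)
    # Public replacement named by the warning itself (assumes `group`
    # and `group_rank` are defined by the surrounding code):
    global_rank = dist.distributed_c10d.get_global_rank(group, group_rank)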
2023-04-19 16:24:28 Time to load fused_adam op: 36.20688462257385 seconds
2023-04-19 16:24:28 > learning rate decay style: cosine
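Cosine decay anneals the learning rate from its peak along a half cosine; a minimal sketch of the schedule shape (warmup and the minimum-LR floor used by GPT-NeoX's actual scheduler are simplified away here):

    import math

    def cosine_lr(step, total_steps, max_lr=1e-4, min_lr=0.0):
        # Half-cosine anneal from max_lr (the configured 0.0001 above)
        # down to min_lr over total_steps. min_lr=0.0 and the absence
        # of warmup are simplifying assumptions for this sketch.
        progress = min(step / total_steps, 1.0)
        return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))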
2023-04-19 16:24:28 DeepSpeed is enabled.
2023-04-19 16:24:28 [2023-04-19 16:24:26,597] [INFO] [logging.py:60:log_dist] [Rank 0] DeepSpeed info: version=0.3.15+eb7f5cf, git-hash=eb7f5cf, git-branch=HEAD
2023-04-19 16:24:28 [2023-04-19 16:24:26,597] [WARNING] [config.py:77:_sanity_check] DeepSpeedConfig: cpu_offload is deprecated. Please use offload_optimizer.
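The cpu_offload warning refers to the ZeRO section of the DeepSpeed JSON config; a sketch of the renamed form, shown here as a Python dict (the stage and device values are illustrative, not read from this run's config):

    # Old spelling (deprecated):
    #   "zero_optimization": {..., "cpu_offload": true}
    # New spelling named by the warning:
    ds_config = {
        "zero_optimization": {
            "stage": 2,                              # illustrative value
            "offload_optimizer": {"device": "cpu"},  # replaces cpu_offload
        },
    }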
2023-04-19 16:24:28 NCCL version 2.14.3+cuda11.7