Eleutherai-oslo's group workspace

2023-04-20 01:54:41      37: ParallelTransformerLayerPipe
2023-04-20 01:54:41      38: _post_transformer_block
2023-04-20 01:54:41      39: NormPipe
2023-04-20 01:54:41      40: ParallelLinearPipe
2023-04-20 01:54:41    loss: partial
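These lines are the tail of the pipeline layer listing printed while the model is assembled as a DeepSpeed pipeline (the ParallelTransformerLayerPipe/NormPipe/ParallelLinearPipe names match GPT-NeoX's pipeline layer classes), ending with the final norm and output projection; "loss: partial" indicates the loss function is attached to the pipe as a partial. A minimal sketch of how such a stack is declared with DeepSpeed's pipeline API, where the Block class, all sizes, and the stage count are hypothetical stand-ins and an initialized torch.distributed process group is assumed:

    # Minimal sketch, not GPT-NeoX's actual model-building code. PipelineModule
    # and LayerSpec are DeepSpeed's real pipeline API; Block stands in for
    # ParallelTransformerLayerPipe, and all sizes are made up. Constructing
    # PipelineModule requires an initialized torch.distributed process group.
    import torch.nn as nn
    from deepspeed.pipe import PipelineModule, LayerSpec

    class Block(nn.Module):
        def __init__(self, d):
            super().__init__()
            self.ff = nn.Linear(d, d)

        def forward(self, x):
            return self.ff(x)

    n_layers, d = 36, 1024                                  # illustrative sizes
    specs = [LayerSpec(Block, d) for _ in range(n_layers)]  # transformer layers
    specs.append(LayerSpec(nn.LayerNorm, d))                # final norm (cf. NormPipe)
    specs.append(LayerSpec(nn.Linear, d, 50257))            # output head (cf. ParallelLinearPipe)
    model = PipelineModule(layers=specs, num_stages=4,
                           loss_fn=nn.CrossEntropyLoss())   # loss attached to the pipe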
2023-04-20 01:54:41  Configuring Optimizer type: Adam with params: {'lr': 0.0001, 'betas': [0.9, 0.95], 'eps': 1e-08}
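The logged settings map one-to-one onto a DeepSpeed-style "optimizer" config section. A sketch of that documented schema, not the run's actual config file:

    # Sketch: the logged Adam hyperparameters expressed in DeepSpeed's
    # documented "optimizer" config schema; not the run's actual config file.
    ds_config = {
        "optimizer": {
            "type": "Adam",
            "params": {"lr": 1.0e-4, "betas": [0.9, 0.95], "eps": 1.0e-8},
        }
    }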
2023-04-20 01:54:41  WARNING: APEX not installed - defaulting to deepspeed's fused adam
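The fallback this warning describes amounts to an import preference. Both FusedAdam classes below are real, though the try/except framing is an illustrative sketch rather than GPT-NeoX's exact logic:

    # Sketch of the fallback: prefer APEX's fused Adam kernel when APEX is
    # installed, otherwise use DeepSpeed's FusedAdam, whose CUDA extension is
    # JIT-compiled on first use (the build shown in the following log lines).
    try:
        from apex.optimizers import FusedAdam
    except ImportError:
        from deepspeed.ops.adam import FusedAdam

    # optimizer = FusedAdam(params, lr=1e-4, betas=(0.9, 0.95), eps=1e-8)

That JIT compilation is why the next lines show a roughly 31-second wait while the fused_adam extension is built and loaded from the PyTorch extensions root.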
2023-04-20 01:54:43  Using /fsx/polyglot.train/torch_extensions/ as PyTorch extensions root...
2023-04-20 01:55:13  Loading extension module fused_adam...
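A shared path like /fsx/polyglot.train/torch_extensions/ is typically selected by redirecting PyTorch's JIT-build cache; one way to do that is shown below, though whether this run used the environment variable is an assumption, not something the log confirms:

    # torch.utils.cpp_extension caches JIT builds under this root; setting the
    # variable before the first extension load redirects the cache, e.g. onto
    # a shared filesystem. That this run set it this way is an assumption.
    import os
    os.environ["TORCH_EXTENSIONS_DIR"] = "/fsx/polyglot.train/torch_extensions/"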
2023-04-20 01:55:13  /fsx/kevin.ai/Anaconda/envs/polyglot/lib/python3.9/site-packages/torch/distributed/distributed_c10d.py:429: UserWarning: torch.distributed.distributed_c10d._get_global_rank is deprecated please use torch.distributed.distributed_c10d.get_global_rank instead
2023-04-20 01:55:13    warnings.warn(
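The fix this UserWarning asks for is a one-line substitution: the public helper it names takes the same (group, group_rank) arguments as the deprecated private one. A sketch, assuming an initialized process group pg:

    # Replace the deprecated private call with the public helper named in the
    # warning. Assumes dist.init_process_group(...) has run and `pg` is a
    # process-group handle.
    import torch.distributed as dist

    global_rank = dist.distributed_c10d.get_global_rank(pg, 0)  # group rank 0 -> global rank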
2023-04-20 01:55:13  Time to load fused_adam op: 31.098730087280273 seconds
2023-04-20 01:55:13  > learning rate decay style: cosine
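The cosine decay style selected here has the usual shape: warm up to the peak learning rate, then follow a half-cosine down to the minimum. A minimal sketch of that schedule; the function name, the linear warmup, and the defaults are illustrative assumptions, not GPT-NeoX's own scheduler:

    # Minimal warmup + cosine-decay LR schedule sketch; names and the linear
    # warmup choice are illustrative assumptions, not the run's exact scheduler.
    import math

    def cosine_lr(step, max_lr, warmup_steps, total_steps, min_lr=0.0):
        if step < warmup_steps:                          # linear warmup
            return max_lr * step / max(1, warmup_steps)
        t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * min(t, 1.0)))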
2023-04-20 01:55:13  DeepSpeed is enabled.
2023-04-20 01:55:13  [2023-04-20 01:55:12,042] [INFO] [logging.py:60:log_dist] [Rank 0] DeepSpeed info: version=0.3.15+unknown, git-hash=unknown, git-branch=unknown
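"DeepSpeed is enabled" marks the hand-off of the model to the DeepSpeed engine; the standard entry point is deepspeed.initialize. A sketch, not GPT-NeoX's wrapper code, reusing the hypothetical model and ds_config objects from the sketches above:

    # Standard DeepSpeed entry point producing the engine whose version banner
    # is logged above. Recent releases take config=; releases as old as the
    # 0.3.15 logged here used the config_params= keyword instead.
    import deepspeed

    engine, optimizer, _, lr_scheduler = deepspeed.initialize(
        model=model, model_parameters=model.parameters(), config=ds_config)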
2023-04-20 01:55:13  [2023-04-20 01:55:12,042] [WARNING] [config.py:77:_sanity_check] DeepSpeedConfig: cpu_offload is deprecated. Please use offload_optimizer.
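The migration this sanity check asks for replaces the boolean flag with DeepSpeed's structured offload section. A sketch of the documented schema; the ZeRO stage and device values are illustrative, since the run's actual config is not shown in the log:

    # Only the cpu_offload -> offload_optimizer rename comes from the log;
    # the stage and device values here are illustrative.
    deprecated = {"zero_optimization": {"stage": 2, "cpu_offload": True}}
    preferred  = {"zero_optimization": {"stage": 2,
                                        "offload_optimizer": {"device": "cpu"}}}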
2023-04-20 01:55:13  NCCL version 2.14.3+cuda11.7