Apiche's group workspace

2025-09-27 18:49:32
        "stage3_gather_16bit_weights_on_model_save": true
2025-09-27 18:49:32
    },
2025-09-27 18:49:32
    "gradient_accumulation_steps": 256,
2025-09-27 18:49:32
    "gradient_clipping": 0.3,
2025-09-27 18:49:32
    "steps_per_print": inf,
2025-09-27 18:49:32
    "train_batch_size": 1.024000e+03,
2025-09-27 18:49:32
    "train_micro_batch_size_per_gpu": 1,
2025-09-27 18:49:32
    "wall_clock_breakdown": false,
2025-09-27 18:49:32
    "fp16": {
2025-09-27 18:49:32
        "enabled": false
2025-09-27 18:49:32
    },
2025-09-27 18:49:32
    "zero_allow_untested_optimizer": true
2025-09-27 18:49:32
}
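A note on the batch-size fields in the config dump above: DeepSpeed requires train_batch_size to equal train_micro_batch_size_per_gpu × gradient_accumulation_steps × number of data-parallel ranks. The rank count is not printed in this part of the log, so the sketch below only infers it from the three logged values. The helper name `implied_world_size` is hypothetical and is not part of DeepSpeed or pipelinerl.

```python
def implied_world_size(train_batch_size, micro_batch_per_gpu, grad_accum_steps):
    """Infer the data-parallel rank count from DeepSpeed's batch-size relation.

    Relation: train_batch_size == micro_batch_per_gpu * grad_accum_steps * world_size.
    Hypothetical helper for illustration only; not a DeepSpeed API.
    """
    total = int(train_batch_size)
    if total != train_batch_size:
        raise ValueError("train_batch_size must be a whole number")
    # Samples consumed by a single rank per optimizer step.
    per_rank = micro_batch_per_gpu * grad_accum_steps
    if total % per_rank != 0:
        raise ValueError("batch sizes are inconsistent with any integer world size")
    return total // per_rank


# Values copied from the config dump; the log prints 1024 as 1.024000e+03.
world_size = implied_world_size(1.024000e+03, 1, 256)
print(world_size)  # 4
```

If the logged values are consistent, this suggests the run used 4 data-parallel ranks, each accumulating 256 micro-batches of size 1 per optimizer step. That figure is an inference from the config, not something the log states directly.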
2025-09-27 18:49:32
[finetune]: 09/27/2025 18:49:32.377 - INFO - pipelinerl.finetune_loop - Model, optimizer and lr_scheduler prepared
2025-09-27 18:49:32
[finetune]: 09/27/2025 18:49:32.378 - INFO - pipelinerl.finetune_loop - Model class is <class 'deepspeed.runtime.engine.DeepSpeedEngine'>, optimizer class is <class 'accelerate.utils.deepspeed.DeepSpeedOptimizerWrapper'>, lr_scheduler class is <class 'accelerate.scheduler.AcceleratedScheduler'>
2025-09-27 18:49:32
[finetune]: 09/27/2025 18:49:32.378 - INFO - pipelinerl.finetune_loop - After accelerator.prepare() the optimizer's parameters had dtypes set()
2025-09-27 18:49:32
[finetune]: 09/27/2025 18:49:32.378 - INFO - pipelinerl.finetune_loop - Initializing actor process group