Apiche's group workspace

2025-10-02 17:55:23
        "stage3_gather_16bit_weights_on_model_save": true
2025-10-02 17:55:23
    },
2025-10-02 17:55:23
    "gradient_accumulation_steps": 4,
2025-10-02 17:55:23
    "gradient_clipping": 0.3,
2025-10-02 17:55:23
    "steps_per_print": inf,
2025-10-02 17:55:23
    "train_batch_size": 16,
2025-10-02 17:55:23
    "train_micro_batch_size_per_gpu": 1,
2025-10-02 17:55:23
    "wall_clock_breakdown": false,
2025-10-02 17:55:23
    "fp16": {
2025-10-02 17:55:23
        "enabled": false
2025-10-02 17:55:23
    },
2025-10-02 17:55:23
    "zero_allow_untested_optimizer": true
2025-10-02 17:55:23
}
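
The batch-size fields in the config dump above are tied together by DeepSpeed's rule that `train_batch_size` equals `train_micro_batch_size_per_gpu` times `gradient_accumulation_steps` times the data-parallel world size. A minimal sketch of that arithmetic follows; the helper name `data_parallel_world_size` is hypothetical and not part of DeepSpeed or pipelinerl, and the inferred world size is a deduction from the logged values, not something this log states directly.

```python
def data_parallel_world_size(train_batch_size: int,
                             micro_batch_per_gpu: int,
                             grad_accum_steps: int) -> int:
    """Infer the data-parallel world size implied by a DeepSpeed config.

    Hypothetical helper for illustration. It relies on the relation
    train_batch_size = micro_batch_per_gpu * grad_accum_steps * world_size.
    """
    samples_per_rank = micro_batch_per_gpu * grad_accum_steps
    if samples_per_rank <= 0 or train_batch_size % samples_per_rank != 0:
        raise ValueError("batch-size fields are inconsistent")
    return train_batch_size // samples_per_rank


# Values copied from the logged config:
# train_batch_size=16, train_micro_batch_size_per_gpu=1,
# gradient_accumulation_steps=4.
world_size = data_parallel_world_size(16, 1, 4)
print(world_size)
```

Under these values the config implies 4 data-parallel ranks, each contributing 1 sample per micro-step over 4 accumulation steps per optimizer update.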
2025-10-02 17:55:23
[finetune]: 10/02/2025 17:55:23.834 - INFO - pipelinerl.finetune_loop - Model, optimizer and lr_scheduler prepared
2025-10-02 17:55:23
[finetune]: 10/02/2025 17:55:23.835 - INFO - pipelinerl.finetune_loop - Model class is <class 'deepspeed.runtime.engine.DeepSpeedEngine'>, optimizer class is <class 'accelerate.utils.deepspeed.DeepSpeedOptimizerWrapper'>, lr_scheduler class is <class 'accelerate.scheduler.AcceleratedScheduler'>
2025-10-02 17:55:23
[finetune]: 10/02/2025 17:55:23.836 - INFO - pipelinerl.finetune_loop - After accelerator.prepare() the optimizer's parameters had dtypes set()
2025-10-02 17:55:23
[finetune]: 10/02/2025 17:55:23.838 - INFO - pipelinerl.finetune_loop - Initializing actor process group