Apiche's group workspace
test_cumulative_time
Tags
test_cumulative_time/finetune
Notes
Author
State
Crashed
Start time
July 29th, 2025 11:25:26 PM
Runtime
6d 16h 1m 47s
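The runtime above can be converted to seconds and added to the start time to estimate when the run stopped. This is a minimal sketch: `parse_runtime` is a hypothetical helper, not part of W&B, and it assumes the runtime is wall-clock time counted from the start time. The page does not state a timezone, so the timestamps are treated as naive local times.

```python
import re
from datetime import datetime, timedelta

def parse_runtime(text: str) -> timedelta:
    """Parse a duration such as '6d 16h 1m 47s' into a timedelta (hypothetical helper)."""
    units = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}
    parts = re.findall(r"(\d+)([dhms])", text)
    return timedelta(**{units[unit]: int(number) for number, unit in parts})

# Values copied from the run overview; the timezone is unknown.
start = datetime.strptime("July 29 2025 11:25:26 PM", "%B %d %Y %I:%M:%S %p")
runtime = parse_runtime("6d 16h 1m 47s")

# Estimated stop time, assuming the runtime is wall-clock time since the start.
end = start + runtime
print(int(runtime.total_seconds()), end.isoformat())
```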
Tracked hours
-
Run path
apiche/pipeline-rl/test_cumulative_time_finetune
OS
Linux-5.15.0-1067-nvidia-x86_64-with-glibc2.39
Python version
CPython 3.11.11
Git repository
git clone https://github.com/ServiceNow/PipelineRL-SWE
Git state
git checkout -b "test_cumulative_time/finetune" 6ff6542654d2417495552005f0b5d26a23675792
Command
pipelinerl/entrypoints/run_finetune.py --config-dir results/test_cumulative_time/conf --config-name exp_config output_dir=results/test_cumulative_time hydra.run.dir=results/test_cumulative_time/finetune +me.weight_update_group_init_method=tcp://localhost:9000 +me.weight_update_group_world_size=3 +me.llm_urls=http://localhost:8080+http://localhost:8081
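The command mixes Hydra flags (`--config-dir`, `--config-name`) with `key=value` overrides. In Hydra's override syntax a leading `+` adds a key that is not already in the config. The sketch below only separates the two kinds of arguments so they are easier to read; `split_command` is a hypothetical helper, not part of Hydra or PipelineRL. The `+` between the two URLs in `me.llm_urls` looks like a list separator, but that is an assumption, so the value is left unsplit.

```python
import shlex

# Command string copied from the run overview.
COMMAND = (
    "pipelinerl/entrypoints/run_finetune.py "
    "--config-dir results/test_cumulative_time/conf --config-name exp_config "
    "output_dir=results/test_cumulative_time "
    "hydra.run.dir=results/test_cumulative_time/finetune "
    "+me.weight_update_group_init_method=tcp://localhost:9000 "
    "+me.weight_update_group_world_size=3 "
    "+me.llm_urls=http://localhost:8080+http://localhost:8081"
)

def split_command(command: str):
    """Split a command into (script, flags, overrides); hypothetical helper."""
    tokens = shlex.split(command)
    script, args = tokens[0], tokens[1:]
    flags, overrides = {}, {}
    i = 0
    while i < len(args):
        token = args[i]
        if token.startswith("--"):
            # Flags here always take one value, given as the next token.
            flags[token[2:]] = args[i + 1]
            i += 2
        else:
            # Split on the first '=' only; values may contain ':' or '+'.
            key, _, value = token.partition("=")
            overrides[key] = value
            i += 1
    return script, flags, overrides

script, flags, overrides = split_command(COMMAND)
for key, value in overrides.items():
    print(f"{key} = {value}")
```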
System Hardware
| CPU count | 112 |
| Logical CPU count | 224 |
| GPU count | 4 |
| GPU type | NVIDIA H100 80GB HBM3 |
W&B CLI Version
0.19.11
Group
test_cumulative_time
Config
Config parameters are your model's inputs.
- {} 199 keys
- "False"
- "no"
- null
- 1
- 64
- 10
- 64
- 64
- "pipelinerl.swe.rollouts.generate_unified_swe_rollout"
- 1
- 50,000,000
- "Please reason step by step, and put your final answer within \boxed{}."
- "{task}"
- 50
- 15,000
- 1
- "nccl"
- "pipelinerl.swe.load_datasets.load_local_swe_dataset"
- "/mnt/llmd/data/swegym/ds"
- "/mnt/llmd/data/swebench_lite/ds"
- "False"
- ""
- true
- null
- false
- "deepspeed_stage3_bf16"
- "DeepSpeedPlugin(hf_ds_config=<accelerate.utils.deepspeed.HfDeepSpeedConfig object at 0x7ffc76f6f090>, gradient_accumulation_steps=1, gradient_clipping='auto', zero_stage=3, is_train_batch_min=True, offload_optimizer_device='none', offload_param_device='none', offload_optimizer_nvme_path='none', offload_param_nvme_path='none', zero3_init_flag=True, zero3_save_16bit_model=True, transformer_moe_cls_names=None, enable_msamp=False, msamp_opt_level='O1')"
- "cuda:0"
- "DistributedType.DEEPSPEED"
- "TorchDynamoPlugin(backend=<DynamoBackend.NO: 'NO'>, mode='default', fullgraph=False, dynamic=False, options=None, disable=False)"
- null
- "pipelinerl.domains.math.MathEnvironment"
- 1,000
- [] 0 items
- 1
- "flash_attention_2"
- false
- "Qwen/Qwen2.5-1.5B-Instruct"
- true
- null
- "tapeagents.finetune.eval.dummy_eval_callback"
- ""
- true
- 512
- true
- 0.3
- 7,777
- 4
- 0
- 1
Summary
Summary metrics are your model's outputs.
No summary metrics saved for this run.