Apiche's group workspace

Group: debug_gspo4

Author

apiche

State

Crashed

Start time

September 27th, 2025 7:40:37 PM

Runtime

14m 5s

Run path

apiche/pipeline-rl/debug_gspo4_actor

OS

Linux-5.15.0-1067-nvidia-x86_64-with-glibc2.39

Python version

CPython 3.11.11

Git repository

git clone git@github.com:ServiceNow/pipelinerl.git

Git state

git checkout -b "debug_gspo4/actor" 6adb0e8c446488abbd23a8176674cb56f03e8616

Command

-m pipelinerl.entrypoints.run_actor --config-dir results/debug_gspo4/conf --config-name exp_config output_dir=results/debug_gspo4 hydra.run.dir=results/debug_gspo4/actor +me.llm_urls=http://localhost:8080+http://localhost:8081+http://localhost:8082+http://localhost:8083
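
The `+me.llm_urls` override above packs four endpoints into a single string joined with `+`. A minimal sketch of how such a value could be split back into a list follows. The helper name `split_llm_urls` is hypothetical, and it is an assumption (not confirmed by this page) that pipelinerl separates the URLs on `+` in this way.

```python
def split_llm_urls(value: str) -> list[str]:
    """Split a '+'-joined string of URLs into a list, dropping empty parts.

    Hypothetical helper for illustration; not part of pipelinerl.
    """
    return [url for url in value.split("+") if url]


# Value copied from the run's command line.
urls = split_llm_urls(
    "http://localhost:8080+http://localhost:8081"
    "+http://localhost:8082+http://localhost:8083"
)
for url in urls:
    print(url)
```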

System Hardware

CPU count	112
Logical CPU count	224
GPU count	8
GPU type	NVIDIA H100 80GB HBM3

W&B CLI Version

0.19.11

Group

debug_gspo4

Config parameters are your model's inputs.

Config parameters (169 keys):
- accelerate_config:
  null
- actor.discount_factor:
  1
- actor.llm_max_rollouts:
  64
- actor.log_each_n_secs:
  0
- actor.problem_queue_size:
  64
- actor.result_queue_size:
  64
- actor.rollout_policy:
  "pipelinerl.domains.math.generate_math_rollout"
- actor.rollout_workers:
  1
- actor.shared_memory_entry_size:
  10000000
- actor.system_prompt:
  "Please reason step by step, and put your final answer within \boxed{}."
- actor.task_prompt:
  ""
- actor.throughput_window_size:
  50
- attempts:
  1
- dataset_loader:
  "pipelinerl.domains.math.load_datasets"
- debug.mode:
  ""
- debug.place_inference_workers:
  true
- debug.streams_from:
  null
- debug.use_existing_llms:
  false
- deepspeed_config:
  "deepspeed_stage3_bf16"
- environment._target_:
  "pipelinerl.domains.math.MathEnvironment"
- eval_every_n_versions:
  0
- finetune.also_save_steps:
  []
- finetune.attempts:
  1
- finetune.attn_implementation:
  "flash_attention_2"
- finetune.auto_device_map:
  false
- finetune.config_name:
  "Qwen/Qwen2.5-0.5B"
- finetune.cuda_empty_cache:
  true
- finetune.data:
  null
- finetune.eval_callback._target_:
  "tapeagents.finetune.eval.dummy_eval_callback"
- finetune.eval_callback.config_name:
  ""
- finetune.force_restart:
  true
- finetune.gradient_accumulation_passes:
  1024
- finetune.gradient_checkpointing:
  true
- finetune.gradient_clipping_threshold:
  0.3
- finetune.input:
  "training_data"
- finetune.interrupt_train_steps:
  -1
- finetune.keep_intermediate_checkpoints:
  true
- finetune.learning_rate:
  0.000001
- finetune.load_as_bf16:
  true
- finetune.log_each_n_steps:
  1
- finetune.lora.alpha:
  16
- finetune.lora.base_model_4bit:
  false
- finetune.lora.base_model_8bit:
  false
- finetune.lora.bias:
  "none"
- finetune.lora.dropout:
  0.05
- finetune.lora.enabled:
  false
- world.environment_start_port:
  7777
- world.finetune_fraction:
  1
- world.preprocessor_fraction:
  0
- world.replicas:
  1
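
The listing above uses flattened, dot-separated keys (for example `finetune.lora.alpha`). A minimal sketch of rebuilding the nested structure from such keys follows. `unflatten` is a hypothetical helper, not a W&B or pipelinerl function, and the sample values are copied from the listing.

```python
def unflatten(flat: dict) -> dict:
    """Turn {'a.b.c': 1} into {'a': {'b': {'c': 1}}}.

    Assumes no key is both a leaf and a prefix of another key.
    """
    nested: dict = {}
    for dotted_key, value in flat.items():
        parts = dotted_key.split(".")
        node = nested
        # Walk (and create) intermediate dictionaries for all but the last part.
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


# A few entries copied from the config listing.
flat_config = {
    "finetune.lora.alpha": 16,
    "finetune.lora.dropout": 0.05,
    "finetune.lora.enabled": False,
    "world.replicas": 1,
}
config = unflatten(flat_config)
print(config["finetune"]["lora"])
```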

Summary metrics are your model's outputs.

No summary metrics saved for this run.
