
Apiche's group workspace

debug_actor_mcp_tir


debug_actor_mcp_tir/finetune

State
Crashed
Start time
August 18th, 2025 2:56:59 PM
Runtime
2h 36m 1s
Tracked hours
-
Run path
apiche/pipeline-rl/debug_actor_mcp_tir_finetune
OS
Linux-5.15.0-1067-nvidia-x86_64-with-glibc2.39
Python version
CPython 3.11.11
Git repository
git clone git@github.com:ServiceNow/pipelinerl.git
Git state
git checkout -b "debug_actor_mcp_tir/finetune" d2e6d09deb3a2efddd79d8e42d14ac2ca8e6101f
Command
pipelinerl/entrypoints/run_finetune.py --config-dir results/debug_actor_mcp_tir/conf --config-name exp_config output_dir=results/debug_actor_mcp_tir hydra.run.dir=results/debug_actor_mcp_tir/finetune +me.weight_update_group_init_method=tcp://localhost:9000 +me.weight_update_group_world_size=3 +me.llm_urls=http://localhost:8080+http://localhost:8081
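
The command above passes its settings as Hydra-style `key=value` overrides, where a leading `+` adds a key that is not already in the config. As a minimal sketch for reading them back when debugging a crashed run, the snippet below collects the overrides into a dictionary. The helper name `parse_overrides` is hypothetical, and splitting `me.llm_urls` on `+` is an assumption based only on how the value looks in this command; the page does not confirm how PipelineRL itself parses it.

```python
import shlex

# Command copied from the run page above.
COMMAND = (
    "pipelinerl/entrypoints/run_finetune.py "
    "--config-dir results/debug_actor_mcp_tir/conf "
    "--config-name exp_config "
    "output_dir=results/debug_actor_mcp_tir "
    "hydra.run.dir=results/debug_actor_mcp_tir/finetune "
    "+me.weight_update_group_init_method=tcp://localhost:9000 "
    "+me.weight_update_group_world_size=3 "
    "+me.llm_urls=http://localhost:8080+http://localhost:8081"
)


def parse_overrides(command: str) -> dict[str, str]:
    """Hypothetical helper: collect key=value overrides from a command line."""
    overrides: dict[str, str] = {}
    for token in shlex.split(command):
        # Skip the script path, --flags, and the bare values that follow flags.
        if token.startswith("--") or "=" not in token:
            continue
        key, value = token.split("=", 1)
        # Drop the leading '+' that marks a newly added key.
        overrides[key.lstrip("+")] = value
    return overrides


overrides = parse_overrides(COMMAND)

# Assumption: '+' separates the individual LLM server URLs.
llm_urls = overrides["me.llm_urls"].split("+")
world_size = int(overrides["me.weight_update_group_world_size"])

for key, value in overrides.items():
    print(f"{key} = {value}")
print("LLM URLs:", llm_urls)
print("Weight update group world size:", world_size)
```

Listing the overrides this way makes it easier to compare what was requested on the command line against the resolved values shown in the Config section.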
System Hardware
CPU count 112
Logical CPU count 224
GPU count 4
GPU type NVIDIA H100 80GB HBM3
W&B CLI Version
0.19.11
Config

Config parameters are your model's inputs.

  • {} 207 keys
    • "False"
    • "no"
    • null
    • 1
    • 64
    • 0
    • 64
    • 64
    • "pipelinerl.domains.tir_mcp.generate_math_rollout2"
    • 1
    • 10,000,000
    • "Please reason step by step, and put your final answer within \boxed{}."
    • "{task}"
    • 50
    • 3
    • "tapeagents.agent.Agent"
    • 5
    • "mcp_agent"
    • [] 5 items
      • true
      • "You have access to the following tools: {tools_description} "
      • "You have access to the following tools: {tools_description} "
      • "Output only a single JSON dict. Do not repeat the last thought again. If the last action does not change the observation, do not repeat it! DO NOT OUTPUT ANYTHING BESIDES THE JSON! DO NOT PLACE ANY COMMENTS INSIDE THE JSON. It will break the system that processes the output. "
      • "You are an expert AI Agent trained to assist users with complex information processing tasks. Your role is to understand user queries and respond in a helpful and accurate manner. Keep your replies concise and direct. Prioritize clarity and avoid over-elaboration. Do not express emotions or opinions about user questions. You must use the python tool for computation. "
      • "Important! Respond with the plain text, do not include any JSON or code. Do not output anything besides what I asked in this message. "
      • 1
      • "nccl"
      • "pipelinerl.domains.math.load_datasets"
      • "False"
      • ""
      • true
      • null
      • false
      • "deepspeed_stage3_bf16"
      • "DeepSpeedPlugin(hf_ds_config=<accelerate.utils.deepspeed.HfDeepSpeedConfig object at 0x7ffc769b7910>, gradient_accumulation_steps=1, gradient_clipping='auto', zero_stage=3, is_train_batch_min=True, offload_optimizer_device='none', offload_param_device='none', offload_optimizer_nvme_path='none', offload_param_nvme_path='none', zero3_init_flag=True, zero3_save_16bit_model=True, transformer_moe_cls_names=None, enable_msamp=False, msamp_opt_level='O1')"
      • "cuda:0"
      • "DistributedType.DEEPSPEED"
      • "TorchDynamoPlugin(backend=<DynamoBackend.NO: 'NO'>, mode='default', fullgraph=False, dynamic=None, options=None, disable=False, use_regional_compilation=False)"
      • "pipelinerl.domains.tir_mcp.env_server.MCPEnvironmentServer"
      • "results/debug_actor_mcp_tir/env_server"
      • "localhost"
      • "pipelinerl.domains.math.MathEnvironment"
      • "/home/toolkit/research-now-reasoner/pipelinerl/conf/mcp/python.json"
      • "tapeagents.mcp.MCPEnvironment"
      • [] 1 item
        • "run_python_code"
      • 8
      • 46 ... 202 (entries collapsed in the page view)
      • 7,777
      • 4
      • 0
      • 1
    Summary

    Summary metrics are your model's outputs.

    No summary metrics saved for this run.
