Agent Reinforcement Trainer (ART)

Train, evaluate, and iterate on LLM agents in hours, not weeks

Agent Reinforcement Trainer (ART) is an open-source framework for training agentic LLMs to improve performance and reliability through experience. It provides a lightweight interface to reinforcement learning (RL) algorithms such as Group Relative Policy Optimization (GRPO), boosting model quality while minimizing training costs. ART integrates with W&B Training Serverless RL, eliminating provisioning and management overhead and providing instant access to elastic GPU capacity.


RL made accessible to everyone

Reinforcement learning tooling is still maturing, and the algorithms are intricate: small, nuanced mistakes can prevent convergence or cause training instability. ART abstracts that complexity behind a developer-friendly API, so you can train models without wrestling with fragile scripts or algorithmic details.
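
To make that API claim concrete, here is a minimal training-loop sketch: you write the rollout and the reward, and ART runs the GRPO updates. The names used (`art.TrainableModel`, `art.Trajectory`, `art.TrajectoryGroup`, `art.gather_trajectory_groups`, `LocalBackend`, `register`, `train`, `openai_client`) follow ART's published examples but should be treated as assumptions; the base model, dataset, and reward rule are placeholders.

```python
import asyncio

import art
from art.local import LocalBackend  # assumption: local single-GPU backend, as in ART's examples


async def rollout(model: art.Model, question: str, answer: str) -> art.Trajectory:
    """One episode: ask the model a question and reward a matching answer."""
    client = model.openai_client()  # OpenAI-compatible client for the model being trained
    messages = [{"role": "user", "content": question}]
    completion = await client.chat.completions.create(model=model.name, messages=messages)
    reply = completion.choices[0].message.content or ""
    reward = 1.0 if answer.lower() in reply.lower() else 0.0
    return art.Trajectory(
        messages_and_choices=[*messages, completion.choices[0]],
        reward=reward,
    )


async def main():
    model = art.TrainableModel(
        name="demo-agent",
        project="art-demo",
        base_model="Qwen/Qwen2.5-7B-Instruct",
    )
    backend = LocalBackend()
    await model.register(backend)  # the backend owns GPUs, checkpoints, and GRPO updates

    dataset = [("What is 2 + 2?", "4"), ("What is the capital of France?", "Paris")]
    for _ in range(10):  # a handful of training steps
        groups = await art.gather_trajectory_groups(
            art.TrajectoryGroup(rollout(model, q, a) for _ in range(8))  # several rollouts per prompt,
            for q, a in dataset                                          # so GRPO can compare them
        )
        await model.train(groups)


asyncio.run(main())
```

The essential contract is small: your code produces rewarded trajectories, and ART turns groups of them into policy updates.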

Why ART?

Built for real-world agents

Real user interactions are multi-turn. ART supports multi-turn rollouts, so your agent learns from realistic conversations and performs reliably in production.
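
A multi-turn rollout is just a loop that alternates agent turns with environment or tool responses and packs the whole conversation into a single rewarded trajectory. The sketch below is illustrative, not a prescribed ART pattern: the SEARCH tool protocol, the scorer, and the client usage are assumptions layered on the training-loop sketch above.

```python
import art


def fake_search(query: str) -> str:
    # Stand-in for a real tool call; always returns the same snippet.
    return "Doc snippet: ART trains agents with GRPO."


def score_answer(answer: str) -> float:
    # Hypothetical task-specific scorer used as the episode reward.
    return 1.0 if "GRPO" in answer else 0.0


async def multi_turn_rollout(model: art.Model, task: str, max_turns: int = 6) -> art.Trajectory:
    """One multi-turn episode: the agent may issue several SEARCH turns before answering."""
    client = model.openai_client()  # assumption: OpenAI-compatible client, as in the sketch above
    messages = [
        {"role": "system", "content": "Solve the task. To look something up, reply 'SEARCH: <query>'."},
        {"role": "user", "content": task},
    ]
    for _ in range(max_turns):
        completion = await client.chat.completions.create(model=model.name, messages=messages)
        reply = completion.choices[0].message.content or ""
        messages.append({"role": "assistant", "content": reply})
        if reply.startswith("SEARCH:"):
            # Feed the tool result back to the agent as the next turn of context.
            messages.append({"role": "user", "content": fake_search(reply.removeprefix("SEARCH:"))})
        else:
            break  # the agent produced its final answer
    # The whole conversation becomes one trajectory, rewarded on the final answer.
    return art.Trajectory(messages_and_choices=messages, reward=score_answer(messages[-1]["content"]))
```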

Drop‑in integration

ART's OpenAI-compatible chat endpoint slots straight into your existing code or into frameworks like CrewAI, the OpenAI Agents SDK, and LangGraph.
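
Concretely, any OpenAI-compatible client (and therefore any framework built on one) can talk to the model ART is serving; only the base URL, API key, and model name change. The values below are placeholders for your own deployment.

```python
from openai import OpenAI

# Point the standard OpenAI client at the inference endpoint ART exposes.
# Base URL, API key, and model name are placeholders for your deployment.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="placeholder-key")

response = client.chat.completions.create(
    model="demo-agent",
    messages=[{"role": "user", "content": "Summarize my unread emails from this week."}],
)
print(response.choices[0].message.content)
```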

Works flexibly with existing code

ART provides wrappers to plug RL training into existing apps and abstracts the training server into a modular service your code needn’t touch.
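
In practice that separation means your rollout and reward code only ever sees `model`; where training actually runs is a one-line choice at registration time. As before, the backend and class names are assumptions drawn from ART's examples.

```python
import asyncio

import art
from art.local import LocalBackend  # assumption: module path as in ART's examples


async def setup() -> art.TrainableModel:
    model = art.TrainableModel(
        name="demo-agent",
        project="art-demo",
        base_model="Qwen/Qwen2.5-7B-Instruct",
    )
    # Train on a local GPU today; swap in a remote or serverless backend later
    # without changing any rollout, reward, or application code.
    backend = LocalBackend()
    await model.register(backend)
    return model


model = asyncio.run(setup())
```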
