Build and monitor AI Agents with deepset and Weights & Biases
Learn how the new deepset integration with Weights & Biases works—and how it can make building agentic AI systems a whole lot easier
💡 This post was authored by Loretta Shen and Agnieszka Marzec from deepset.
When building AI-powered applications, one of the most significant challenges is understanding how exactly they process inputs and arrive at outputs. AI systems, particularly those with complex workflows and multiple components, can often be "black boxes." You might see the output, but the internal processes are hard to track.
This lack of visibility creates real problems:
- Behavior is unpredictable: How can you trust the decisions your agent makes?
- Debugging becomes a guessing game: When something goes wrong, where did it fail?
- Optimization is difficult: Which parts of your AI system are creating bottlenecks?
Thankfully, the new deepset integration with Weights & Biases Weave can help.
Introducing deepset’s Weights & Biases Weave integration
That's where the new integration between the deepset AI Platform and Weights & Biases Weave comes in. Here’s how the collaboration works:
The deepset AI Platform is a toolset for building production-ready AI applications. Powered by Haystack, deepset’s open-source Python framework, it provides a modular approach to developing AI applications and agents.
The deepset AI Platform uses Haystack's Compound AI approach to building AI applications, which combines components into pipelines of varying size and complexity. Pipelines define the logic of your application and how information flows through it. They range from deterministic, predefined workflows to flexible agents with the ability to dynamically invoke tools and choose from multiple paths of action.
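To make that concrete, here is a minimal sketch of a small RAG-style pipeline in open-source Haystack. It assumes haystack-ai is installed and an OpenAI API key is available; the component choices are illustrative, not the exact configuration of a deepset template.

```python
from haystack import Document, Pipeline
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore

# A toy knowledge base; in production this would be a real document store.
document_store = InMemoryDocumentStore()
document_store.write_documents([
    Document(content="Compound AI systems combine components such as retrievers and LLMs into one pipeline."),
])

template = """Answer the question using the documents below.
{% for doc in documents %}
{{ doc.content }}
{% endfor %}
Question: {{ question }}"""

pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
pipeline.add_component("prompt_builder", PromptBuilder(template=template))
pipeline.add_component("generator", OpenAIGenerator(model="gpt-4o-mini"))  # needs OPENAI_API_KEY

# Connections define how information flows through the pipeline.
pipeline.connect("retriever.documents", "prompt_builder.documents")
pipeline.connect("prompt_builder.prompt", "generator.prompt")

question = "What is compound AI?"
result = pipeline.run({"retriever": {"query": question},
                       "prompt_builder": {"question": question}})
print(result["generator"]["replies"][0])
```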
Once your pipeline is running in deepset, you can test it, gather feedback, and monitor outputs. But like with any production system, you need detailed visibility into what’s happening under the hood to ensure reliability and performance.
W&B Weave helps developers evaluate, monitor, and iterate on agentic AI applications. A core capability of Weave is detailed tracing, with Weave Traces creating a detailed log of requests as they flow through your AI application. Tracing is essential for monitoring your application in production and documenting component interactions and data transformations in real-time so you can pinpoint performance bottlenecks and quickly diagnose issues as they arise.
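As a point of reference, here is roughly how tracing works when you use the Weave SDK directly from Python. This is a minimal sketch with a hypothetical answer_question function standing in for an LLM call; with the deepset integration described next, you don't need to write any of this yourself.

```python
import weave

# Initializing Weave creates (or opens) a project that collects traces.
weave.init("quickstart-tracing")

# Any function decorated with @weave.op is logged to Weave: its inputs,
# outputs, latency, and nested calls all show up as a trace.
@weave.op()
def answer_question(question: str) -> str:
    # Placeholder for an LLM call or pipeline run.
    return f"You asked: {question}"

answer_question("What does Weave trace?")
```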
With the deepset and W&B Weave pre-built connector, you can monitor your deepset pipelines in Weave and understand how your pipeline runs, making debugging easier. Together, they create a powerful combination for building, monitoring, and debugging AI applications and agents.
In this tutorial, we'll show you how to gain full visibility into your pipelines using deepset AI Platform and W&B Weave. You’ll learn how to build and trace an AI agent with rich, structured observability – turning a “black box” into a transparent, trustworthy system you can confidently deploy in production.
Step 1: Creating your AI pipeline
The deepset AI Platform makes it easy to get started with pre-built Agent and Application Templates. These templates provide a pre-configured pipeline with core agent functionality out of the box, allowing you to focus on customizing and extending the pipeline rather than building from scratch. In this example, we’ll use our deepset Research Agent template. It has two tools at its disposal: a RAG pipeline that searches a local knowledge base and a web search component that looks for answers online to augment local sources.

💡 Note: If you'd prefer to build your agent from scratch, you can follow this tutorial on building a tool-calling agent in Haystack.
The template uses our new Agent component, which decides which tool to use based on the user's query and context. The Agent component has three core building blocks (there's a code sketch after the list below):
- A chat model: Supplied through an underlying ChatGenerator, which processes and generates text. The Agent is provider-agnostic, so it works with any model; in this case, it uses Anthropic's Claude 3.7 Sonnet.
- A list of tools: These can be custom tools for a specific use case, pipeline components, or entire pipelines. This template registers a RAG pipeline and a web search component as tools.
- An exit condition: Defined using exit_condition. The Agent runs iteratively, calling tools and feeding their outputs back to the model until this condition is met. In this case, the exit condition is met once the model returns a text response.
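Here is the promised sketch of that setup in open-source Haystack. It assumes haystack-ai and the anthropic-haystack integration are installed, with ANTHROPIC_API_KEY and SERPERDEV_API_KEY set in your environment. The model identifier and the exit_conditions parameter name may differ slightly across versions (the platform labels it exit_condition), so treat this as illustrative rather than the template's exact definition.

```python
from haystack.components.agents import Agent
from haystack.components.websearch import SerperDevWebSearch
from haystack.dataclasses import ChatMessage
from haystack.tools import ComponentTool
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator

# Wrap a pipeline component as a tool the Agent can call. The deepset template
# also registers a RAG pipeline as a second tool; it is omitted here for brevity.
web_search_tool = ComponentTool(
    component=SerperDevWebSearch(top_k=5),  # needs SERPERDEV_API_KEY
    name="web_search",
    description="Search the web for up-to-date information.",
)

agent = Agent(
    chat_generator=AnthropicChatGenerator(model="claude-3-7-sonnet-latest"),  # needs ANTHROPIC_API_KEY
    tools=[web_search_tool],
    exit_conditions=["text"],  # stop once the model returns a plain text response
)

agent.warm_up()
result = agent.run(messages=[ChatMessage.from_user("What is agentic RAG?")])
print(result["messages"][-1].text)
```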

Step 2: Connecting deepset to Weave
You need a Weights & Biases API key to trace your deepset pipelines in Weave. Head over to the Weights & Biases signup page and set up your account; it takes just a few minutes. Your API key is displayed on your account home page, and that's all you need. Now, let's connect the deepset AI Platform to Weave.

To do this in deepset, click your initials to open the menu, and click "Connections."

Next, scroll down the list to find Weights & Biases, click Connect, and paste your API key.

That's it! The deepset AI Platform is now connected to Weave. You’re ready to trace.
Step 3: Adding tracing to your pipeline
The last step in setting up tracing is adding the WeaveConnector component to your pipeline. Its job is to collect trace data and send it to Weave, where you can view and analyze it. To do this, open your pipeline for editing in Pipeline Builder:

In the Component Library, open the "Connectors" group and drag WeaveConnector onto the canvas.

Finally, set the pipeline name in the component. This becomes the name of your Weave tracing project. You can use any name here, but matching the pipeline name helps keep things organized if you're tracing multiple pipelines.

Tracing is all set up now. The connector doesn’t need to be wired to anything else in the pipeline. It works quietly in the background. Now, it’s time to save and deploy your pipeline for testing.
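For the curious, the same component is available in open-source Haystack, so the Pipeline Builder step has a rough code equivalent. The sketch below is based on the Haystack Weave integration; double-check the package name (weave-haystack), the import path, and the HAYSTACK_CONTENT_TRACING_ENABLED environment variable against the current docs before relying on them.

```python
import os

# Content tracing should be enabled before Haystack is imported so that
# component inputs and outputs are included in the traces. A W&B API key
# (WANDB_API_KEY) is also expected in the environment.
os.environ["HAYSTACK_CONTENT_TRACING_ENABLED"] = "true"

from haystack import Pipeline
from haystack_integrations.components.connectors import WeaveConnector

pipeline = Pipeline()
# ... add and connect the rest of your components here ...

# The connector is not wired to anything; it simply ships trace data to the
# Weave project named after your pipeline.
pipeline.add_component("tracer", WeaveConnector(pipeline_name="research-agent"))
```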
Step 4: Testing your AI pipeline
Once your pipeline is deployed, you can test and refine it in deepset's Playground. You can interact with your AI pipeline, explore different configurations, watch how it processes queries in real time, and leave feedback on responses to guide optimization.
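You can also query a deployed pipeline programmatically instead of through the Playground. The sketch below is a rough illustration only: the endpoint path, request body, and the placeholder workspace and pipeline names are assumptions to verify against the deepset API reference.

```python
import requests

DEEPSET_API_KEY = "YOUR_DEEPSET_API_KEY"  # placeholder; create one in the deepset platform
WORKSPACE = "default"                      # assumed workspace name
PIPELINE = "research-agent"                # assumed pipeline name

# Assumed search endpoint shape; confirm against the current deepset API docs.
response = requests.post(
    f"https://api.cloud.deepset.ai/api/v1/workspaces/{WORKSPACE}/pipelines/{PIPELINE}/search",
    headers={"Authorization": f"Bearer {DEEPSET_API_KEY}"},
    json={"queries": ["What does our handbook say about remote work?"]},
    timeout=120,
)
response.raise_for_status()
print(response.json())
```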

The built-in debug mode provides visibility into component execution, letting you see parameters, properties, prompts, and documents for the current session.
To take your observability to the next level, especially for complex agentic systems, the Weave integration adds powerful complementary capabilities.
With Weave, you gain additional granular insight into your pipeline's inner workings, including how the Agent decides which tool to use, what data flows through the components, token usage metrics, LLM call costs, and detailed execution paths.
Step 5: Tracing in Weave
The moment you run the first query, Weave automatically creates a project using the name set in WeaveConnector.

Open the project and head to "Traces."

Here, you can see the inputs and outputs of each component:

You can also check the Agent messages:

In the main view, you can also see the number of tokens used, the latency, and the costs incurred:

Each haystack.pipeline.run event corresponds to a single user query. The pipeline invokes its run method and data starts flowing through the components. You can see the inputs and outputs of each component at runtime.

For example, you can see that for a particular query, the generator (that's the LLM powering the Agent) called the "local_search" tool and then finished. You can see the complete answer, the arguments the tool was called with, the number of tokens generated, and more.

Weave gives you a clear view of what each component does for every query. All activity is versioned and timestamped so it’s easy to keep track of changes.
Wrapping up
With deepset’s modular approach to building AI applications and Weave’s comprehensive tracing capabilities, developers can create more reliable, transparent AI systems.
By integrating W&B Weave with your deepset pipelines, you gain:
- Deep observability for every pipeline run – from component behavior and status to parameters and results
- Detailed tracing for easier debugging and iteration
- Clear visibility into agentic behavior
Try it out today and let us know what you think!