Monitoring Amazon Bedrock Agents with W&B Weave

Learn to build and monitor powerful AI agents with Amazon Bedrock and W&B Weave for automated workflows.
Created on December 23 | Last edited on March 1
Amazon Bedrock Agents are a powerful tool for automating complex tasks by combining foundational models, APIs, and knowledge bases. These agents enable organizations to streamline workflows, respond to user queries, and integrate AI capabilities into existing systems.
This article will explore what Bedrock Agents are, how they function, and how to track their internal steps and decision-making using W&B Weave. If you're already familiar with Amazon Bedrock Agents and simply want to get started with monitoring, feel free to skip ahead to the tutorial.

What are Amazon Bedrock Agents?

Amazon Bedrock Agents are AI-driven systems that handle intricate tasks by interacting with foundation models, APIs, and other data sources, enabling them to reason through user queries, retrieve information, and perform actions to accomplish specific goals.
These capabilities streamline workflows, improve operational efficiency, and enable more accurate responses, calculations, and multi-system workflows. To achieve this, agents utilize various tools to interact with external systems and knowledge bases. Additionally, agents can utilize memory to retain context from past interactions, leading to more coherent and personalized user experiences.
By seamlessly integrating foundational models, prompt templates, knowledge bases, action groups, and memory, Amazon Bedrock Agents provide a robust framework for automating complex workflows and enhancing the efficiency of AI-driven applications. They can even handle unexpected situations or errors, ensuring smooth operation and reliable performance.

How do Amazon Bedrock Agents work?

Amazon Bedrock Agents operate by processing user input through a series of steps orchestrated by a foundational AI model. This model, trained on a massive dataset, understands the nuances of human language and guides the agent's actions.
Upon receiving a request, the agent uses prompt templates to structure the information needed. It then consults its knowledge base, a repository of relevant information, to retrieve context and details. If external data is required, the agent leverages action groups to interact with APIs, enabling it to perform tasks like fetching real-time data, updating records, or sending notifications.
Throughout this process, the agent can utilize an optional memory feature to retain information from previous interactions, allowing for more coherent and personalized responses. This memory helps the agent maintain context and understand the user's needs better over time.

Why do you need to monitor Bedrock Agents?

Amazon Bedrock Agents are complex systems that rely on many components to function effectively, including foundation models, action groups, knowledge bases, and prompt templates. Each of these is a potential point of failure: a problem with the foundation model itself, with knowledge retrieval, with tool integration, or with decision-making logic can degrade the agent's ability to reason, retrieve information, and act, leading to suboptimal outcomes.
Fortunately, Bedrock Agents offer comprehensive monitoring capabilities, allowing you to track the agent's "thought process" at each step. This includes monitoring API calls, data retrieved from knowledge bases, and the logic behind every decision made. By examining these detailed traces, you can gain valuable insights into how the agent utilizes its resources, identify bottlenecks, and optimize its performance. This level of visibility empowers you to proactively address issues, refine the agent's design, and ensure it consistently delivers the best possible results for your users.

W&B Weave for agent monitoring

Monitoring the intricate workings of Amazon Bedrock Agents can be challenging, but W&B Weave provides a powerful solution. This library simplifies the logging process, allowing you to gain deep insights into your agent's behavior and performance.
By integrating Weave into your agent's workflow, you can automatically capture detailed information about each step, including the specific API calls made, the data retrieved from knowledge bases, and the model's intermediate outputs. The @weave.op decorator makes it easy to track your Python functions, automatically logging inputs, outputs, and even code changes.
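To make the idea concrete, here is a stdlib-only sketch of what op-style tracking does conceptually: record each call's inputs, output, and latency. This is not Weave's implementation (the real @weave.op also versions your code and streams everything to the Weave UI); it only illustrates the capture pattern.

```python
import functools
import time

def traced(fn):
    """A minimal stand-in for op-style tracking: records each call's
    inputs, output, and latency on the wrapped function. Weave's real
    @weave.op captures far more, but the capture pattern is the same."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        wrapper.calls.append({
            "inputs": {"args": args, "kwargs": kwargs},
            "output": result,
            "latency_s": time.perf_counter() - start,
        })
        return result
    wrapper.calls = []
    return wrapper

@traced
def lookup_store_city(store):
    # Toy lookup standing in for a real agent step
    stores = {"IKEA": "Seattle", "Costco": "Chicago"}
    return stores.get(store, "unknown")

print(lookup_store_city("IKEA"))     # Seattle
print(len(lookup_store_city.calls))  # 1
```

In practice you simply decorate your own functions with @weave.op and call weave.init(...) once, as the tutorial below does.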
Weave complements Bedrock's built-in tracing capabilities, providing a comprehensive view of the agent's activity. This data can be easily visualized and analyzed within Weave's platform, helping you identify bottlenecks, debug issues, and optimize performance. You can even leverage an LLM to summarize complex trace objects, making it easier to understand the agent's decision-making process and identify areas for improvement.
With Weave, you gain the visibility and insights needed to make data-driven decisions, refine your agent's design, and ensure it delivers the best possible outcomes for your users.

Tutorial: Monitoring Bedrock Agents with W&B Weave

In this tutorial, we will build and monitor an agent using Amazon Bedrock Agents and W&B Weave. Our agent will be designed to handle queries such as, "What is the weather at each of my stores?" by combining dynamic weather data with a pre-defined knowledge base of store locations.
The agent will use tools to integrate with APIs for fetching weather information and leverage a knowledge base containing the locations of the stores to provide relevant, context-aware responses. This capability will demonstrate the agent’s ability to connect user queries to actionable, data-driven answers.

Step One: Environment Setup

As with any AWS project, an AWS account is required. To access Bedrock Agents, you must configure your IAM permissions correctly, either by using an IAM admin role (separate from the root account) or by attaching the AmazonBedrockFullAccess policy to the role that will run your agent. To simplify the process of creating an admin account, I’ve provided a script that you can use, which I’ll link here.
Additionally, you will need to install and configure boto3 on your local system; for more details on how to set it up, check out my other article here. Here are the pip packages you will want to install for this tutorial:
pip install boto3 weave

Step Two: Creating "Tools" and a "Knowledge Base"

Once permissions are configured, we will proceed to create the foundational resources for our agent. These include a "tool" that allows the agent to check the weather for any location dynamically and a knowledge base containing the locations of our stores. These components enable the agent to combine weather data and store locations to provide accurate and relevant responses to user queries.
First, we will create an S3 bucket, which will contain a text document that serves as the knowledge base for our Bedrock agent. This text document will list the locations of our stores, formatted in a way that the agent can easily reference. The knowledge base is a crucial component that allows the agent to provide meaningful and context-aware responses, such as determining the weather for each store location.
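The document itself is deliberately simple: one "Store: <name>, City: <city>" line per store. A few lines of Python show how regular the format is; this parser is purely illustrative, since Bedrock's ingestion pipeline handles chunking and embedding for you.

```python
# Parse the "Store: <name>, City: <city>" convention used by the
# knowledge-base document back into structured records.
def parse_store_lines(text):
    records = []
    for line in text.strip().splitlines():
        fields = dict(part.split(": ", 1) for part in line.split(", "))
        records.append(fields)
    return records

doc = "Store: Walmart, City: New York\nStore: IKEA, City: Seattle"
print(parse_store_lines(doc))
# [{'Store': 'Walmart', 'City': 'New York'}, {'Store': 'IKEA', 'City': 'Seattle'}]
```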
Here’s the script that will automate this step:
import boto3
import botocore.exceptions

# Initialize AWS clients
bedrock = boto3.client('bedrock', region_name='us-east-2')  # Adjust region as necessary
s3 = boto3.client('s3', region_name='us-east-2')

# Step 1: Create the text file
def create_text_file(file_path):
    print("Creating text file...")
    data = [
        "Store: Walmart, City: New York",
        "Store: Target, City: Los Angeles",
        "Store: Costco, City: Chicago",
        "Store: Best Buy, City: Houston",
        "Store: IKEA, City: Seattle"
    ]
    with open(file_path, "w") as file:
        file.write("\n".join(data))
    print(f"Text file created: {file_path}")

# Step 2: Create the S3 bucket (if it doesn't exist)
def create_bucket(bucket_name, region='us-east-2'):
    try:
        print(f"Checking if bucket '{bucket_name}' exists...")
        s3.head_bucket(Bucket=bucket_name)
        print(f"Bucket '{bucket_name}' already exists.")
    except botocore.exceptions.ClientError:
        print(f"Bucket '{bucket_name}' does not exist. Creating it...")
        s3.create_bucket(
            Bucket=bucket_name,
            CreateBucketConfiguration={'LocationConstraint': region}
        )
        print(f"Bucket '{bucket_name}' created successfully.")

# Step 3: Upload the text file to S3
def upload_file_to_s3(bucket_name, file_path, s3_key):
    print("Uploading file to S3...")
    s3.upload_file(file_path, bucket_name, s3_key)
    print(f"File uploaded successfully: s3://{bucket_name}/{s3_key}")
    return f"s3://{bucket_name}/{s3_key}"

# Step 4: Main function to orchestrate the process
def main():
    # Update these variables
    bucket_name = "mybucket-agent-12345"     # Replace with a globally unique S3 bucket name
    file_path = "stores_and_cities.txt"      # Local file path for the text file
    s3_key = "stores/stores_and_cities.txt"  # Path to store the file in the S3 bucket

    # Step 1: Create the text file
    create_text_file(file_path)
    # Step 2: Create the bucket if it doesn't exist
    create_bucket(bucket_name)
    # Step 3: Upload the file to S3
    upload_file_to_s3(bucket_name, file_path, s3_key)

if __name__ == "__main__":
    main()


Step Three: Creating a Lambda function tool

Next, we will create a Lambda function that will fetch weather information for a given city name. This function will serve as a "tool" for the Bedrock agent, allowing it to dynamically retrieve real-time weather data when requested. The Lambda function will process requests from the agent, call a weather API, and return the current weather details for the specified city.
To create the function, first navigate to Lambda functions in the AWS console. Next, click the “Create function” button, which takes you to a screen where you can select the function's parameters. We will select a basic Python runtime for this tutorial.

After clicking create, you will see a source code editor. Simply paste the following code into the editor and click the “Deploy” button:
import json
import urllib.request
import urllib.parse
import urllib.error
import re

def lambda_handler(event, context):
    def build_response(body, status="success"):
        return {
            'response': {
                'actionGroup': event.get('actionGroup', 'unknown_actionGroup'),
                'function': event.get('function', 'unknown_function'),
                'functionResponse': {
                    'responseBody': {"TEXT": {"body": body}}
                }
            },
            'messageVersion': event.get('messageVersion', '1.0')
        }

    # Extract 'city' from parameters
    city = next((p['value'] for p in event.get('parameters', []) if p.get('name') == 'city'), None)
    if not city:
        return build_response('Error: Missing "city" parameter in request.')

    if not re.match(r'^[A-Za-z\s\-]+$', city):
        return build_response('Error: Invalid city name provided.')

    try:
        # Fetch weather data
        url = f"https://wttr.in/{urllib.parse.quote(city)}?format=j1"
        with urllib.request.urlopen(url) as response:
            if response.status == 200:
                data = json.loads(response.read())
                current = data.get('current_condition', [{}])[0]
                weather = (
                    f"Weather in {city}:\n"
                    f"Temperature: {current.get('temp_C', 'N/A')}°C\n"
                    f"Description: {current.get('weatherDesc', [{'value': 'N/A'}])[0].get('value', 'N/A')}\n"
                    f"Humidity: {current.get('humidity', 'N/A')}%"
                )
                return build_response(weather)
            else:
                return build_response(f"Error: Failed to fetch weather data. Status code: {response.status}")
    except urllib.error.HTTPError as e:
        return build_response(f"HTTP Error: {e.reason}")
    except urllib.error.URLError as e:
        return build_response(f"URL Error: {e.reason}")
    except Exception as e:
        return build_response(f"Internal Server Error: {str(e)}")
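You can sanity-check this event handling locally before deploying. The sketch below mirrors the handler's parameter extraction and response envelope against a simulated payload; the event shape follows what the handler above expects, so consult the Bedrock action-group documentation for the authoritative schema.

```python
# Pared-down versions of the handler's event handling, useful for
# checking the payload shape locally before wiring up the real Lambda.
def extract_city(event):
    # Pull the 'city' value out of the action-group parameters list
    return next((p["value"] for p in event.get("parameters", [])
                 if p.get("name") == "city"), None)

def build_response(event, body):
    # The response envelope Bedrock expects back from the tool
    return {
        "response": {
            "actionGroup": event.get("actionGroup", "unknown_actionGroup"),
            "function": event.get("function", "unknown_function"),
            "functionResponse": {"responseBody": {"TEXT": {"body": body}}},
        },
        "messageVersion": event.get("messageVersion", "1.0"),
    }

# A simulated payload matching what the handler above expects
sample_event = {
    "actionGroup": "weather_tools",
    "function": "get_weather",
    "parameters": [{"name": "city", "type": "string", "value": "Seattle"}],
    "messageVersion": "1.0",
}

print(extract_city(sample_event))  # Seattle
print(build_response(sample_event, "Sunny, 18°C")["response"]["actionGroup"])  # weather_tools
```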



Step Four: Creating our Agent

Next, we will create our agent.
To begin, navigate to the Amazon Bedrock Console and search for “Agents.” Click the option to create a new agent, which will take you to the agent configuration screen. Provide a unique name for the agent, such as StoreWeatherAgent, and select a foundation model. We're using Amazon's new Nova Pro model.


Once you’ve selected the model, move to the configuration phase where you’ll define the agent's behavior. In the Instructions section, you’ll define how the agent should process user queries. For example, you can instruct the agent to use the weather-checking tool to retrieve weather information for specific cities and reference the knowledge base to identify store locations.

Step Five: Linking our weather tool

Next, you need to associate tools with the agent.
Under the Tools section, add the Lambda function you created earlier. While adding the Lambda tool, configure a parameter called city. This parameter is required for the Lambda function to receive the city name dynamically during invocations. Set the parameter type to string and mark it as required to ensure proper functionality.


To ensure the Lambda function can be invoked by the Bedrock Agent, navigate to the Configuration > Permissions section in the AWS Lambda Console for the weather lambda function. Set the Principal to bedrock.amazonaws.com, specify the Source ARN of your Bedrock Agent (available in the Bedrock Console), and define the action as lambda:InvokeFunction. Save these permissions to allow the agent to call the Lambda function seamlessly.
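If you prefer the CLI to the console, the same resource-based permission can be added with aws lambda add-permission; the function name, statement ID, account ID, and agent ID below are placeholders to replace with your own values.

```shell
# Grant the Bedrock service permission to invoke the weather Lambda.
# Replace the function name and source ARN with your own values.
aws lambda add-permission \
  --function-name my-weather-function \
  --statement-id AllowBedrockAgentInvoke \
  --action lambda:InvokeFunction \
  --principal bedrock.amazonaws.com \
  --source-arn arn:aws:bedrock:us-east-2:123456789012:agent/AGENT_ID
```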


Step Six: Linking the knowledge base

Finally, configure the knowledge base for the agent.
Navigate to the Knowledge Bases section of the Bedrock Console and create a new knowledge base. Provide a name, such as knowledge-base-quick-start, and select Amazon S3 as the data source. Point the knowledge base to the S3 bucket we created earlier, which contains your store locations. Use the default parser to process the text, set the chunking strategy for embedding, and select an embedding model of your preference. After reviewing the configuration, create the knowledge base.


Once the knowledge base is ready, link it to the agent. In the agent's Knowledge Base section, add the new knowledge base and specify how it should be used. For example, instruct the agent to use it to retrieve store locations when responding to queries about store weather or locations.


Finally, save and deploy the agent. After deployment, you can test the agent’s functionality directly from the Bedrock Console. For example, you can ask, "What is the weather in New York?" or "What is the weather at each of my stores?" The agent will dynamically pass the city name to the Lambda function, retrieve weather data, and combine it with store location data from the knowledge base to generate a complete response.
With the agent created, tools integrated, and the knowledge base linked, the Bedrock Agent is now fully functional. However, before the agent can be invoked programmatically, you need to assign it an alias. Aliases allow you to manage different versions of your agent while maintaining a consistent identifier for integration.
To create an alias, navigate to the Aliases section within the Bedrock Agent console. Click the Create button, provide a unique alias name (e.g., v1), and associate it with the latest version of your agent. Once the alias is assigned, it will act as a stable reference point for invoking the agent, even as you update the underlying configuration or add new features. Save the alias configuration, and your agent is now ready for use.


Step Seven: Invoking our Bedrock Agent

To invoke the Bedrock Agent, you’ll use the AWS SDK for Python (boto3) to interact with the Bedrock Agent Runtime API. For more details on setting up boto3 on your system, check out another article I wrote using AWS Bedrock. The invocation process involves sending a query to the agent along with its unique identifier, alias, and session details. Additionally, you can enable trace mode, which provides detailed debugging information about the agent's execution, including the tools it used, the sequence of operations it followed, and any interactions with the linked knowledge base.
The following script initializes a connection to the Bedrock Agent Runtime client and invokes the agent. If enable_trace is set, detailed trace data is returned, outlining the steps taken by the agent during execution. These traces, while highly informative, are complex and can be challenging to parse manually.
To simplify this process, and to stay robust against potential changes in the API’s trace payload structure, I used a foundation model, Nova Pro, to summarize the traces into a concise and coherent explanation.
💡
This approach minimizes engineering effort while providing valuable insights into the agent's behavior.
import boto3
import json
import weave

weave.init("bedrock_invoke_agent")  # Initialize Weave tracking


def format_prompt_for_traces(traces, input_text):
    """
    Formats the traces into a prompt for the LLM.

    Parameters:
        traces (list): List of traces to be summarized.
        input_text (str): The original user query.

    Returns:
        str: The formatted prompt.
    """
    return f"""
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
Please analyze the following traces from an agent and provide a step-by-step summary:
Original user input: {input_text}
Traces:
{json.dumps(traces, indent=4, default=str)}

The summary should clearly outline:
- The main steps executed
- Any tools used
- Results retrieved from knowledge base
- Final outcome

Generate a coherent and concise summary.
<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
"""


def summarize_traces_with_llm(traces, input_text):
    """
    Summarizes traces using an LLM.

    Parameters:
        traces (list): The list of trace dictionaries.
        input_text (str): The original user query.

    Returns:
        str: A summary of the traces or an error message if summarization fails.
    """
    try:
        # Initialize the Bedrock runtime client
        client = boto3.client(service_name="bedrock-runtime", region_name="us-east-1")
        prompt = format_prompt_for_traces(traces, input_text)

        # Prepare messages for the Nova Pro API
        messages = [
            {
                "role": "user",
                "content": [{"text": prompt}]
            }
        ]

        # Invoke the model
        print("Invoking Nova Pro model for summarization...")
        response = client.converse(
            modelId="us.amazon.nova-pro-v1:0",
            messages=messages
        )

        # Extract response content
        prediction = response["output"]["message"]["content"][0]["text"].strip()
        return prediction

    except Exception as e:
        print(f"Error during LLM summarization: {e}")
        return "Error summarizing traces."

@weave.op
def invoke_agent(agent_id, agent_alias_id, session_id, input_text, enable_trace=False):
    """
    Invokes an Amazon Bedrock agent with optional debugging.

    Parameters:
        agent_id (str): The unique identifier of the agent.
        agent_alias_id (str): The alias of the agent.
        session_id (str): The session ID for maintaining context.
        input_text (str): The prompt or query to send to the agent.
        enable_trace (bool): Whether to enable trace mode for debugging.

    Returns:
        dict: A dictionary with 'response', 'traces', and a summary.
    """
    try:
        # Initialize the Bedrock Agent Runtime client
        bedrock_client = boto3.client("bedrock-agent-runtime", region_name="us-east-2")

        print("Invoking agent...")
        print(f"Agent ID: {agent_id}, Alias ID: {agent_alias_id}, Session ID: {session_id}")
        print(f"Input Text: {input_text}, Enable Trace: {enable_trace}")

        # Call the invoke_agent API
        response = bedrock_client.invoke_agent(
            agentId=agent_id,
            agentAliasId=agent_alias_id,
            sessionId=session_id,
            inputText=input_text,
            enableTrace=enable_trace  # enable tracing
        )

        # Initialize variables for response and trace
        completion = ""
        trace_data = []

        # Iterate over the response stream
        for event in response["completion"]:
            try:
                if "chunk" in event:
                    completion += event["chunk"]["bytes"].decode()
                else:
                    trace_data.append(event)  # Collect non-chunk trace data
            except Exception as e:
                print(f"Error processing event: {event}. Exception: {str(e)}")

        # Summarize traces using the LLM
        summary = summarize_traces_with_llm(trace_data, input_text)

        # Return the response, trace data, and summary
        return {
            "response": completion,
            "traces": trace_data,
            "summary": summary
        }

    except Exception as e:
        print(f"Exception occurred: {str(e)}")
        return {
            "error": str(e),
            "response": None,
            "traces": None,
            "summary": "Error during invocation or summarization."
        }

if __name__ == "__main__":
    # Replace with your agent details
    agent_id = "RTWDM0FTEU"         # Your agent ID
    agent_alias_id = "JFMNFQSLDS"   # Your agent alias ID
    session_id = "default-session"  # A unique session ID
    input_text = "what is the weather in each of the locations of my stores?"
    enable_trace = True             # Toggle tracing

    print(f"Invoking agent with input: '{input_text}'")

    # Call the invoke_agent function
    result = invoke_agent(agent_id, agent_alias_id, session_id, input_text, enable_trace)

    # Output the results
    print("Agent Response:")
    print(result["response"])

    print("\nTraces:")
    print(json.dumps(result["traces"], indent=4, default=str))

    print("\nSummary:")
    print(result["summary"])
After running our script, we can check out our traces inside Weave.
Here, we log our input, output, as well as the traces and trace summary inside Weave. This allows us to keep an eye on not only the Agent's final responses but also the reasoning and steps it took to arrive at those responses. This level of visibility is crucial when debugging issues, as it provides a detailed breakdown of how the agent processes inputs and interacts with tools and knowledge bases.
For example, the trace summary generated by the LLM not only highlights the key steps performed by the agent but also provides insight into the specific tools invoked, the data fetched, and any intermediate decisions made during execution. This kind of insight is invaluable for identifying bugs and validating the agent's logic.
Furthermore, by using Weave to visualize this data, patterns and anomalies can quickly be identified in real-time, allowing us to make data-driven decisions for improving the agent. For instance, if a particular tool invocation is repeatedly causing errors, we can isolate and address the issue quickly.
Similarly, tracking trace summaries can reveal areas where the agent's workflow might benefit from optimization or refinement.
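The stream-handling loop at the heart of invoke_agent is also easy to exercise locally with a mocked completion stream: "chunk" events carry UTF-8 bytes of the final answer, while everything else is collected as trace material. The mocked events below are simplified stand-ins; real Bedrock trace events are far richer.

```python
def split_stream(events):
    """Mimics the loop in invoke_agent: concatenate 'chunk' bytes into
    the final answer and collect every other event as trace data."""
    completion, traces = "", []
    for event in events:
        if "chunk" in event:
            completion += event["chunk"]["bytes"].decode()
        else:
            traces.append(event)
    return completion, traces

# A mocked stream; real Bedrock events carry much richer trace payloads.
mock_stream = [
    {"trace": {"orchestrationTrace": {"rationale": "look up store cities"}}},
    {"chunk": {"bytes": b"Weather in Seattle: "}},
    {"chunk": {"bytes": "13°C, cloudy.".encode()}},
]

answer, traces = split_stream(mock_stream)
print(answer)       # Weather in Seattle: 13°C, cloudy.
print(len(traces))  # 1
```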


This integration of W&B Weave with Bedrock Agents creates a powerful framework for monitoring your agent's performance. It ensures that all interactions, from the input query to the trace-level reasoning, are captured and accessible, enabling robust monitoring, debugging, and iterative improvement of the agent's performance.
As AI agents become increasingly capable, tools like Weave will be essential for ensuring the reliability of these agentic workflows.

Use-cases

Specialized agents powered by Amazon Bedrock are poised to revolutionize diverse industries by automating complex tasks and augmenting human capabilities.
  • Imagine a Code Debugger Agent that not only identifies errors but also automatically generates fixes by referencing code repositories and documentation.
  • Or a Research Copilot Agent that continuously analyzes new publications, identifies relevant research gaps, and even suggests potential research directions.
  • In customer service, an AI-powered agent could handle a wide range of inquiries, from providing product information and tracking orders to resolving complex technical issues.
By monitoring these agents, companies can identify common customer pain points, optimize responses, and ensure consistent, high-quality service.
These are just a few examples of how specialized agents can transform workflows. By effectively monitoring these agents, we can ensure accuracy, efficiency, and continuous improvement across various domains.

Conclusion

Amazon Bedrock Agents offer a powerful platform for creating intelligent, context-aware solutions that streamline workflows and enhance efficiency. By incorporating real-time data fetching, robust monitoring with tools like W&B Weave, and detailed trace analysis, you can unlock the full potential of these AI-driven systems.
As workflows become more complex, monitoring and analyzing agent behavior is essential for ensuring reliability, optimizing performance, and adapting to evolving needs. Whether you're building specialized agents for research, customer service, or any other domain, Bedrock provides the flexibility and intelligence to drive innovation and success.
Ready to explore the possibilities? Start building your own Amazon Bedrock Agents today.


Iterate on AI agents and models faster. Try Weights & Biases today.