
Amazon Bedrock AgentCore observability guide

Learn how Amazon Bedrock AgentCore enables secure, scalable AI agent deployment with built‑in observability—and how integrating W&B Weave enhances traceability and performance insights for developers.
In the era of AI-driven applications, building autonomous agents that can operate reliably at scale is crucial. Amazon Bedrock AgentCore is a powerful suite of services designed to do precisely that.
In this article, we'll focus on how AgentCore supports large-scale agent deployment with robust observability, how integrating W&B Weave can enhance that observability, and how developers and AI practitioners can leverage AgentCore and Weave to create effective, trustworthy AI agents in production.
Let's jump in.


Amazon Bedrock AgentCore

Amazon Bedrock AgentCore enables organizations to move AI agents from prototypes to production-ready systems. It combines enterprise-grade security and reliability with the flexibility to use any open-source AI framework or model. In practice, this means you don’t have to choose between cutting-edge open-source tools and secure, scalable infrastructure. AgentCore provides both.
A major highlight of AgentCore is its focus on observability: it offers real-time insight into agent behavior and performance, which is essential for debugging and optimizing complex AI workflows. By enhancing these observability features with W&B Weave, developers get even deeper visibility into an agent’s decision-making process with interactive dashboards and trace visualization.
Ultimately, this leads to reduced deployment headaches, improved debugging capabilities, and increased confidence in the agents built with both solutions.

What is Amazon Bedrock AgentCore?

Amazon Bedrock AgentCore helps teams move AI agents from concept to scale by combining enterprise-grade security with the flexibility to use open-source frameworks and any LLM. It delivers live observability through CloudWatch, tracking token usage, latency, tool calls, sessions, and error rates. By integrating W&B Weave, you gain fine-grained trace visualizations and interactive dashboards that expose each decision step. The result? Easier deployments, faster debugging, and greater confidence in production AI agents.
In simple terms, Amazon Bedrock AgentCore provides much of the infrastructure and tooling needed to run AI-driven agents in production, eliminating the need for developers to manage servers or complex backend systems. AgentCore is framework-agnostic and model-agnostic: you can use it with popular open-source agent frameworks (like LangGraph, CrewAI, or Strands) as well as any large language model (whether from AWS Bedrock’s model hub or external providers).
This flexibility means you can accelerate AI agents from proof-of-concept to production while maintaining enterprise-grade security and reliability. AgentCore also enables organizations to scale their agents to potentially millions of users by handling aspects such as computation, memory, and authentication, allowing developers to focus on the agent’s logic and capabilities rather than infrastructure.

Key components of Amazon Bedrock AgentCore

Amazon Bedrock AgentCore is composed of several modular services, each responsible for a different aspect of an AI agent’s operation. These components can be used together seamlessly or integrated independently as needed:
  • AgentCore Runtime: A secure, serverless environment for hosting and scaling AI agents. The Runtime supports interactive workloads with low latency and also long-running tasks (up to eight hours) for complex reasoning. Each agent session runs in an isolated micro-VM, providing true session isolation for security. Developers can deploy agents on this runtime without worrying about servers, and it works with any framework or protocol, enabling quick cloud deployment of local agent code.
  • AgentCore Identity: A managed identity and access service for agents. AgentCore Identity integrates with existing identity providers (like Amazon Cognito, Okta, or Microsoft Entra ID), so you can authenticate users and agents without rebuilding your auth system. It provides a secure token vault and just-enough access permissions, allowing agents to securely access AWS resources and third-party APIs on a user’s behalf. This ensures agents operate within safe privilege levels and maintain trustworthy interactions with other services.
  • AgentCore Memory: A scalable memory store for AI agents to maintain context. Memory offers both short-term memory for multi-turn conversations and long-term memory that can be shared across sessions or even across different agents. It eliminates the need for developers to manage complex memory databases, yet gives full control over what the agent “remembers.” By using AgentCore Memory, you can build context-aware agents with high accuracy in recalling relevant information, improving coherence and personalization in agent responses.
  • AgentCore Code Interpreter: A secure code execution sandbox that lets agents run code for advanced reasoning or data processing. This tool allows AI agents to execute Python code (and potentially other languages) in a controlled environment, which is useful for tasks like performing calculations, data analysis, or generating visualizations on the fly. The code interpreter is isolated from the rest of the system, so agents can perform complex computations without risking the host environment’s security. Developers can also configure these sandboxes (e.g. choosing specific compute resources or libraries) to meet their application’s needs.
  • AgentCore Browser: A cloud-based browser automation tool that enables AI agents to interact with websites and web applications. The AgentCore Browser provides a headless, secure browser instance that an agent can use to navigate pages, click buttons, fill forms, or scrape information from the web – all at scale and with enterprise security. This built-in tool means an agent can perform web-based tasks (such as checking a news site or submitting data to a web form) without custom browser automation code or external services, and without exposing the system to the risks of an uncontrolled web browser.
  • AgentCore Gateway: A service that lets agents discover and use external tools and APIs easily. With Gateway, you can take existing APIs, AWS Lambda functions, or cloud services and transform them into agent-compatible tools. Essentially, it eliminates weeks of custom integration work by providing a bridge between your agent and the outside services it needs. The Gateway handles authentication and security for these tool calls, so your agent can, for example, call an internal company API or a third-party service safely. This dramatically simplifies extending an agent’s capabilities with new actions or data sources, since you don’t have to build the integration from scratch.
  • AgentCore Observability: A built-in observability and monitoring service for tracing and debugging agents. AgentCore Observability provides unified dashboards (powered by Amazon CloudWatch) where you can watch your agent’s performance in real time. It collects telemetry on key metrics such as token usage, response latency, session duration, and error rates. Moreover, it can trace each step of an agent’s workflow – including which tools were invoked and how the model responded at each step – giving developers deep visibility into the agent’s decision-making process. This is crucial for debugging, auditing, and maintaining quality as agents operate in production.

How does AgentCore Runtime ensure secure deployment?

Security is at the heart of AgentCore’s design, especially in the Runtime environment. First, each agent session in the AgentCore Runtime runs inside its own isolated microVM. This means that the CPU, memory, and file system for one agent session are completely separated from any other session. If one agent (or one user’s session) encounters an error or even a malicious input, it cannot escape its sandbox or affect other running agents. When a session ends, the microVM is terminated and its memory is wiped (sanitized), ensuring no sensitive data persists between sessions. This level of isolation protects against cross-session data leaks and makes deployment more secure by design.
AgentCore’s security extends to how agents access external resources. The Runtime is tightly integrated with AgentCore Identity, which handles authentication tokens and permissions. Each agent is given just-enough access to perform its tasks. For example, if an agent needs to read from an S3 bucket or call a third-party API, AgentCore Identity can provide temporary scoped credentials or API keys for that specific purpose. This ensures that agents operate with the principle of least privilege – they only have access to the resources they truly need, for the duration they need them.
By using secure token delegation and compatibility with enterprise identity providers, AgentCore Runtime prevents unauthorized access and protects user data. In short, the combination of isolated execution environments and robust identity management means developers can deploy AI agents with confidence that the underlying infrastructure and access points are secure.

How can developers integrate AgentCore into their AI frameworks?

Integrating AgentCore services into your AI framework involves using the provided SDKs and APIs to tie in features like the Runtime, Memory, and tools to your agent’s code. AgentCore is modular, so you can adopt it gradually – for instance, start by using the AgentCore Runtime to host your agent, then add AgentCore Memory for persistence, and so on. Amazon provides a unified SDK that wraps all these capabilities, making it straightforward to incorporate them without significant refactoring.
In practice, if you have an agent built with a popular framework (say LangChain or another open-source library), you can deploy it on AgentCore with only a few modifications. For example, you might initialize an AgentCore Runtime session in place of your local runtime, or use the SDK to register your agent code so that it runs in the cloud environment. The AgentCore Runtime can take your existing agent logic and transform it into a cloud-native deployment with just a few lines of code. This works seamlessly with frameworks like LangGraph, Strands, CrewAI, or even custom agent implementations – you don’t have to rewrite your agent from scratch.
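For a concrete picture of what "a few lines of code" can look like, here's a minimal sketch using the bedrock-agentcore starter SDK with a Strands agent. Note that the import path and entrypoint signature here are assumptions based on the SDK's quickstart pattern, so verify them against the current documentation:

# Minimal hosting sketch (assumes the bedrock-agentcore starter SDK;
# verify import paths and signatures against the current docs).
from bedrock_agentcore.runtime import BedrockAgentCoreApp
from strands import Agent

app = BedrockAgentCoreApp()
agent = Agent()  # your existing agent logic, unchanged

@app.entrypoint
def invoke(payload: dict) -> dict:
    # AgentCore Runtime delivers the request payload; forward the prompt
    # to the agent and return its reply.
    result = agent(payload.get("prompt", "Hello"))
    return {"result": str(result.message)}

if __name__ == "__main__":
    app.run()  # local dev server; AgentCore hosts the same entrypoint in the cloud

The agent object itself is untouched; only the thin serving wrapper changes.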
Beyond the runtime hosting, you can integrate other services similarly. Suppose your agent needs long-term memory: you can create an AgentCore Memory resource via the API and have your agent store and retrieve conversation context from it, instead of using an in-memory Python list or external database. If your agent calls external APIs or AWS services, you can route those calls through AgentCore Gateway, which exposes them as tools the agent can invoke securely. Integration might be as simple as defining the API endpoint to the Gateway and letting AgentCore handle the authentication and tooling interface. Likewise, if your agent would benefit from executing code, you can invoke the AgentCore Code Interpreter by sending code snippets to it through the SDK, rather than executing code directly in your agent process.
Throughout this process, AgentCore’s compatibility with any model means you’re not locked to a specific ML provider. You could be using Amazon’s Titan models, OpenAI’s models, or any other via Bedrock’s interface – AgentCore will support your choice. The key takeaway is that Amazon Bedrock AgentCore plays nicely with your existing development workflow: it augments your agents with cloud-based support (hosting, memory, identity, tools) without requiring you to abandon your preferred frameworks or models. By integrating these services, developers can rapidly scale up their AI agents and ensure they are secure and observable, all with minimal code changes.

How does AgentCore Observability enhance performance visibility?


AgentCore Observability provides developers with real-time visibility into their AI agents through integrated metrics and tracing. Out of the box, AgentCore automatically collects a range of metrics on your agents and the resources they use. These metrics are fed into Amazon CloudWatch (with a dedicated Generative AI Observability dashboard) where you can see key performance indicators at a glance. For example, you can monitor how many tokens an agent is using per request, the latency of each model inference call, the duration of sessions, and error rates or failure counts. Having these metrics readily available means you can quickly spot anomalies – such as a spike in error rate or unusually long response times – and act before they become serious problems.
Beyond basic metrics, AgentCore’s observability includes detailed tracing of agent workflows. Every action an agent takes (like invoking a tool via the Gateway, retrieving from Memory, calling a foundation model, etc.) can be traced and logged. The system uses OpenTelemetry-compatible telemetry data, which means it can capture spans and traces for each step of an agent’s execution. Developers can dive into these traces to see, for instance, how the agent decided on a particular tool, how long each step took, and what the intermediate results were. This level of detail is invaluable for debugging complex agent logic or misbehaviors because you can follow the agent’s thought process recorded in the traces.
AgentCore Observability is also built to integrate with existing monitoring systems. Since it relies on standard telemetry (and CloudWatch), you can export or forward this telemetry to other tools if needed, or combine it with your application’s broader monitoring. In summary, AgentCore’s observability features enhance performance visibility by providing unified dashboards and logs for your agent’s operations. Developers get a live window into how their agents are performing and why, which helps not only in troubleshooting but also in optimizing the agent’s design (for example, adjusting prompts or tool usage based on observed behavior). Next, we’ll see how to take these insights even further by using W&B Weave for enhanced observability and debugging capabilities.

Tutorial: Enhancing AgentCore observability with W&B Weave

While Amazon Bedrock AgentCore provides solid baseline monitoring for AI agents, you can gain deeper insights by using external observability tools. Weights & Biases (W&B) Weave is a powerful tool that complements AgentCore by capturing detailed traces and enabling interactive visualization of an agent's behavior. In this tutorial, we'll walk through a complete, working guide on how to use W&B Weave to enhance the observability of an AgentCore-powered AI agent.

Prerequisites

  • Python 3.11
  • AWS account with Bedrock access
  • W&B account (free at wandb.ai)
  • AWS credentials configured (via AWS CLI, environment variables, or IAM roles)
  • Docker
  • uv installed (via curl -LsSf https://astral.sh/uv/install.sh | sh)

Step 1: Set up W&B Weave in your environment

First, install the required packages:
pip install wandb boto3 strands-agents strands-agents-tools weave
Next, authenticate with W&B and initialize Weave in your script:
import wandb
import weave
import boto3
import json
from datetime import datetime
from typing import Dict, Any, Optional

# Login to W&B (you'll be prompted for your API key if not already logged in)
wandb.login()

# Initialize Weave with a project name
weave.init("bedrock-agent-observability")

Step 2: Enable Bedrock model access

To follow this tutorial, you need an AWS account with Bedrock access in your preferred region. Claude model access must be explicitly enabled for your account and region in the AWS Console under Bedrock. For this tutorial, I used anthropic.claude-3-5-sonnet-20240620-v1:0 in us-east-1. For more details on how to enable the model, see this previous tutorial.
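A quick way to confirm model access is a one-off call through the Bedrock runtime client (shown for us-east-1; adjust the region and model ID to match your setup):

import boto3

# One-off sanity check: if model access isn't enabled in this region,
# this call raises an AccessDeniedException.
brt = boto3.client("bedrock-runtime", region_name="us-east-1")
resp = brt.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": [{"text": "ping"}]}],
)
print(resp["output"]["message"]["content"][0]["text"])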

Step 3: Set up your local project and IAM permissions

Create your working directory and set up the Python environment using uv. Inside your workspace, run:
mkdir my-custom-agent && cd my-custom-agent
uv init --python 3.11
uv add fastapi 'uvicorn[standard]' pydantic httpx strands-agents
To make the following AWS and Docker commands easier, export your values for the region, account ID, agent name, ECR repo, and role name.
export REGION=us-east-1
export ACCOUNT_ID=your_aws_account_id
export AGENT_NAME=myagent
export ECR_REPO=my-agent-repo
export ROLE_NAME=AgentCoreRuntime-ExecutionRole-${AGENT_NAME}
export IMAGE=${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/${ECR_REPO}:latest
export EXECUTION_ROLE_ARN=arn:aws:iam::${ACCOUNT_ID}:role/${ROLE_NAME}
Replace the placeholders with your real AWS account values, and consistently use the same region where you enabled Bedrock/Claude access. Next, create a trust policy for the agent's execution role: save the following as trust-policy.json. The role must allow bedrock-agentcore.amazonaws.com to assume it, scoped to your AWS account and region.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AssumeRolePolicy",
      "Effect": "Allow",
      "Principal": {
        "Service": "bedrock-agentcore.amazonaws.com"
      },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "YOUR_AWS_ACCOUNT_ID"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:bedrock-agentcore:YOUR_REGION:YOUR_AWS_ACCOUNT_ID:*"
        }
      }
    }
  ]
}
Create your role with:
aws iam create-role \
  --role-name $ROLE_NAME \
  --assume-role-policy-document file://trust-policy.json \
  --region $REGION
We will also create another JSON file called policy.json:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ECRImageAccess",
      "Effect": "Allow",
      "Action": [
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer"
      ],
      "Resource": [
        "arn:aws:ecr:YOUR_REGION:YOUR_AWS_ACCOUNT_ID:repository/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:DescribeLogStreams",
        "logs:CreateLogGroup"
      ],
      "Resource": [
        "arn:aws:logs:YOUR_REGION:YOUR_AWS_ACCOUNT_ID:log-group:/aws/bedrock-agentcore/runtimes/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": [ "logs:DescribeLogGroups" ],
      "Resource": [ "arn:aws:logs:YOUR_REGION:YOUR_AWS_ACCOUNT_ID:log-group:*" ]
    },
    {
      "Effect": "Allow",
      "Action": [
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": [
        "arn:aws:logs:YOUR_REGION:YOUR_AWS_ACCOUNT_ID:log-group:/aws/bedrock-agentcore/runtimes/*:log-stream:*"
      ]
    },
    {
      "Sid": "ECRTokenAccess",
      "Effect": "Allow",
      "Action": [ "ecr:GetAuthorizationToken" ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "xray:PutTraceSegments",
        "xray:PutTelemetryRecords",
        "xray:GetSamplingRules",
        "xray:GetSamplingTargets"
      ],
      "Resource": [ "*" ]
    },
    {
      "Effect": "Allow",
      "Resource": "*",
      "Action": "cloudwatch:PutMetricData",
      "Condition": {
        "StringEquals": {
          "cloudwatch:namespace": "bedrock-agentcore"
        }
      }
    },
    {
      "Sid": "GetAgentAccessToken",
      "Effect": "Allow",
      "Action": [
        "bedrock-agentcore:GetWorkloadAccessToken",
        "bedrock-agentcore:GetWorkloadAccessTokenForJWT",
        "bedrock-agentcore:GetWorkloadAccessTokenForUserId"
      ],
      "Resource": [
        "arn:aws:bedrock-agentcore:YOUR_REGION:YOUR_AWS_ACCOUNT_ID:workload-identity-directory/default",
        "arn:aws:bedrock-agentcore:YOUR_REGION:YOUR_AWS_ACCOUNT_ID:workload-identity-directory/default/workload-identity/${AGENT_NAME}-*"
      ]
    },
    {
      "Sid": "BedrockModelInvocation",
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/*",
        "arn:aws:bedrock:YOUR_REGION:YOUR_AWS_ACCOUNT_ID:*"
      ]
    }
  ]
}
Go ahead and attach the policy with the following command:
aws iam put-role-policy \
  --role-name $ROLE_NAME \
  --policy-name AgentCoreRuntimePolicy \
  --policy-document file://policy.json \
  --region $REGION

Step 4: Create an ECR repository

Run the following command to create an ECR repository:
aws ecr create-repository --repository-name $ECR_REPO --region $REGION

Step 5: Log in to ECR with Docker

Authenticate Docker to your private ECR registry:
aws ecr get-login-password --region $REGION | docker login --username AWS --password-stdin $ACCOUNT_ID.dkr.ecr.$REGION.amazonaws.com
This logs Docker into your private Amazon ECR registry using a temporary password, so you can push images to (or pull from) your ECR repository.

Step 6: Write your agent code

Your agent code must specify the Bedrock Claude model and region explicitly. Always set region_name to the exact region where your model access is enabled; otherwise, your agent will fail on invocation. Save the following as agent.py:
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
from typing import Dict, Any
from datetime import datetime

from strands import Agent, tool
from strands.models import BedrockModel

@tool
def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    return len(text.split())

@tool
def reverse(text: str) -> str:
    """Reverse a string."""
    return text[::-1]

bedrock_model = BedrockModel(
    model_id="anthropic.claude-3-5-sonnet-20240620-v1:0",
    temperature=0.3,
    streaming=False,
    region_name="us-east-1"  # IMPORTANT: set this to your exact Claude-enabled region
)

strands_agent = Agent(
    model=bedrock_model,
    tools=[word_count, reverse],
    system_prompt="You are a helpful assistant who uses tools when they help."
)

app = FastAPI(title="Strands Agent Server", version="1.0.0")

class InvocationRequest(BaseModel):
    input: Dict[str, Any]

class InvocationResponse(BaseModel):
    output: Dict[str, Any]

@app.post("/invocations", response_model=InvocationResponse)
async def invoke_agent(request: InvocationRequest):
    try:
        user_message = request.input.get("prompt", "")
        if not user_message:
            raise HTTPException(status_code=400, detail="No prompt found in input. Please provide a 'prompt' key in the input.")
        result = strands_agent(user_message)
        response = {
            "message": result.message,
            "timestamp": datetime.utcnow().isoformat(),
            "model": bedrock_model.config["model_id"],
        }
        return InvocationResponse(output=response)
    except HTTPException:
        # Let deliberate HTTP errors (like the 400 above) pass through unchanged.
        raise
    except Exception as e:
        raise HTTPException(status_code=500, detail=f"Agent processing failed: {str(e)}")

@app.get("/ping")
async def ping():
    return {"status": "healthy"}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8080)

Two simple tools (word_count and reverse) are defined and registered with the agent, giving the model concrete actions it can call. The /invocations endpoint takes a JSON payload, forwards the prompt to the Claude model, and returns the result along with the model ID and a timestamp.
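Before containerizing, it's worth sanity-checking the server locally. Start it with uv run uvicorn agent:app --host 0.0.0.0 --port 8080, then exercise the endpoint; the snippet below uses the httpx dependency added in Step 3 (the test_local.py file name is just a suggestion):

# test_local.py - quick check against the locally running FastAPI server.
import httpx

resp = httpx.post(
    "http://localhost:8080/invocations",
    json={"input": {"prompt": "How many words are in 'ML is cool'?"}},
    timeout=120.0,
)
resp.raise_for_status()
print(resp.json()["output"]["message"])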

Step 7: Create a Dockerfile

Create a Dockerfile based on an ARM64 Python 3.11 image with uv, as Bedrock AgentCore requires ARM64 containers. Place agent.py, pyproject.toml, and uv.lock at the project root:
FROM --platform=linux/arm64 ghcr.io/astral-sh/uv:python3.11-bookworm-slim
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-cache
COPY agent.py ./
EXPOSE 8080
CMD ["uv", "run", "uvicorn", "agent:app", "--host", "0.0.0.0", "--port", "8080"]
This Dockerfile builds an ARM64 image based on a slim Python 3.11 container, which is required for Bedrock container runtime compatibility.

Step 8: Build the Docker image and push it to ECR

Build and push the image. It must be built for the ARM64 architecture:
docker buildx build --platform linux/arm64 -t $IMAGE --push .
Then, validate the image is in ECR:
aws ecr describe-images --repository-name $ECR_REPO --region $REGION

Step 9: Register your agent runtime

You can register the runtime via the AWS Console or with boto3. Here is how to do it using boto3 in Python:
import boto3

# Set your account info here
ACCOUNT_ID = "your_aws_account_id"    # <-- Replace with your AWS account ID
ROLE_ARN = "your_execution_role_arn"  # <-- Replace with the execution role ARN from Step 3

client = boto3.client('bedrock-agentcore-control', region_name='us-east-1')

response = client.create_agent_runtime(
    agentRuntimeName='strands_agent',
    agentRuntimeArtifact={
        'containerConfiguration': {
            # Must match the image you pushed in Step 8 (here: my-agent-repo)
            'containerUri': f'{ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/my-agent-repo:latest'
        }
    },
    networkConfiguration={
        "networkMode": "PUBLIC"
    },
    roleArn=ROLE_ARN
)

print("Agent Runtime created successfully!")
print(f"Agent Runtime ARN: {response['agentRuntimeArn']}")
print(f"Status: {response['status']}")

Step 10: Invoke your agent

After registration, invoke your agent and pass in a payload using the Python SDK. As a best practice, generate a random session ID for each invocation.
import boto3
import json
import random
import string
import weave

# 1. Initialize Weave tracing (use the same project name as in Step 1)
weave.init("bedrock-agent-observability")

# 2. Trace the core agent call as an op with Weave.
@weave.op
def invoke_bedrock_agent(prompt: str):
    agent_core_client = boto3.client('bedrock-agentcore', region_name='us-east-1')
    agent_runtime_arn = "Your_Agent_Runtime_ARN_Here"  # <-- Replace with your actual agent runtime ARN

    # Build a randomized session id for the Bedrock API
    random_part = ''.join(random.choices(string.ascii_letters + string.digits, k=40))
    runtime_session_id = f"strandsagent-{random_part}"

    payload = json.dumps({
        "input": {"prompt": prompt}
    })

    response = agent_core_client.invoke_agent_runtime(
        agentRuntimeArn=agent_runtime_arn,
        runtimeSessionId=runtime_session_id,
        payload=payload,
        qualifier="DEFAULT"
    )

    response_body = response['response'].read()
    try:
        response_data = json.loads(response_body)
    except Exception:
        response_data = response_body  # fallback if not JSON

    return response_data

if __name__ == "__main__":
    # This call is tracked in Weave!
    resp = invoke_bedrock_agent(
        "how many words are in the string 'ML is cool' ?"
    )

    try:
        text = resp["output"]["message"]["content"][0]["text"]
        print(text)
    except Exception as e:
        print("Could not extract summary text:", e)
This code demonstrates how to invoke your deployed Bedrock agent and track each execution in Weave for observability and debugging. Weave tracing is initialized with weave.init("bedrock-agent-observability"), matching the project from Step 1, so all subsequent tracked function calls are recorded to that project on wandb.ai. The invoke_bedrock_agent function is decorated with @weave.op, so every call to it is automatically logged as an operation in Weave. Inside the function, a boto3 client for the Bedrock AgentCore data plane is created in the region where your agent runs. Remember to supply the ARN of your deployed agent runtime (replace the placeholder string with the real ARN from registration).
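To have a few traces to compare in the Weave UI, you can run a small batch of prompts through the same op; each call is logged as its own trace:

# Each call to the @weave.op-decorated function shows up as a separate trace.
prompts = [
    "how many words are in the string 'ML is cool' ?",
    "reverse the string 'AgentCore'",
]
for p in prompts:
    print(invoke_bedrock_agent(p))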

Step 11: Visualize agent traces in Weave

After running your agent, you can view the detailed traces in the W&B interface:

  1. Access your project: Go to wandb.ai and navigate to your "bedrock-agent-observability" project
  2. View traces: Click on the "Traces" tab to see all recorded operations
  3. Explore individual calls: Click on any trace to see:
    • Input parameters for each function
    • Output results
    • Execution timing
    • Token usage and costs
    • Error details (if any)

Key observability features

Function-level tracing
Every function decorated with @weave.op appears as a node in the trace tree, showing inputs, outputs, and execution time (see the sketch after this list).
Model call details
Each Bedrock API call is logged with:
  • Model ID and configuration
  • Input messages/prompts
  • Generated responses
  • Token usage statistics
  • Response metadata
Error tracking
Failed calls are captured with error messages and stack traces.
Performance metrics
You can analyze:
  • Response latency for each component
  • Token consumption patterns
  • Success/failure rates
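To make the trace tree more informative, you can decorate helper functions as well; nested ops appear as child nodes under the parent call. Here is a small sketch building on Step 10's invoke_bedrock_agent (the helper names are illustrative):

import weave

@weave.op
def extract_text(resp: dict) -> str:
    # Child span: parse the agent's response payload.
    return resp["output"]["message"]["content"][0]["text"]

@weave.op
def run_query(prompt: str) -> str:
    # Parent span: invoke_bedrock_agent and extract_text appear as children.
    resp = invoke_bedrock_agent(prompt)
    return extract_text(resp)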

Conclusion

Amazon Bedrock AgentCore, combined with the observability enhancements of W&B Weave, offers a robust solution for deploying, monitoring, and managing AI agents at scale. We’ve seen that AgentCore provides the necessary infrastructure — from secure runtime sandboxes and memory stores to integrated identity management and tool access — which allows AI agents to operate reliably in real-world, production settings. Its built-in Observability service, especially when paired with CloudWatch, gives developers real-time dashboards and telemetry to keep track of an agent’s performance and health.
By adding W&B Weave into the mix, we further enhanced this observability, gaining fine-grained traces of the agent’s decisions and actions. This level of insight is incredibly valuable for debugging complex agent behaviors and continuously improving them. Developers can now not only ensure their agents are secure and scalable thanks to AgentCore, but also ensure they are transparent and testable thanks to the monitoring and tracing tools from both AWS and W&B.
Looking ahead, the importance of observability in AI will only grow. As AI agents become more autonomous and are entrusted with more critical tasks, having detailed visibility into their reasoning and performance is vital for trust and safety. The combination of cloud-scale agent infrastructure with advanced logging and visualization tools sets the stage for a new era of AI development: developers can iterate faster, catch issues earlier, and deploy with greater confidence.

In the future, we can expect even tighter integration between platforms like AWS Bedrock and W&B’s suite (Weave, Models, and more), providing end-to-end solutions for building and maintaining highly effective AI agents. By embracing these tools and best practices now, you’ll be well-equipped to create AI agents that are not only powerful and scalable, but also observable, auditable, and continuously improvable – qualities that will define successful AI systems in the years to come.

Iterate on AI agents and models faster. Try Weights & Biases today.