Claude 3.5 Sonnet on Vertex AI: Python quickstart
Here's how to get up and running with the newest model from Anthropic
Created on June 21 | Last edited on November 13
This tutorial is designed to help you get started with the Claude 3.5 Sonnet model on Google Cloud’s Vertex AI using Python. We'll guide you through setting up your environment, enabling the necessary APIs, and using the Claude 3.5 Sonnet API with the Anthropic Python SDK and W&B Weave for logging.
Let's get started.

What we'll cover
What is Claude 3.5 Sonnet?
Setting up Vertex AI for Claude 3.5 Sonnet
Step 1: Create a Google Cloud project
Step 2: Enable the Vertex AI API
Step 3: Set up the Google Cloud CLI
Step 4: Configure IAM Roles
Step 5: Enable Claude 3.5 Sonnet in Google Cloud
Using the Claude 3.5 Sonnet API with W&B Weave
Using the Claude 3.5 Sonnet API to analyze images
Another "smart" model
Conclusion
What is Claude 3.5 Sonnet?
Claude 3.5 Sonnet, the latest model from Anthropic, stands out for its exceptional performance in tasks such as text generation, coding, and visual processing. Featuring a 200K token context window, it provides substantial capacity for managing large and intricate inputs.
Accessible via Google Cloud's Vertex AI, the Anthropic API, Amazon Bedrock, and the Claude platform, Claude 3.5 Sonnet integrates seamlessly into various workflows. Priced competitively at $3 per million input tokens and $15 per million output tokens, it offers a cost-effective solution for a range of AI applications.
The model operates at twice the speed of its predecessor—Claude 3 Opus—making it perfect for complex tasks like context-sensitive customer support and orchestrating multi-step workflows. Additionally, with support for multiple languages, Claude 3.5 Sonnet is highly versatile and widely applicable.
Setting up Vertex AI for Claude 3.5 Sonnet
First, we'll cover the setup of Vertex AI, starting with creating a Google Cloud project, enabling the necessary APIs, and configuring the Google Cloud CLI. This foundation will ensure that you have the tools and permissions required to fully utilize Vertex AI's features. Then, we'll explore how to access and use the Claude 3.5 Sonnet model via the Vertex AI platform. This includes setting up authentication, sending API requests, and handling the model's responses.
By the end of this quickstart tutorial, you'll be equipped with the knowledge to seamlessly integrate Claude 3.5 Sonnet into your workflows, taking full advantage of its superior performance in various applications. Whether you're looking to enhance your coding processes, improve customer interactions, or gain deeper insights from data, this guide will help you harness the capabilities of one of the most advanced AI models available today.
Setting up Vertex AI on Google Cloud involves several key steps to ensure you have the necessary infrastructure and permissions in place.
Here’s how you can get started:
Step 1: Create a Google Cloud project
Begin by creating a new project in the Google Cloud console. Navigate to the project selector page and either select an existing project or create a new one. Ensure that billing is enabled for your project, as this is required for using Vertex AI services. If you haven't yet created a project, simply search 'create project' in the Google Cloud search bar and click the first result, which will guide you through creating a project.

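If you prefer the command line, you can also create a project with the gcloud CLI once it's installed (see Step 3 below); my-claude-project here is just a placeholder ID:
gcloud projects create my-claude-project
You will still need to enable billing for the new project, which is easiest to do in the console.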
Step 2: Enable the Vertex AI API
Next, enable the Vertex AI API for your project. In the Google Cloud console, enter “Vertex AI” in the search bar. Select Vertex AI from the results, which will bring you to the Vertex AI dashboard. Click on “Enable All Recommended APIs” to activate the necessary APIs for Vertex AI. This process may take a few moments to complete.

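Alternatively, the core Vertex AI API can be enabled from the terminal once the gcloud CLI is set up. A minimal equivalent of the console flow, assuming your project is set as the gcloud default:
gcloud services enable aiplatform.googleapis.com
Note that the console's "Enable All Recommended APIs" button may activate a few additional APIs beyond this one.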
Step 3: Set up the Google Cloud CLI
To interact with Google Cloud services from your local development environment, you need to install the Google Cloud CLI. Download and install the CLI from the Google Cloud documentation. Once installed, initialize the CLI by running gcloud init in your terminal. This command will guide you through selecting your project and configuring your settings.
You can update the CLI components to ensure you have the latest tools and features by running:
gcloud components update
gcloud components install beta
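Additionally, the Python client we use later authenticates through Application Default Credentials, so you will also want to run:
gcloud auth application-default login
This opens a browser window to sign in and stores credentials that the Anthropic SDK picks up automatically.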
Step 4: Configure IAM Roles
To use Vertex AI, an administrator must ensure the appropriate IAM roles are assigned. Depending on your specific needs and intended use of Vertex AI, these include:
- Vertex AI User or Vertex AI Administrator, and
- Service Account User
For this tutorial, I recommend the Vertex AI Administrator and Service Account User roles.
To accomplish this, simply search "IAM" in the Google Cloud search bar and select the IAM page from the results:

You will then select the edit button next to your user account, which looks like the following:

And assign the appropriate roles:

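If you prefer to grant these roles from the terminal instead of the console, here is a sketch using gcloud (substitute your own project ID and account email):
gcloud projects add-iam-policy-binding your-gcp-project-id --member="user:you@example.com" --role="roles/aiplatform.admin"
gcloud projects add-iam-policy-binding your-gcp-project-id --member="user:you@example.com" --role="roles/iam.serviceAccountUser"
Here, roles/aiplatform.admin corresponds to Vertex AI Administrator and roles/iam.serviceAccountUser to Service Account User.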
Step 5: Enable Claude 3.5 Sonnet in Google Cloud
Now we can navigate to the Model Garden in the Vertex AI console, where we can enable the model we'd like to use. If you scroll down the page, you will see a section that looks like this:

Click the blue model card button, and you will be directed to a page where you can enable Claude 3.5 Sonnet in Vertex AI. Next, click the "Enable" button:

Now, the Claude 3.5 Sonnet model should be available to use. If you run into errors after following these steps, your best bet may be to wait a few minutes, as these changes can take a little while to propagate on Google's servers.
Using the Claude 3.5 Sonnet API with W&B Weave
Now, we are ready for the fun part! We will use the Claude 3.5 Sonnet API in Python. Additionally, we will utilize W&B Weave for tracking the inputs and outputs of our model. Weave is a powerful tool from Weights & Biases designed to simplify the logging and monitoring of GenAI models. It provides an easy-to-use interface for tracking various metrics, visualizing data, and examining model performance in real time. By integrating Weave into a machine learning project, users can enhance their ability to monitor and evaluate their models' inner workings.
To use Weave, start by initializing it with weave.init('your_project_name'). Next, add the @weave.op() decorator to any function you wish to track. This decorator automatically logs all inputs and outputs of the function, capturing detailed information about its operation.
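To make the pattern concrete before we bring in the Claude client, here is a minimal sketch with a toy function (the project name is a placeholder, and Weave needs to be installed first, which we do in a moment):
import weave

weave.init('your_project_name')

@weave.op()
def add(a: int, b: int) -> int:
    # Every call to this function is traced: Weave records a, b, and the return value
    return a + b

add(2, 3)  # this call now appears in your Weave dashboard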
To start off, install the Anthropic SDK (with Vertex AI support) and Weave with the following commands:
python -m pip install -U 'anthropic[vertex]'
python -m pip install -U weave
Next, we can write the following script which will allow us to call the Claude API and log our results to Weave. Note you will need to replace your_project_name with your desired Weave project name, as well as your-gcp-project-id with your Google Cloud project ID.
from anthropic import AnthropicVertex
import weave

# Initialize Weave for logging
weave.init('your_project_name')

# Set up the Anthropic client
LOCATION = "us-east5"
PROJECT_ID = "your-gcp-project-id"
client = AnthropicVertex(region=LOCATION, project_id=PROJECT_ID)

@weave.op()
def get_response_from_anthropic(query):
    response = client.messages.create(
        max_tokens=1024,
        messages=[{"role": "user", "content": query}],
        model="claude-3-5-sonnet@20240620",
    )
    return response.model_dump_json(indent=2)

# Define the query
query = "How can I improve my eating habits?"

# Call the function and log the response
response = get_response_from_anthropic(query)
print(response)
This script will call the Claude 3.5 Sonnet API and generate a response for the query. Additionally, the response will be automatically logged to Weave, since we have added the @weave.op() decorator above the function. Inside our project dashboard in Weights & Biases, we can navigate to the Weave section, and we will see our response.
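If you want just the generated text rather than the full JSON payload, you can parse the returned string. A small sketch, assuming the standard Anthropic Messages response shape in which the text lives in the first content block:
import json

parsed = json.loads(response)
print(parsed["content"][0]["text"])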
Weave logs each call to the function, including the inputs and outputs:

Using the Claude 3.5 Sonnet API to analyze images
Now, let's use Claude 3.5 Sonnet to analyze images, integrating Python along with Weave for effective tracking and logging. This process will guide you through converting images into a suitable format for the Claude API and logging the responses using Weave.
Here's a comprehensive script that includes all necessary steps:
import base64
import requests
from pathlib import Path
from anthropic import AnthropicVertex
import weave
import httpx

# Initialize Weave for logging
weave.init('claude_sonnet')

# Set up the Anthropic client
LOCATION = "us-east5"
PROJECT_ID = "your-gcp-project-id"
client = AnthropicVertex(region=LOCATION, project_id=PROJECT_ID)

def image_to_data_url(file_path_or_url):
    EXT_TO_MIMETYPE = {
        '.jpg': 'image/jpeg',
        '.png': 'image/png',
        '.svg': 'image/svg+xml',
    }
    if file_path_or_url.startswith('http://') or file_path_or_url.startswith('https://'):
        # Fetch the image from the web and infer its mime type from the response headers
        response = requests.get(file_path_or_url)
        if response.status_code != 200:
            raise ValueError(f"Unable to fetch image from URL: {file_path_or_url}")
        content_type = response.headers['Content-Type']
        mimetype = content_type if content_type in EXT_TO_MIMETYPE.values() else None
        image_data = response.content
    else:
        # Read a local file and infer its mime type from the extension
        ext = Path(file_path_or_url).suffix
        mimetype = EXT_TO_MIMETYPE.get(ext)
        if not mimetype:
            raise ValueError(f"Unsupported file extension: {ext}")
        with open(file_path_or_url, 'rb') as image_file:
            image_data = image_file.read()
    if not mimetype:
        raise ValueError(f"Unsupported mime type: {mimetype}")
    encoded_string = base64.b64encode(image_data).decode('utf-8')
    data_url = f"data:{mimetype};base64,{encoded_string}"
    return data_url

@weave.op()
def analyze_image(image_data, prompt, wv_img):  # wv_img is passed only so Weave logs the image
    response = client.messages.create(
        max_tokens=1024,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            "data": image_data,
                            "media_type": "image/jpeg",  # media type of the encoded image
                        },
                    },
                    {
                        "type": "text",
                        "text": prompt,
                    },
                ],
            }
        ],
        model="claude-3-5-sonnet@20240620",
    )
    return response.model_dump_json(indent=2)

image_url = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
image_b64 = base64.b64encode(httpx.get(image_url).content).decode("utf-8")
weave_img = image_to_data_url(image_url)

# Call the function and log the response
image_analysis_response = analyze_image(image_b64, "describe this image", weave_img)
print(image_analysis_response)
In this script, we cover all the necessary steps to analyze an image using Claude 3.5 Sonnet.
First, we initialize Weave for logging and set up the Anthropic client with the required Google Cloud project details, ensuring seamless interaction with both Weave and the Claude API. The image_to_data_url function then converts an image, whether from a URL or a local file, into a base64-encoded data URL, which is essential for the Claude API's processing requirements. The analyze_image function sends this base64-encoded image to the Claude API for analysis and logs the response using Weave.
Alongside the text prompt, we pass two image strings to this function: image_data (the raw base64-encoded image sent to the API) and wv_img (the same image as a data URL, which Weave can render in its trace view). Finally, we fetch an image from a URL, convert it to a base64 string, and call the analyze_image function to obtain and print the response, demonstrating how to use the entire setup to analyze an image and log the interaction. This approach ensures effective use of Claude 3.5 Sonnet for image analysis, leveraging its advanced capabilities and integrating with Weave for comprehensive tracking and monitoring of your results.
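The same two functions also work for local files. A hypothetical example (photo.jpg is a placeholder path), where we strip the data-URL prefix to recover the raw base64 the API expects:
# image_to_data_url handles local file paths as well as URLs
local_data_url = image_to_data_url("photo.jpg")

# The API wants raw base64, so drop the "data:image/jpeg;base64," prefix
local_b64 = local_data_url.split(",", 1)[1]

local_response = analyze_image(local_b64, "What is in this photo?", local_data_url)
print(local_response)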

Here, we see Weave has automatically logged our prompt, image, and the response from our model. This is super useful for not only tracking our model in production but also for debugging and improving its performance. By visualizing the inputs and outputs, we can gain deeper insights into how the model interprets and processes images, helping us identify any areas that may need adjustment or further tuning.
This integration streamlines the workflow, making it easier to monitor the model’s performance in real-time and ensuring that any issues are promptly addressed. Furthermore, the detailed logs and visualizations can inform the development and training of future models, offering insights into what works well and what can be improved. Overall, using Claude 3.5 Sonnet in conjunction with Weave significantly enhances our capability to manage, optimize, and evolve image analysis tasks efficiently.
Another "smart" model
It's great to see so much competition in the high-performance LLM space. Just as different people possess unique skills, it's likely that various models will excel in different areas. Claude 3.5 Sonnet, with its extensive 200K token context window and competitive pricing, is a remarkable model. Its ability to handle complex inputs at a lower cost makes it a versatile tool for a range of applications. Whether used alone or in combination with other models like GPT-4, Claude 3.5 Sonnet's capabilities can enhance workflows, offering tailored solutions that leverage the strengths of multiple advanced AI systems.
Compared to GPT-4o, Claude 3.5 Sonnet is priced lower per 1M input tokens ($3 versus $5) and has a longer context window (200K tokens versus 128K). However, Sonnet's latency is roughly twice that of GPT-4o, which is worth keeping in mind if quick responses are an integral component of your application.
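To make the pricing difference concrete, here is a quick back-of-the-envelope calculation using the rates quoted above (the token counts are made up for illustration):
# Hypothetical monthly workload
input_tokens = 2_000_000
output_tokens = 500_000

# Claude 3.5 Sonnet: $3 per 1M input tokens, $15 per 1M output tokens
sonnet_cost = input_tokens / 1e6 * 3.00 + output_tokens / 1e6 * 15.00
print(f"Claude 3.5 Sonnet: ${sonnet_cost:.2f}")  # $13.50

# GPT-4o input at $5 per 1M tokens, for comparison (input side only)
gpt4o_input_cost = input_tokens / 1e6 * 5.00
print(f"GPT-4o (input only): ${gpt4o_input_cost:.2f}")  # $10.00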
Conclusion
Integrating the Claude 3.5 Sonnet model on Google Cloud's Vertex AI using Python highlights its practical utility in handling large and complex inputs with its 200K token context window and competitive pricing. The straightforward setup on Vertex AI, combined with Weave for real-time monitoring, provides valuable insights for quick issue resolution. The model's versatility, whether used alone or with other models like GPT-4, makes it a strong contender in the high-performance LLM space, offering tailored solutions for various applications. This competitive environment fosters continuous improvement, making Claude 3.5 Sonnet an effective tool for optimizing workflows and enhancing tasks.
Claude 3.5 Vertex AI Docs: https://cloud.google.com/vertex-ai/generative-ai/docs/partner-models/use-claude