
The Gemini 2.0 API in Python quickstart tutorial

A quickstart tutorial on how to set up the Gemini 2.0 API with W&B logging in Python.
Created on December 14 | Last edited on December 16
This tutorial is meant as a quickstart for those just getting started with the Gemini 2.0 API, specifically those working with it in Python. Based on the popularity of our recent beginner-level piece, Setting up GPT-4o in Python using the OpenAI API, we thought it was worth doing the same here.
We'll be following up with some task-specific tutorials, but hopefully this intro will get you started and exploring on your own.
If you'd rather see how it works before going through these steps, you can run through it quickly in this Colab:



If you're REALLY just getting started and don't yet have your machine set up to run Python, I've created a quick tutorial here that will have you up and running in just a few minutes.
💡
All ready? Let's dive in.

A bit about Gemini 2.0

Gemini 2.0 represents a significant advancement in the Gemini family's capabilities, designed for improved reasoning, planning, and tool integration.
Here are the key updates and capabilities:
  • Multimodal Capabilities: Gemini 2.0 supports both input and output across text, images, video, and audio. This includes the ability to generate native images, multilingual text-to-speech audio, and combined outputs for tasks requiring integrated multimodal content.
  • Improved Performance: The Gemini 2.0 Flash experimental model offers faster response times and improved accuracy over its predecessors, with enhanced capabilities for reasoning, coding, and spatial understanding. This makes it particularly suited for dynamic and real-time applications.
  • Tool Integration: Gemini 2.0 introduces native tool-use capabilities, allowing seamless integration with services like Google Search and third-party APIs. This enables automated fact-checking, code execution, and custom tool calls directly within your applications.
  • Expanded Developer Tools: The new Multimodal Live API supports real-time audio and video streaming inputs, making it possible to build interactive, multimodal applications. This API also enables combining multiple tools for complex workflows.
  • Advanced Features for Developers: Gemini 2.0 includes experimental coding agents, like "Jules," which can assist with bug fixes, plan multi-step coding tasks, and prepare pull requests. These features are integrated with GitHub workflows to streamline development tasks.
And here are some of the key details to be aware of as a developer, as of December 11, 2024:
  • Token limits: 1,048,576 (input), 8,192 (output)
  • Rate limits: 10 requests per minute (RPM), 4 million tokens per minute (TPM), 1,500 requests per day (RPD)
  • Inputs: Audio, images, video, text
  • Output: Text only as of this writing; audio and image output are coming soon.
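Those rate limits are easy to trip if you call the API in a loop. One way to stay under the RPM cap client-side is simply to space out your requests; here's a minimal sketch (the helper names are my own, not part of the Gemini SDK):

```python
import time

REQUESTS_PER_MINUTE = 10  # the Gemini 2.0 Flash free-tier limit listed above

def min_interval_seconds(rpm):
    """Minimum spacing between requests to stay under an RPM limit."""
    return 60.0 / rpm

class Throttle:
    """Sleeps between calls so they never exceed `rpm` requests per minute."""
    def __init__(self, rpm=REQUESTS_PER_MINUTE):
        self.interval = min_interval_seconds(rpm)
        self.last_call = 0.0  # timestamp of the previous call

    def wait(self):
        # Sleep only for whatever remains of the required interval
        elapsed = time.time() - self.last_call
        if elapsed < self.interval:
            time.sleep(self.interval - elapsed)
        self.last_call = time.time()
```

With a 10 RPM cap, that works out to at most one request every 6 seconds; you'd call `throttle.wait()` immediately before each API request.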
Alright, let's get started using it.

Getting your Gemini 2.0 API key

The first thing you're going to need to do is get your Gemini API key. To get that, simply head over to https://ai.google.dev/.
Click the "Sign in" button, toward the top right, and sign into your Google account:

Next, you'll land on a page that looks like this:

From here, click the "Get API key" button toward the top left.
And then "Create API key":

Choose an existing project or create one, and you'll be ready to go:

With that in hand, let's jump into using the API.

Quickstart tutorial for Gemini 2.0 in Python

This tutorial assumes you've installed Jupyter Notebooks. If you've never used Colab or Jupyter before, there's a nice quickstart here that will only take a few minutes to get through.
Once that's done, it's time to dive in. Open up Jupyter; the code below will carry you the rest of the way to getting started with Gemini 2.0 in Python.

Step 1: Installing Gemini 2.0 and W&B Weave

In this project, we'll utilize W&B Weave to log and track the performance of our Gemini 2.0 API in a production environment. Weave allows us to automatically capture the inputs, outputs, and various metrics of our functions, facilitating detailed monitoring and analysis. By decorating our functions with @weave.op(), we can seamlessly log data and visualize the performance of our text generation tasks.
When deploying and maintaining our Gemini models, W&B Weave helps us visualize the effectiveness of the generative model's responses in real-world scenarios. By logging the inputs (queries) and outputs (responses) of our text generation function, we can easily track and monitor different runs. It's as simple as adding the @weave.op() decorator to the function you are using your model in, and Weave automatically logs all inputs and outputs to the function!
This is a good time to sign up for a free Weights & Biases account; it will save you from having to interrupt your workflow a couple of minutes from now, when you're further along.
💡
The code:
!pip install -q -U google-generativeai
!pip install --upgrade weave
When you run the Jupyter cell you'll notice that the space between the [ ] gets filled with an asterisk ([*]). This means the cell is running, and you need to wait for the * to turn into a number before continuing.
Now that you've installed what we need, let's import the libraries so we can actually use our tools.
If you're new to Python: installing a library downloads its code to your machine, while importing it makes that library available for use in your current session.
💡

Step 2: Import the libraries and pass the Gemini 2.0 API key

The next step is to import the required libraries, as well as pass the API key across to our friends at Google so we have access to Gemini 2.0.
In this block, we'll import the libraries we've just installed and define a small helper to clean up the model's output. The imports are:
  • weave: For logging our prompts and responses to W&B Weave.
  • google.generativeai: The Google Generative AI library. The reason you're here, we assume!
Here's the code:
import google.generativeai as genai
import weave

def format_res(text):
    # Convert Gemini's bullet characters to Markdown-style asterisks
    return text.replace('•', ' *')
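That format_res helper just swaps Gemini's bullet characters for Markdown-style asterisks so lists render cleanly in the notebook. Repeating it here so the snippet stands on its own, this is what it does to a typical bulleted response:

```python
def format_res(text):
    # Same helper as above: swap bullets for Markdown-style asterisks
    return text.replace('•', ' *')

sample = "Top picks:\n• coffee\n• tea"
print(format_res(sample))
# Top picks:
#  * coffee
#  * tea
```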

Step 3: Add your API Key

Now we just have to configure things so that Gemini 2.0 knows your API key.
As I don't like leaving my key lying around, we're going to add an input for you to paste it in. It will be printed back for confirmation, but that line can be removed if you're in a public environment and worried about your key.
gemini_api_key = input("What is your Gemini API key? :")
genai.configure(api_key=gemini_api_key)
print(gemini_api_key)
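If you'd rather not have the key echoed back (or typed visibly) at all, a common alternative is to read it from an environment variable and fall back to a hidden prompt. This is just a sketch; the GEMINI_API_KEY variable name is a convention I'm assuming here, not something the SDK requires:

```python
import os
from getpass import getpass

def load_api_key(env_var="GEMINI_API_KEY"):
    """Read the API key from the environment, falling back to a hidden prompt."""
    key = os.environ.get(env_var)
    if key:
        return key
    # getpass hides the keystrokes, so nothing sensitive lands on screen
    return getpass("What is your Gemini API key?: ")
```

You'd then call `genai.configure(api_key=load_api_key())` in place of the two lines above.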

Step 4: Add the query

Next, let's add a request. We could have baked this in, but since this is a beginner tutorial, we thought it would be more fun to take it as an input, so you can try different queries quickly and easily (or at least more easily) while getting a feel for how everything fits together:
query = input("What content would you like me to produce? :")
print(query)
It will print the request you've put in for you to double-check. It's a rather small input window :)

Step 5: Choose your Gemini 2.0 model

There is currently only one Gemini 2.0 model available: gemini-2.0-flash-exp.
For this reason, that's the model we're using, but other versions will certainly be available soon. Weave supports both text and images, so once the model's multimodal outputs arrive, don't hesitate to try Weave for those applications too.
For now, the code is:
model = genai.GenerativeModel('gemini-2.0-flash-exp')
You can check for all available models at https://ai.google.dev/gemini-api/docs.

Step 6: Name your Weights & Biases project

You're going to see the top response for the request you entered above displayed in your notebook, but you'll have an opportunity below to tell Gemini 2.0 to create multiple options.
project_name = input("What would you like to name your Weave project? :")
wandb_name = project_name.replace(" ", "-") # Replace spaces with dashes in name
weave.init(wandb_name)
Remember to wait for the asterisk to change into a number before moving on. There's more going on here than just picking a name: the project has to be created in Weights & Biases, and while that's fast, it's not instantaneous.
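The space-to-dash replacement above covers the common case, but if you want to be more defensive about what ends up in the project name (punctuation, mixed case, repeated separators), here's a slightly stricter version of the same idea. This helper is my own, not part of the Weave API:

```python
import re

def slugify_project_name(name):
    """Lowercase the name and collapse runs of non-alphanumerics into single dashes."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")

print(slugify_project_name("My Gemini 2.0 Demo!"))  # my-gemini-2-0-demo
```

You could then pass the result straight to `weave.init()` in place of the simple replace above.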
You'll end up with something that looks like:


Step 7: Generate text with Gemini 2.0

And now it's time to make some magic.
In the 6th line below, you'll notice that you can change the number of responses you want Gemini to create (right now, we're defaulting to 1). Depending on the content request, you may want to change this number.
When you run this code:
@weave.op()
def generate_text_with_gemini(model, query):
    response = model.generate_content(query)
    return response.text

num_responses = 1  # How many responses would you like to log to W&B?
for i in range(num_responses):
    # Generate text with Gemini 2.0 and log using Weave
    response_text = generate_text_with_gemini(model, query)
    # Display primary response
    print(format_res(response_text))
...you'll get an output like this:

Here we print all of the responses from the model. If we go to our project in Weights & Biases, and then navigate to the Weave tab, we will see our generate_text_with_gemini function and all of the inputs and outputs for the function from our run. This allows us to view the function calls, which enables easy tracking of how our LLM is performing in production.
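Since we're talking about production use: API calls occasionally fail with transient errors or rate-limit responses. A simple retry wrapper with exponential backoff around the generation call can smooth that over. This is a generic sketch, not part of the google-generativeai SDK:

```python
import time

def with_retries(fn, attempts=3, base_delay=1.0):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries, surface the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...

# Usage (hypothetical, wrapping the function defined above):
# response_text = with_retries(lambda: generate_text_with_gemini(model, query))
```

In practice you'd narrow `except Exception` to the specific error types you want to retry; retrying everything blindly can mask real bugs.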
You can also save multiple runs to the same project to compare them. You can interact with them easily in Weights & Biases, and it will look something like this:

Each of these cells represents a run of our function. Now we can click into each cell and view responses for our query. Weave logs inputs and outputs to our function, as shown below, which allows for seamless tracking of LLM behavior:

If you're interested, you can also dig into the "Summary" section inside Weave to view more metrics on our function:


Conclusion

I hope you've reached the end of this Gemini 2.0 tutorial with more confidence in how to work with the API in Python, and some ideas about how you'd like to use it going forward.
I'll be creating additional tutorials to address specific tasks, as will other folks here at Weights & Biases. If you have any suggestions, questions, or anything else, feel free to drop them in the comments below.

