Building a Virtual Assistant with Google Gemini Function Calling
This article delves into the challenges and solutions for enabling Google Gemini to provide up-to-date information, such as sports schedules, by interfacing with external APIs.
Created on December 20 | Last edited on December 26
Large Language Models like Gemini have become increasingly adept at understanding and processing human language. However, a critical limitation of these models lies in their inability to access real-time information. Trained on extensive but static datasets, LLMs excel at analyzing and generating content based on historical data. Yet when it comes to real-time updates, such as the latest sports scores, stock market changes, news, or weather forecasts, they fall short. This limitation is where function calling comes into play, transforming LLMs into tools capable of fetching and integrating current data into their responses.

What We'll Cover
Getting Started With The Gemini API
The Realtime API Code
Function Calling With The Gemini API
Evaluating Performance with W&B
In Closing
In this tutorial, we'll develop a virtual assistant that can provide real-time game schedules for various sports. The challenge here is twofold:
1. Interpreting User Queries: Users might ask about game times in a very casual or unstructured way, like "What time do the Chiefs play?" An LLM like Gemini can interpret this natural language query, understand the intent, and identify key information (e.g., the team name "Chiefs" and the sport is "Football").
2. Fetching Real-Time Game Information: Once the LLM has interpreted the query, it needs to fetch the actual game time. However, since the LLM itself doesn’t have real-time data, it uses function calling to interact with a sports API that provides current game schedules. The LLM, after determining the sport and team from the query, can call the appropriate function with these parameters to retrieve the latest game time.
In essence, the LLM serves as an intelligent intermediary. It understands the user's natural language request and translates it into a structured query that can be processed by an external tool or API capable of providing real-time information. This synergy of AI's interpretive power and external data sources' current information makes the virtual assistant highly effective and user-friendly.
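The two-step flow above can be sketched in miniature. This is a minimal illustration with hypothetical stand-in functions, not the real system: the LLM maps natural language to structured parameters, and our own code, not the model, performs the real-time lookup.

```python
def interpret_query(query: str) -> dict:
    """Stand-in for the LLM step: map a query to structured parameters.
    In the real system, Gemini produces this mapping via function calling."""
    if "chiefs" in query.lower():
        return {"sport": "football", "teamName": "Chiefs"}
    return {}

def fetch_game_time(sport: str, teamName: str) -> str:
    """Stand-in for the external sports API lookup."""
    return f"Next {teamName} ({sport}) game: <time from API>"

# The model interprets; our code executes the structured call it produced.
params = interpret_query("What time do the Chiefs play?")
print(fetch_game_time(**params))
```

The key design point is the separation of concerns: the model never touches the network, it only emits parameters that our tool function consumes.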
Getting Started With The Gemini API
To use Gemini, we need to set up our environment in Google Cloud Platform (GCP). The key steps include:
1. Enable the Gemini API: You'll need to access the Google Cloud Console, navigate to the API library, and enable the Gemini API for your project.
2. Install the Python Package: Install the Google Cloud AI Platform pip package:
pip install --upgrade google-cloud-aiplatform
3. Obtain Credentials: For authentication, you need to create a service account in GCP and download the credentials file. This file will include various keys and IDs required for programmatic access.
Our initialization code looks like:
```python
from vertexai.preview import generative_models
from vertexai.preview.generative_models import GenerativeModel
from google.cloud import aiplatform
from google.oauth2 import service_account

cred = {
    "type": "service_account",
    "project_id": "[Your Google Cloud project ID]",
    "private_key_id": "[Unique identifier for the private key]",
    "private_key": "-----BEGIN PRIVATE KEY-----\n[Your private key here]\n-----END PRIVATE KEY-----",
    "client_email": "[Service account email address]",
    "client_id": "[Service account client ID]",
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://oauth2.googleapis.com/token",
    "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
    "client_x509_cert_url": "[URL to the service account's public x509 certificate]"
}
credentials = service_account.Credentials.from_service_account_info(cred)

# Initialize the Vertex AI SDK
aiplatform.init(
    project='your project id',
    location='us-central1',  # change this if you're using a different region
    staging_bucket='your storage bucket',  # e.g. 'gs://your-bucket-name'
    credentials=credentials,
)
```
The Realtime API Code
To build our virtual assistant, we'll start with some functions capable of fetching game times. I used the ESPN API for this; the code is below:
```python
import datetime
import requests

api_urls = {
    "football": "http://site.api.espn.com/apis/site/v2/sports/football/nfl/scoreboard",
    "basketball": "http://site.api.espn.com/apis/site/v2/sports/basketball/nba/scoreboard",
    "baseball": "http://site.api.espn.com/apis/site/v2/sports/baseball/mlb/scoreboard"
}

leagues = {
    "football": "nfl",
    "basketball": "nba",
    "baseball": "mlb"
}

def get_events(sport):
    """Get all event IDs for a given sport."""
    url = api_urls.get(sport)
    if not url:
        return "Sport not supported"
    response = requests.get(url)
    if response.status_code != 200:
        return "Failed to fetch data"
    events = response.json().get('events', [])
    event_ids = [event['id'] for event in events]
    return event_ids

def get_event_info(sport, all_events_in_sport, teamName):
    """Get the game time for the event whose description mentions the team."""
    event_info_dict = {}
    for event_id in all_events_in_sport:
        # Construct the URL based on the sport and event ID
        url = f"https://sports.core.api.espn.com/v2/sports/{sport}/leagues/{leagues[sport]}/events/{event_id}"
        response = requests.get(url)
        if response.status_code != 200:
            event_info_dict[event_id] = "Failed to fetch event data"
            continue
        event_data = response.json()
        # Extract the name of the event, which includes the team names
        event_name = event_data.get('name', 'Unknown Event')
        # Extract the date and time of the event
        date_str = event_data.get('date', '')
        if date_str:
            date_str = date_str.replace('Z', '+00:00')
            game_datetime = datetime.datetime.fromisoformat(date_str)
            formatted_time = game_datetime.strftime("%B %d, %Y at %I:%M %p UTC")
        else:
            formatted_time = "Time not available"
        event_info_dict[event_id] = {
            "description": event_name,
            "time": formatted_time,
        }
    # Return the time of the first event whose description mentions the team
    for ev_id in event_info_dict:
        info = event_info_dict[ev_id]
        if isinstance(info, dict) and teamName.lower() in info['description'].lower():
            return info['time']
    return None
```
Above, we create a few functions which, when used together, can get the time of a game given the team name and sport. To do so, we first fetch all of the 'events' for a given sport. Then, we loop through each event to find one containing the given team name. Here is the function that combines the two functions above to get the game time:
```python
def getGameTime(sport, teamName):
    event_ids = get_events(sport)
    event_info = get_event_info(sport, event_ids, teamName)
    return event_info
```
This function will act as our "tool": when supplied the correct parameters, it returns the game time for a specified team and sport.
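The matching step inside get_event_info deserves a closer look: it is a case-insensitive substring check against the event description. Here is a self-contained illustration of that logic using sample data (the event IDs and times below are made up, not real API responses):

```python
# Sample event data shaped like what get_event_info builds internally
event_info_dict = {
    "401547000": {"description": "Kansas City Chiefs at Buffalo Bills",
                  "time": "December 10, 2023 at 09:25 PM UTC"},
    "401547001": {"description": "Green Bay Packers at New York Giants",
                  "time": "December 11, 2023 at 01:15 AM UTC"},
}

def find_time(event_info_dict, teamName):
    """Case-insensitive substring match on the event description."""
    for info in event_info_dict.values():
        if teamName.lower() in info["description"].lower():
            return info["time"]
    return None

print(find_time(event_info_dict, "chiefs"))  # matches despite lowercase input
```

Because both sides are lowercased, casual user phrasing like "chiefs" or "PACKERS" still matches the formal event name.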
Function Calling With The Gemini API
Now that we have built our tool function, we can move on to function calling with the Gemini API. Here is the code:
```python
from vertexai.preview import generative_models
from vertexai.preview.generative_models import GenerativeModel

model = GenerativeModel("gemini-pro")

get_game_time_func = generative_models.FunctionDeclaration(
    name="get_game_time_from_query",
    description="Determine the game time from a query, given a sport. "
                "Supported sports are football, basketball, and baseball.",
    parameters={
        "type": "object",
        "properties": {
            "teamName": {
                "type": "string",
                "description": "The name of one of the teams mentioned in the query, for example 'chiefs' or 'cardinals'"
            },
            "sport": {
                "type": "string",
                "enum": ["football", "basketball", "baseball"],
                "description": "The sport to search in."
            }
        },
        "required": ["teamName", "sport"]
    },
)

gametime_tool = generative_models.Tool(function_declarations=[get_game_time_func])

# Example query
q = "what time do the chiefs play?"

# Use the Gemini model to extract the parameters (sport and team name) from the query
model_response = model.generate_content(
    q,
    generation_config={"temperature": 0},
    tools=[gametime_tool],
)
args = model_response.candidates[0].content.parts[0].function_call.args.pb

# Call our getGameTime function with the predicted parameters
print(getGameTime(args.get("sport").string_value, args.get("teamName").string_value))
```
We start by initializing our model, then create a FunctionDeclaration instance containing a formal description of our getGameTime function. The parameters below tell the Gemini API that we are generating arguments for our getGameTime function: a team name and a sport. We also include each parameter's data type and description, which guide the model when generating the response.
```python
parameters={
    "type": "object",
    "properties": {
        "teamName": {
            "type": "string",
            "description": "The name of one of the teams mentioned in the query, for example 'chiefs' or 'cardinals'"
        },
        "sport": {
            "type": "string",
            "enum": ["football", "basketball", "baseball"],
            "description": "The sport to search in."
        }
    },
    "required": ["teamName", "sport"]
},
```
Evaluating Performance with W&B
To validate the performance of our system, I've created a small dataset of a few realistic queries that the system will need to handle.
Here is the dataset:
```python
queries_and_truths = [
    # Football queries
    ("what time do the chiefs play?", ("football", "Chiefs")),
    ("when is the next packers game?", ("football", "Packers")),
    ("time for the broncos game", ("football", "Broncos")),
    ("dallas cowboys game schedule", ("football", "Cowboys")),
    ("new england patriots next game", ("football", "Patriots")),
    ("seahawks game start time", ("football", "Seahawks")),
    ("buccaneers upcoming game", ("football", "Buccaneers")),
    ("49ers game tonight", ("football", "49ers")),
    # Basketball queries
    ("lakers game tonight?", ("basketball", "Lakers")),
    ("heat game start time", ("basketball", "Heat")),
    ("warriors next game time", ("basketball", "Warriors")),
    ("celtics game schedule", ("basketball", "Celtics")),
    ("bucks next game", ("basketball", "Bucks")),
    ("suns game time", ("basketball", "Suns")),
    # Baseball queries
    ("yankees game today", ("baseball", "Yankees")),
    ("dodgers game start time", ("baseball", "Dodgers")),
    ("time for braves next game", ("baseball", "Braves")),
    ("mets game schedule", ("baseball", "Mets")),
    ("cubs game tonight", ("baseball", "Cubs")),
    ("astros next game", ("baseball", "Astros"))
]
```
Clearly this is a very small test set, but it will suffice for demonstration purposes. We will loop over this dataset and verify that the model predicts the correct function parameters for each query. Here is the testing code, which also uses W&B Tables and charts to log the results:
```python
import time
import wandb

# Assumes a run has already been started with wandb.init()
results_table = wandb.Table(columns=["Query", "Predicted Sport", "Predicted Team",
                                     "Ground Truth Sport", "Ground Truth Team"])

# Function to check accuracy
def check_accuracy(predicted, truth):
    return 1 if predicted.lower() == truth.lower() else 0

# Process each query
correct_sports, correct_teams, total_queries = 0, 0, len(queries_and_truths)

for query, (truth_sport, truth_team) in queries_and_truths:
    # Call the model for inference
    model_response = model.generate_content(
        query,
        generation_config={"temperature": 0},
        tools=[gametime_tool],
    )
    args = model_response.candidates[0].content.parts[0].function_call.args.pb
    predicted_sport = args.get("sport").string_value
    predicted_team = args.get("teamName").string_value

    # Calculate accuracies
    sport_accuracy = check_accuracy(predicted_sport, truth_sport)
    team_accuracy = check_accuracy(predicted_team, truth_team)
    correct_sports += sport_accuracy
    correct_teams += team_accuracy

    # Log to the wandb table
    results_table.add_data(query, predicted_sport.lower(), predicted_team.lower(),
                           truth_sport.lower(), truth_team.lower())
    time.sleep(10)  # avoid getting rate-limited by the API

# Calculate and log overall accuracies
overall_sport_accuracy = correct_sports / total_queries
overall_team_accuracy = correct_teams / total_queries
wandb.log({"Overall Sport Accuracy": overall_sport_accuracy,
           "Overall Team Accuracy": overall_team_accuracy})

# Log the table to wandb
wandb.log({"Game Time Queries": results_table})
```
Here, I loop over every example and use the Gemini model to predict the function parameters. Note that I added a 10-second delay between examples to avoid being rate-limited by the API. You can probably get away with a shorter delay, but this worked well for this dataset size. Our model missed only one example, where it guessed 'Dallas Cowboys' instead of just 'Cowboys.' A possible fix for this type of error is to provide more in-context examples in the function parameter description.
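One way to apply that fix is to rewrite the teamName parameter description with explicit query-to-answer examples. The revised description below is a hypothetical sketch, not the version used in the run above; the idea is to nudge the model toward short nicknames rather than full city-plus-name strings:

```python
# Hypothetical richer parameter spec with in-context examples.
# Drop-in replacement for the "teamName" entry in the FunctionDeclaration.
team_name_param = {
    "type": "string",
    "description": (
        "The short nickname of the team mentioned in the query, without the "
        "city. Examples: 'dallas cowboys game schedule' -> 'Cowboys'; "
        "'what time do the chiefs play?' -> 'Chiefs'; "
        "'new england patriots next game' -> 'Patriots'."
    ),
}
```

Since the description is the model's only guidance on output format, a few contrastive examples like these are often enough to standardize its answers.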
[W&B charts: Overall Sport Accuracy and Overall Team Accuracy for run revived-shape-6]
In Closing
Overall, by enabling LLMs to interact with external APIs and tools, we can bridge the gap between their vast static knowledge and the dynamic nature of real-world information. As LLMs get smarter, I think we will be able to supply more and more advanced tools for solving specific types of problems, further boosting AI capabilities. Perhaps what sets humans apart from other species is our prolific ability to use tools, and by providing AI with tools, we get closer and closer to a "super intelligent" system.
If you have any questions or comments, feel free to drop them in the comments below, and also here is the code repo for the project!