
Autologgers for LLM API Powered Applications with W&B Prompts

A quick introduction to Weights & Biases autologgers for LLM APIs.
Created on May 30|Last edited on June 15

Introduction

If you're working on an LLM-powered product feature, how much visibility do you have into your model's performance? Do you have an overview of your LLM APIs like the one below? Because understanding your LLM outputs is vital if you want to create a coherent experience for your users. In this report, we'll walk you through how to do just that.

(Example overview table with columns: request, response, model_alias, model, start_time, end_time, request_id, api_type, session_id, elapsed_time, prompt_tokens, completion_tokens, total_tokens.)

Why Logging LLM API Calls Is Important

Tracking and saving LLM API calls, such as those provided by OpenAI, Cohere, Anthropic, and the Huggingface Inference API, is important for two main reasons:
  • Optimize costs: By watching and analyzing usage patterns, we can make informed decisions about when and how much to scale the system. This helps prevent spending on unused resources.
  • Ensure quality and reliability: No model remains stagnant and keeping a close eye on how these tools are performing lets us catch and fix small problems before they become critical ones.
While these are the main points, the process of tracking LLM API calls brings other benefits to the table. It enables us to understand popular features, ensure adherence to data security standards, and identify performance bottlenecks.
However, managing these processes manually can be overwhelming. The utilization of a visual dashboard simplifies these tasks and transforms them into an intuitive experience. With such a tool, you're not only tracking but visualizing the data in a way that's far easier to comprehend and act upon.
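To make concrete what such tracking involves, here is a minimal sketch of logging one LLM API call's metadata by hand. The names log_llm_call and fake_chat_completion are hypothetical, and the OpenAI-style usage payload is assumed for illustration; an autologger records fields like these for you automatically.

```python
import time

def log_llm_call(api_fn, **kwargs):
    """Call an LLM API function and record the metadata around the call."""
    start_time = time.time()
    response = api_fn(**kwargs)
    end_time = time.time()
    record = {
        "request": kwargs,
        "response": response,
        "start_time": start_time,
        "end_time": end_time,
        "elapsed_time": end_time - start_time,
        # Token counts, if the provider reports them in the response:
        "usage": response.get("usage", {}),
    }
    return response, record

# A stand-in for a real provider call, returning an OpenAI-style payload:
def fake_chat_completion(**kwargs):
    return {
        "choices": [{"message": {"content": "Hello!"}}],
        "usage": {"prompt_tokens": 5, "completion_tokens": 2, "total_tokens": 7},
    }

response, record = log_llm_call(fake_chat_completion, model="gpt-3.5-turbo")
print(record["usage"]["total_tokens"])  # prints 7
```

Doing this by hand for every call site, and then aggregating and visualizing the records, is exactly the overhead a dashboard-backed autologger removes.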

Using Weights & Biases Prompts

This is where Weights & Biases (W&B) Prompts come into play. It's a comprehensive toolkit designed specifically for the effective development of LLM-powered applications. It provides functionalities for:
  • Visualizing and examining the execution flow of your LLMs.
  • Analyzing the inputs and outputs.
  • Logging token usage and other metadata.
  • Viewing intermediate results.
  • Secure management of prompts and LLM configurations.
You can see all these features at work in the example panel for a Langchain Agent shown below. The trace table records the inputs and outputs of the various LLM calls. Moreover, the Trace Timeline provides a graphical representation of the execution flow, while the subsequent tab in the panels enables you to view intermediate results within an agent.


W&B Prompts enhances existing W&B tools—namely W&B Experiments and W&B Tables—providing a comprehensive ecosystem for exploring and refining your LLM applications.
By leveraging W&B Prompts, you can not only track and analyze your LLM API calls effectively, but also ensure the smooth operation and optimization of your LLM-powered applications.

Logging LLM APIs using Weights & Biases

Now that we've seen a practical example with the Langchain Agent, here's a quick look at how Weights & Biases caters to broader LLM API providers.
W&B offers a one line auto-logging integration with many LLM API providers, including OpenAI, Cohere, and Huggingface Pipelines. This feature provides a quick and straightforward method to track your LLM API calls, requiring minimal code while also offering meaningful defaults.

Setup

Before we get into the specifics of using these integrations, let's first ensure that we have the appropriate tools. To enable these integrations, simply install wandb version 0.15.4 or higher by running the following command:
pip install "wandb>=0.15.4"

Usage

Incorporating these integrations into your existing application is a breeze. It involves adding just the following two lines of code:

OpenAI

import openai

from wandb.integration.openai import autolog

autolog()

# your existing openai code, for example:
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {
            "role": "assistant",
            "content": "The Los Angeles Dodgers won the World Series in 2020.",
        },
        {"role": "user", "content": "Where was it played?"},
    ],
)

Cohere

import cohere

from wandb.integration.cohere import autolog

autolog()

# your existing cohere code, for example:
co = cohere.Client()

response = co.generate(
    model="command-light",
    prompt="Once upon a time in a magical land called",
    return_likelihoods="ALL",
    num_generations=2,
    temperature=1,
)

Huggingface Pipeline

from transformers import pipeline

from wandb.integration.huggingface import autolog

autolog()

# your existing hf pipeline code, for example:
text_generation_pipeline = pipeline("text-generation")
results = text_generation_pipeline(["Once upon a time,", "In the year 2525,"])
print(results)

These integrations automatically log your LLM calls, log tables of inputs and outputs, and provide visualizations of trace views and usage statistics in your Weights & Biases dashboard.
For example, the OpenAI chat completion call above generates the following panel:




Customizing your autologger

The autolog function above accepts optional wandb.init arguments. You can use these to configure your project. For instance, to define a project and job type, you can do the following:
autolog(project="<your-awesome-project>", job_type="<your-awesome-job-type>")
This automatically logs your metrics, tables and trace views to your custom project with the corresponding job type.
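Any other wandb.init keyword argument can be passed the same way. The sketch below shows one possible configuration; the project name, tags, and config values are made up for illustration.

```python
from wandb.integration.openai import autolog

# Hypothetical project, tags, and config for illustration only.
autolog(
    project="llm-monitoring",
    job_type="chat-completion",
    tags=["openai", "autolog-demo"],
    config={"app_version": "0.1.0"},
)
```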

Disabling your autologger

To disable the autologger, you can either call the autolog.disable() function or call wandb.finish() in your code. Here's an example:
from wandb.integration.openai import autolog

autolog()

# All your code in your awesome LLM powered application
...

# This finishes your wandb run and disables wandb logging
autolog.disable()

Conclusion

Monitoring and optimizing your LLM API-powered features are essential for managing your project successfully. With Weights & Biases Prompts, you gain a versatile toolkit that simplifies your development process, from visualizing execution flows to managing prompts securely.
This powerful toolset also includes seamless integration with various LLM API providers, enabling quick and easy tracking of your API calls. Remember, setting up these integrations is as straightforward as installing the latest version of wandb and adding a couple of lines to your existing code. Start exploring and refining your LLM applications with Weights & Biases, and enjoy a more streamlined and productive coding experience.