
NBC Universal Weave Onboarding Guide

Access Weights & Biases here: https://mural.wandb.io/
For any questions, post in the #wandb-nbc Slack channel


Weights and Biases (W&B) 💫

Weights and Biases is an MLOps and LLMOps platform built to facilitate collaboration and reproducibility across the machine learning development lifecycle. Machine learning projects can quickly become a mess without some best practices in place to aid developers and scientists as they iterate on models and move them to production.
W&B is lightweight enough to work with whatever framework or platform teams are currently using, but enables teams to quickly start logging their important results to a central system of record. On top of this system of record, W&B has built visualization, automation, and documentation capabilities for better debugging, model tuning, and project management.


Access to W&B

If you do not already have a W&B account, please ask an admin to send you an invite.
We suggest creating a team that you can then log projects to; this also needs to be done by an admin.

Environment Variables

Setting the following environment variables will make things easier: it saves you from having to specify the host and API key explicitly, as detailed further down in this document.
import os

os.environ["WANDB_BASE_URL"] = "https://mural.wandb.io/"
os.environ["WANDB_API_KEY"] = "your-wandb-api-key"
Depending on how and where you set these, you may or may not need the quotes. Your API key is available at https://mural.wandb.io/authorize

W&B Installation & Authentication

To start using W&B Weave, you first need to install the Python package (if it's not already there). Note that wandb is a dependency of weave, so installing weave pulls it in as well.
pip install weave
Once it's installed, authenticate your user account by logging in through the CLI or SDK. (With Weave this step is mostly optional: weave.init will prompt you to log in if you are not already authenticated.) You should have received an email inviting you to sign up to the platform, after which you can obtain your API token from the "Settings" section under your profile.
wandb login --host <YOUR W&B HOST URL> <YOUR API TOKEN>
OR through Python:
wandb.login(host=os.getenv("WANDB_BASE_URL"), key=os.getenv("WANDB_API_KEY"))
In headless environments, you can instead define the WANDB_API_KEY environment variable.
Once you are logged in, you are ready to track your workflows!

Track and evaluate GenAI applications via W&B Weave

Weave is a lightweight toolkit for tracking and evaluating GenAI applications. The goal is to bring rigor, best practices, and composability to the inherently experimental process of developing GenAI applications, without introducing cognitive overhead.

Weave can be used to:
  • Log and debug model inputs, outputs, and traces
  • Build rigorous, apples-to-apples evaluations for language model use cases
  • Capture valuable feedback that can be used to build new training and evaluation sets
  • Organize all the information generated across the LLM workflow, from experimentation to evaluations to production
A quick-start guide to Weave can be found here.

Get started with W&B Weave - Basic Tracing

Once you have authenticated with W&B, you can start by creating a Weave project with the following command:
import weave

# Initializing with '<entity-name>/<project-name>' ensures the project is created
# under the relevant team (you may need to ask an admin to create the team first).
# If you have WANDB_ENTITY and WANDB_PROJECT set as environment variables,
# you won't need to specify them here.
weave.init('<entity-name>/<project-name>')
Now you can decorate the functions you want to track by adding the one-line decorator weave.op() to them.
Here's what an example script looks like (feel free to copy it into your IDE and run it):
import weave
from openai import OpenAI

client = OpenAI()

# Weave will track the inputs, outputs and code of this function
@weave.op()
def extract_dinos(sentence: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": """In JSON format extract a list of `dinosaurs`, with their `name`,
their `common_name`, and whether its `diet` is a herbivore or carnivore"""
            },
            {
                "role": "user",
                "content": sentence
            }
        ],
        response_format={"type": "json_object"}
    )
    # The model returns the JSON as a string
    return response.choices[0].message.content


# Initialize the weave project
weave.init('<team-name>/jurassic-park')

sentence = """I watched as a Tyrannosaurus rex (T. rex) chased after a Triceratops (Trike), \
both carnivore and herbivore locked in an ancient dance. Meanwhile, a gentle giant \
Brachiosaurus (Brachi) calmly munched on treetops, blissfully unaware of the chaos below."""

result = extract_dinos(sentence)
print(result)
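
Ops compose. If a function decorated with weave.op() calls other ops, Weave records the nested calls as children in the trace tree, which is where tracing becomes most useful for debugging. A short sketch building on the script above (the helper names and the top-level "dinosaurs" JSON key are assumptions about the model's output shape, not from the original guide):
import json

# Hypothetical helper: count the dinosaurs in the extracted JSON
@weave.op()
def count_dinos(dino_data: str) -> int:
    return len(json.loads(dino_data)["dinosaurs"])

# Calling ops from inside another op produces a nested trace in the Weave UI
@weave.op()
def dino_tracker(sentence: str) -> int:
    dino_data = extract_dinos(sentence)
    return count_dinos(dino_data)

print(dino_tracker(sentence))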

Integrations

Weave provides automatic logging integrations for popular LLM providers and orchestration frameworks. These integrations allow you to seamlessly trace calls made through various libraries, enhancing your ability to monitor and analyze your AI applications, even without explicitly using the weave.op() decorator. We integrate with the following LLM Providers & Frameworks:
  • Amazon Bedrock
  • Anthropic
  • Cerebras
  • Cohere
  • Google
  • LiteLLM
  • MistralAI
  • OpenAI
  • OpenAI Agents SDK
  • LangChain
  • LlamaIndex
And more! Read the full list here

Example Integration Usage

Here, we're automatically tracking all calls to OpenAI.
from openai import OpenAI

import weave

# PROJECT is your '<entity-name>/<project-name>' string from earlier
weave.init(PROJECT)

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {
            "role": "system",
            "content": "You are a grammar checker, correct the following user input.",
        },
        {"role": "user", "content": "That was so easy, it was a piece of pie!"},
    ],
    temperature=0,
)
generation = response.choices[0].message.content
print(generation)
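
The other integrations work the same way: initialize Weave before constructing the client, and calls made through the library are traced automatically. For example, with Anthropic (a sketch assuming the anthropic package is installed and ANTHROPIC_API_KEY is set; the model name is illustrative):
import anthropic

import weave

weave.init(PROJECT)

# Calls made through the Anthropic client are traced automatically
client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=256,
    messages=[{"role": "user", "content": "Correct this: it was a piece of pie!"}],
)
print(message.content[0].text)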


Spinning up an Evaluation [in progress]

Evaluation-driven development helps you reliably iterate on an application. The Evaluation class is designed to assess the performance of a Model on a given Dataset or set of examples using scoring functions.
See a preview of the API below:
import asyncio

import weave
from weave import Evaluation

# Define any custom scoring function
@weave.op()
def exact_match(expected: str, output: str) -> dict:
    # Here is where you'd define the logic to score the model output
    return {"match": expected == output}


# Score your examples using scoring functions
evaluation = Evaluation(
    dataset=dataset,  # can be a list of dictionaries or a weave.Dataset object
    scorers=[exact_match],  # can be a list of scoring functions
)

# Start tracking the evaluation
weave.init(PROJECT)

# Run the evaluation; 'corrector' is the model or op under test
print(asyncio.run(evaluation.evaluate(corrector)))

# If you're in a Jupyter notebook, run instead:
# await evaluation.evaluate(corrector)
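The preview above assumes that dataset and corrector are already defined. Here is a minimal sketch of what they could look like, reusing the grammar-checker idea from the integrations example (the rows and names are illustrative, not from the original guide):
from openai import OpenAI

client = OpenAI()

# Each row's keys ('text', 'expected') must line up with the parameter names
# of the model under test and of the scoring functions
dataset = [
    {"text": "That was so easy, it was a piece of pie!",
     "expected": "That was so easy, it was a piece of cake!"},
]

# The model under test: Weave passes each row's 'text' to this op
@weave.op()
def corrector(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a grammar checker, correct the following user input."},
            {"role": "user", "content": text},
        ],
        temperature=0,
    )
    return response.choices[0].message.content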
Follow the Build an Evaluation pipeline tutorial to learn more about Evaluation and begin iteratively improving your applications.


Tracking Objects

Prompt Tracking
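
A minimal sketch of prompt tracking, based on the weave.StringPrompt pattern from the Weave docs (the prompt text and published name below are illustrative, not from this guide):
import weave

weave.init('<entity-name>/<project-name>')

# Publish a versioned prompt object; edits create new versions under the same name
system_prompt = weave.StringPrompt("You are a grammar checker, correct the following user input.")
weave.publish(system_prompt, name="grammar_prompt")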

Tracking Models
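
A sketch of the weave.Model pattern from the Weave docs: subclass weave.Model, declare the attributes you want versioned, and implement a predict method decorated with weave.op() (the class and values below are illustrative):
import weave
from openai import OpenAI

class GrammarCorrector(weave.Model):
    # Attribute changes produce new model versions in Weave
    model_name: str
    system_prompt: str

    @weave.op()
    def predict(self, text: str) -> str:
        client = OpenAI()
        response = client.chat.completions.create(
            model=self.model_name,
            messages=[
                {"role": "system", "content": self.system_prompt},
                {"role": "user", "content": text},
            ],
        )
        return response.choices[0].message.content

weave.init('<entity-name>/<project-name>')
corrector = GrammarCorrector(
    model_name="gpt-4o-mini",
    system_prompt="You are a grammar checker, correct the following user input.",
)
print(corrector.predict("That was so easy, it was a piece of pie!"))
A weave.Model can also be passed directly to Evaluation.evaluate in place of a plain op.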

Tracking Datasets
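
A sketch of dataset tracking with weave.Dataset (the rows below are illustrative):
import weave

weave.init('<entity-name>/<project-name>')

# Create and publish a versioned dataset
dataset = weave.Dataset(
    name="grammar-examples",
    rows=[
        {"text": "it was a piece of pie!", "expected": "it was a piece of cake!"},
        {"text": "Their going to the store.", "expected": "They're going to the store."},
    ],
)
weave.publish(dataset)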

Retrieve Published Objects & Ops
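
Published objects and ops can be fetched back by name with weave.ref (a sketch, assuming the dataset published in the section above):
import weave

weave.init('<entity-name>/<project-name>')

# Fetch the latest version of a published object by name
dataset = weave.ref("grammar-examples").get()
print(dataset.rows[0])

# A specific version can be addressed with a full ref URI of the form
# weave:///<entity>/<project>/object/<name>:<version>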

Adding Programmatic Feedback
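
A sketch based on the Weave feedback API (the call ID is a placeholder you would copy from the trace view in the UI; method names follow the Weave docs and may differ slightly between versions):
import weave

client = weave.init('<entity-name>/<project-name>')

# Fetch an existing call by its ID
call = client.get_call("<call-id>")

# Attach programmatic feedback to the call
call.feedback.add_reaction("👍")
call.feedback.add("correctness", {"value": 5})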

Add human annotations

FAQs

The following covers common questions about Weave tracing; see the Weave documentation for the full answers.

What information does Weave capture for a function?

How can I disable code capture?

How can I disable system information capture?

How can I disable client information capture?

Will Weave affect my function's execution speed?

How do I render Markdown in the UI?
