Kaplan PoV Guide
One-stop shop for everything you need to test out during the W&B Pilot.
💡 For any questions, please reach out via the Slack channel: #wandb-kaplan
💡 Link to Colab Notebook with helpful Weave scripts: https://colab.research.google.com/drive/1xz-OSmvAVXyZGVQXtRkybT8wuyY18zdd?usp=sharing
Contents
- Weights and Biases (W&B) 💫
- PoC Workshop Sessions
- Quick Documentation Links
- W&B Installation & Authentication
- Sample Script to log First Trace (validated on Kick-off call)
- Use Cases / Test Cases
- Test Case 1: Complete Visibility into all input/output calls when building a RAG pipeline
- Test Case 2: Automate Evaluations with W&B Weave
- Test Case 3: Visibility into latency, costs, and token counts
- Test Case 4: Prompt Management
- Test Case 5: Model Safety with Hallucinations
- Track and evaluate GenAI applications via W&B Weave
- FAQs
- W&B Weave
Weights and Biases (W&B) 💫
Weights and Biases is an MLOps platform built to facilitate collaboration and reproducibility across the machine learning development lifecycle. Machine learning projects can quickly become a mess without best practices in place to aid developers and scientists as they iterate on models and move them to production.
W&B is lightweight enough to work with whatever framework or platform teams are currently using, but enables teams to quickly start logging their important results to a central system of record. On top of this system of record, W&B has built visualization, automation, and documentation capabilities for better debugging, model tuning, and project management.
PoC Workshop Sessions
Date | Session | Recording Link | Topics Discussed |
---|---|---|---|
May 05, 2025 | PoV Kickoff Call | https://us-39259.app.gong.io/e/c-share/?tkn=so3vck3g60fzh8x1ya753z9q | Align on success criteria and the PoV plan, set up the team, and log a first trace |
Quick Documentation Links
W&B Installation & Authentication
To start using W&B, you first need to install the Python packages (if they're not already installed):
pip install wandb weave
Once they're installed, authenticate your user account by logging in through the CLI or the SDK. You should have received an email to sign up to the platform, after which you can obtain your API token (the API token is in the "Settings" section under your profile):
wandb login --host <YOUR W&B HOST URL> <YOUR API TOKEN>
OR through Python:
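A minimal SDK equivalent (same host URL and API token as the CLI command above):

import wandb

# Log in via the Python SDK; host and key mirror the CLI flags above
wandb.login(host="<YOUR W&B HOST URL>", key="<YOUR API TOKEN>")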
Once you are logged in, you are ready to track your workflows!
Sample Script to log First Trace (validated on Kick-off call)
Following the Quickstart guide in the docs, adapted for the Kaplan team.
Authentication:
!pip install wandb weave
import wandb
import weave
Script Execution:
import weave
from openai import OpenAI

client = OpenAI()

# Weave will track the inputs, outputs and code of this function
@weave.op()
def extract_dinos(sentence: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": """In JSON format extract a list of `dinosaurs`, with their `name`,
their `common_name`, and whether its `diet` is a herbivore or carnivore"""
            },
            {"role": "user", "content": sentence}
        ],
        response_format={"type": "json_object"}
    )
    return response.choices[0].message.content

# Initialise the weave project
weave.init('kaplan_ai_ml/jurassic-park')

sentence = """I watched as a Tyrannosaurus rex (T. rex) chased after a Triceratops (Trike), \
both carnivore and herbivore locked in an ancient dance. Meanwhile, a gentle giant \
Brachiosaurus (Brachi) calmly munched on treetops, blissfully unaware of the chaos below."""

result = extract_dinos(sentence)
Use Cases / Test Cases
S No | Capability & Success Criteria |
---|---|
1 | Complete Visibility into all input/output calls when building a RAG pipeline |
2 | Automate Evaluations with W&B Weave |
3 | Visibility into latency, costs, and token counts |
4 | Prompt Management |
5 | Model Safety with Hallucinations |
Test Case 1: Complete Visibility into all input/output calls when building a RAG pipeline
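A minimal sketch to start from: decorating each stage of a RAG pipeline with @weave.op() makes the retrieval and generation steps (and the OpenAI call inside) appear as nested traces with full inputs and outputs. The toy corpus, keyword retriever, and project name below are illustrative placeholders.

import weave
from openai import OpenAI

weave.init('kaplan_ai_ml/rag-demo')  # placeholder project name
client = OpenAI()

docs = [
    "W&B Weave traces every call to a function decorated with @weave.op().",
    "Kaplan is piloting Weights & Biases.",
]

@weave.op()
def retrieve(query: str) -> list:
    # Naive keyword match; a real pipeline would query a vector store
    return [d for d in docs if any(w.lower() in d.lower() for w in query.split())]

@weave.op()
def generate(query: str, context: list) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": f"Answer using only this context: {context}"},
            {"role": "user", "content": query},
        ],
    )
    return response.choices[0].message.content

@weave.op()
def rag_pipeline(query: str) -> str:
    # Top-level op: its trace nests the retrieve and generate calls below it
    return generate(query, retrieve(query))

print(rag_pipeline("What does Weave trace?"))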
Test Case 2: Automate Evaluations with W&B Weave
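A minimal sketch of the weave.Evaluation API: dataset rows are passed to the model and to scorers by matching parameter names, and each scorer receives the model's result through its output parameter. The toy dataset, scorer, and stand-in model are placeholders.

import asyncio
import weave

weave.init('kaplan_ai_ml/eval-demo')  # placeholder project name

dataset = [
    {"question": "What is 2 + 2?", "expected": "4"},
    {"question": "What is the capital of France?", "expected": "Paris"},
]

@weave.op()
def exact_match(expected: str, output: str) -> dict:
    # Scorer: compares the model's answer against the row's expected answer
    return {"correct": expected.strip().lower() in output.strip().lower()}

@weave.op()
def model(question: str) -> str:
    # Stand-in model; replace with a real LLM call
    return "4" if "2 + 2" in question else "Paris"

evaluation = weave.Evaluation(dataset=dataset, scorers=[exact_match])
asyncio.run(evaluation.evaluate(model))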
Test Case 3: Visibility into latency, costs, and token counts
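For supported providers such as OpenAI, Weave captures latency and token counts automatically once weave.init() has been called, and estimates cost for known models from those token counts; the values appear on each call in the Traces table. A minimal sketch (project name is a placeholder):

import weave
from openai import OpenAI

weave.init('kaplan_ai_ml/latency-demo')  # placeholder project name
client = OpenAI()  # patched by weave.init: calls are traced automatically

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
# Latency, prompt/completion token counts, and estimated cost now show on this call's trace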
Test Case 4: Prompt Management
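A minimal sketch using weave.StringPrompt and weave.publish(): the prompt is stored as a versioned object that can be fetched and formatted anywhere in your code. The prompt text, object name, and project name are placeholders.

import weave

weave.init('kaplan_ai_ml/prompt-demo')  # placeholder project name

# Publish a prompt as a versioned Weave object
prompt = weave.StringPrompt("Summarize the following passage for a {grade_level} student: {passage}")
weave.publish(prompt, name="summarize-prompt")

# Retrieve the latest published version elsewhere in your code
fetched = weave.ref("summarize-prompt").get()
print(fetched.format(grade_level="5th grade", passage="..."))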
Test Case 5: Model Safety with Hallucinations
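Weave ships predefined scorers (see the FAQs below), and you can also write your own. A minimal LLM-as-judge sketch that flags answers unsupported by their source context; the judge prompt and project name are illustrative, and the scorer plugs into a weave.Evaluation exactly as in the Test Case 2 sketch.

import weave
from openai import OpenAI

weave.init('kaplan_ai_ml/safety-demo')  # placeholder project name
client = OpenAI()

@weave.op()
def hallucination_judge(context: str, output: str) -> dict:
    # LLM-as-judge: asks a second model whether the answer is grounded in the context
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": (
                f"Context: {context}\nAnswer: {output}\n"
                "Reply YES if the answer is fully supported by the context, otherwise reply NO."
            ),
        }],
    )
    return {"grounded": "YES" in response.choices[0].message.content.upper()}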
Track and evaluate GenAI applications via W&B Weave

Weave is a lightweight toolkit for tracking and evaluating GenAI applications.
The goal is to bring rigor, best practices, and composability to the inherently experimental process of developing GenAI applications, without introducing cognitive overhead.

Weave can be used to:
- Log and debug model inputs, outputs, and traces
- Build rigorous, apples-to-apples evaluations for language model use cases
- Capture valuable feedback that can be used to build new training and evaluation sets
- Organize all the information generated across the LLM workflow, from experimentation to evaluations to production
FAQs
1. Why can't I log in to W&B?
Ans: Make sure you have specified the host when logging in from a script, or set the host base URL as an environment variable.
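For example, to set the host via environment variable (set it before calling wandb.login() or weave.init()):

import os
os.environ["WANDB_BASE_URL"] = "<YOUR W&B HOST URL>"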
W&B Weave
1. How does Tracing with W&B Weave work?
Ans: Call weave.init(<project>) once, then decorate any Python function with @weave.op(). Every call to that function is logged as a trace capturing its inputs, outputs, and code, with nested ops shown as child calls (see the Sample Script above).
2. How can I add a custom cost for my GenAI model?
You can add a custom cost by using the add_cost method. This guide walks you through the steps of adding a custom cost, and there is also a cookbook on setting up a custom cost model with an associated notebook.
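A minimal sketch, with illustrative per-token prices (check your provider's actual pricing):

import weave

client = weave.init('kaplan_ai_ml/costs-demo')  # placeholder project name
client.add_cost(
    llm_id="gpt-4o",
    prompt_token_cost=0.0000025,     # illustrative $/token, not official pricing
    completion_token_cost=0.00001,   # illustrative $/token, not official pricing
)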
3. How can I create my own custom Scorers with W&B Weave?
W&B Weave has its own predefined scorers that you can use, and you can also create your own, as in the sketch below. This documentation walks through creating custom scorers with W&B Weave.
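For example, a minimal class-based sketch (the keyword check is illustrative; scorers can also be plain @weave.op functions, as in the Test Case 2 sketch above):

import weave

class KeywordScorer(weave.Scorer):
    keyword: str

    @weave.op()
    def score(self, output: str) -> dict:
        # Checks whether the model output contains the configured keyword
        return {"contains_keyword": self.keyword.lower() in str(output).lower()}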
4. Can I control/customize the data that is logged?
Yes. If you want to change the data that is logged to Weave without modifying the original function (e.g., to hide sensitive data), you can pass postprocess_inputs and postprocess_output to the op decorator, as sketched below.
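A minimal sketch (the function and the redacted field name are illustrative):

import weave

def redact_inputs(inputs: dict) -> dict:
    # Drop sensitive fields before they are logged to Weave
    return {k: v for k, v in inputs.items() if k != "ssn"}

@weave.op(postprocess_inputs=redact_inputs)
def lookup_student(name: str, ssn: str) -> str:
    # The function still receives ssn; only the logged trace omits it
    return f"Record found for {name}"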
5. How do I publish prompts to W&B Weave?
W&B Weave supports prompts as first-class objects. You can use weave.publish() to log prompts, or any other object (e.g., Datasets, Models), to Weave. This guide goes into detail on publishing prompts to W&B Weave; see also the sketch under Test Case 4 above.