
Quickstart: Creating your first trace in W&B Weave

Weave is a toolkit for developing AI-powered applications.

You can use Weave to:
  • Log and debug language model inputs, outputs, and traces
  • Build rigorous, apples-to-apples evaluations for language model use cases
  • Organize all the data generated across the LLM workflow, from experimentation to evaluation to production
Start by logging a trace—either from the playground or right from your code. Visit our documentation to learn more.
  1. Set up the Weave library
    Install the CLI and Python library for interacting with Weave and W&B.

    pip install wandb weave

    Next, log in to W&B and paste your API key when prompted.

    wandb login

    You can also set your API key with the following environment variable.

    import os
    os.environ['WANDB_API_KEY'] = 'your_api_key'
  2. Log a trace with code or the playground
    Start tracking inputs and outputs of functions by decorating them with weave.op().
    Run this sample code to see the new trace.

    This example uses the OpenAI API; make sure your OpenAI API key is available in your environment.
    Using another provider? We support all major clients and frameworks; see the provider-swap sketch after the snippet.

    # Ensure your OpenAI client is available with:
    # pip install openai
    # Ensure that your OpenAI API key is available at:
    # os.environ['OPENAI_API_KEY'] = "<your_openai_api_key>"
    import os

    import weave
    from openai import OpenAI

    weave.init('events/SLURM')  # 🐝

    @weave.op()  # 🐝 Decorator to track requests
    def create_completion(message: str) -> str:
        client = OpenAI()
        response = client.chat.completions.create(
            model="gpt-5",
            messages=[
                {"role": "system", "content": "You are a helpful assistant."},
                {"role": "user", "content": message}
            ],
        )
        return response.choices[0].message.content

    message = "Tell me a joke."
    create_completion(message)
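
    Swapping providers follows the same pattern: call weave.init() once, and decorate the functions you want traced with weave.op(). Below is a minimal sketch using the Anthropic client instead of OpenAI; it assumes the anthropic package is installed and ANTHROPIC_API_KEY is set, and the model name is only a placeholder.

    # Hypothetical provider swap: pip install anthropic
    # Assumes os.environ['ANTHROPIC_API_KEY'] is set.
    import weave
    from anthropic import Anthropic

    weave.init('events/SLURM')  # 🐝

    @weave.op()  # 🐝 Same decorator; the Anthropic call inside is traced
    def create_completion(message: str) -> str:
        client = Anthropic()
        response = client.messages.create(
            model="claude-sonnet-4-20250514",  # placeholder model name
            max_tokens=256,
            messages=[{"role": "user", "content": message}],
        )
        return response.content[0].text

    create_completion("Tell me a joke.")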
  3. Run your first evaluation
    Evaluate a simple JSON QA task with Weave evaluators. Pick your provider below and run the snippet to log results; a sketch after the snippet shows how to run the same evaluation on your own rows.

    This example uses the OpenAI API; make sure your OpenAI API key is available in your environment.
    Using another provider? We support all major clients and frameworks.

    # Ensure your OpenAI client and pandas are installed with:
    # pip install openai pandas
    # Ensure that your OpenAI API key is available at:
    # os.environ['OPENAI_API_KEY'] = "<your_openai_api_key>"
    import asyncio
    import os
    import re
    from textwrap import dedent

    import openai
    import weave

    class JsonModel(weave.Model):
        prompt: weave.Prompt = weave.StringPrompt(
            dedent("""
            You are an assistant that answers questions about JSON data provided by the user.
            The JSON data represents structured information of various kinds, and may be deeply nested.
            In the first user message, you will receive the JSON data under a label called 'context',
            and a question under a label called 'question'. Your job is to answer the question with
            as much accuracy and brevity as possible. Give only the answer with no preamble.
            You must output the answer in XML format, between <answer> and </answer> tags.
            """)
        )
        model: str = "gpt-4.1-nano"
        _client: openai.OpenAI

        def __init__(self):
            super().__init__()
            self._client = openai.OpenAI()

        @weave.op
        def predict(self, context: str, question: str) -> str:
            response = self._client.chat.completions.create(
                model=self.model,
                messages=[
                    {"role": "system", "content": self.prompt.format()},
                    {
                        "role": "user",
                        "content": f"Context: {context}\nQuestion: {question}",
                    },
                ],
            )
            assert response.choices[0].message.content is not None
            return response.choices[0].message.content

    @weave.op
    def correct_answer_format(answer: str, output: str) -> dict[str, bool]:
        parsed_output = re.search(r"<answer>(.*?)</answer>", output, re.DOTALL)
        if parsed_output is None:
            return {"correct_answer": False, "correct_format": False}
        return {"correct_answer": parsed_output.group(1) == answer, "correct_format": True}

    if __name__ == "__main__":
        if not os.environ.get('OPENAI_API_KEY'):
            print("OPENAI_API_KEY is not set - make sure to export it in your environment or assign it in this script")
            exit(1)

        weave.init("events/SLURM")

        jsonqa = weave.Dataset.from_uri(
            "weave:///wandb/json-qa/object/json-qa:v3"
        ).to_pandas()

        model = JsonModel()
        eval = weave.Evaluation(
            name="json-qa-eval",
            dataset=weave.Dataset.from_pandas(jsonqa),
            scorers=[correct_answer_format],
        )
        asyncio.run(eval.evaluate(model))
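
    To evaluate your own examples instead of the published json-qa dataset, you can build a weave.Dataset from in-memory rows. The sketch below is illustrative, not part of the official snippet: it assumes JsonModel and correct_answer_format from above are already defined, and the single row is made-up sample data. Row keys must match the keyword arguments of predict, plus any columns the scorer reads (here, answer).

    import asyncio
    import weave

    weave.init("events/SLURM")

    # Hypothetical in-memory dataset; replace the row with your own examples.
    dataset = weave.Dataset(
        name="my-json-qa-sample",
        rows=[
            {
                "context": '{"city": "Berlin", "population": 3645000}',
                "question": "What is the population?",
                "answer": "3645000",
            },
        ],
    )

    eval = weave.Evaluation(
        name="json-qa-eval-custom",
        dataset=dataset,
        scorers=[correct_answer_format],  # scorer defined in the snippet above
    )
    asyncio.run(eval.evaluate(JsonModel()))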
  4. Get started with Playground
    You can interactively develop, review, and test your prompts using our LLM playground, which supports all major model providers.