W&B Weave updates: Track custom usage and costs, new RAG course, and more
This edition of the W&B Weave newsletter comes complete with our latest features, a brand-new course, and our first-ever hackathon
Created on September 25 | Last edited on September 25
We just wrapped up our San Francisco W&B Weave hackathon where we saw a ton of great LLM evaluation projects from really talented engineers. We’ll have more on that soon, but we wanted to share the improvements that we’ve made to W&B Weave over the past two weeks, plus introduce our newest course on RAG.
But as always, we’ll kick off with our tip of the week:
LLM tip of the week
With the release of OpenAI’s o1-preview and o1-mini models, we all need to update our prompting strategies. o1 takes no system prompt and works best with simple, direct prompts (no chain-of-thought instructions or XML tags necessary). OpenAI shared some helpful specifics on how reasoning works for o1.
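To make the shift concrete, here is an illustrative sketch in the standard OpenAI chat-message format (the prompts themselves are made-up examples, not from OpenAI's guidance):

```python
# Old style for GPT-4-class models: a system prompt plus explicit
# chain-of-thought scaffolding in the user message.
gpt4_messages = [
    {"role": "system", "content": "You are an expert mathematician."},
    {
        "role": "user",
        "content": (
            "Think step by step, wrapping your reasoning in <thinking> "
            "tags, then answer: what is 17 * 24?"
        ),
    },
]

# o1 style: no system message and no chain-of-thought instructions.
# A single, direct user prompt works best; the model reasons internally.
o1_messages = [
    {"role": "user", "content": "What is 17 * 24?"},
]

# The o1 request contains no system role at all.
assert all(m["role"] != "system" for m in o1_messages)
```

The same lists can be passed as the `messages` argument to a chat completions call; the only change for o1 is dropping the system message and the reasoning scaffolding.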
Courses
Our newest course, RAG++: From POC to Production, is now live in the AI Academy. We’ve distilled lessons from 18 months of running our own customer support bot into a one-hour course. Learn how to increase reliability, evaluate systematically, and use the latest RAG techniques. We’ve also partnered with Cohere to offer free LLM credits for participants. And, as always, the course is completely free.
Register for our RAG++ course
Product news
Custom cost tracking
Along with automatically tracking usage and cost from LLM providers, you can now easily track custom usage and costs for any LLM using the add_cost method. In addition, you can query project-level costs using the new APIs.
import weave
from datetime import datetime

client = weave.init("my_custom_cost_model")

# Set a custom cost for a model
client.add_cost(
    llm_id="your_model_name",
    prompt_token_cost=0.01,
    completion_token_cost=0.02,
)

# If, for example, you want to raise the price of the model after a certain date
client.add_cost(
    llm_id="your_model_name",
    prompt_token_cost=10,
    completion_token_cost=20,
    effective_date=datetime(2025, 4, 22),
)
Popular blogs
Story illustration
We got some great feedback about our story illustration blog last week, so we decided to make it free to run with Flux and GPT-4 until Friday. Just give it your favorite story and get back beautiful, consistent illustrations. There’s a Colab in the blog that will walk you through the whole process.
o1 benchmarking
We benchmarked OpenAI’s new o1-preview against the AI Hacker Cup challenges and saw a substantial lift in performance on these extremely difficult coding challenges while consuming 2x the number of tokens.
How to train and evaluate an LLM router
Not every part of every problem requires a bleeding-edge model. You can save on time—and cost—by building an LLM router that directs different queries to individual models based on their complexity. This blog explores how to build an LLM router and how you can evaluate response quality with W&B Weave.
Events
Fully Connected Tokyo
Fully Connected is a conference for the builders pioneering the generative AI industry. Learn from foundation model builders, enterprises fine-tuning LLMs, and developers deploying GenAI applications. We hope you can join us on October 10th in Tokyo.
GenAI salon
Join us in person in San Francisco with Jerry Liu (CEO of LlamaIndex) to hear about the building blocks of advanced research assistants. We’ll also be hosting Ben Firshman (Founder & CEO at Replicate) who will dig into the Replicate story, focusing on two main concepts: how he and the Replicate team grew their business to millions of users, and what kinds of products and projects they are seeing people build with AI. Come for the event, stay for the happy hour afterwards. We kick off October 17th at 5pm PT.
Community
Roast my docs
From will-wright-eng, this repo uses LLMs to judge the activation energy for your project, based on your docs. You can try it here.
Judgment day hackathon
The first-ever Weights & Biases San Francisco hackathon happened over the weekend, complete with an LLM-powered knowledge graph (featuring Prolog), a great prompt optimizer app, a study measuring the creativity of LLMs, and a pairwise evaluation of LLM-generated jokes. Check out this recap video from Alex Volkov.
Need help getting started with W&B Weave?