Weave newsletter: Multimodal LLMs, RAG tutorial, and OpenAI DevDay
We're always improving W&B Weave. This week, we're pleased to announce audio file support as well as a host of hands-on events and tutorials.
Created on October 23|Last edited on October 23
Welcome to the latest Weights & Biases GenAI newsletter. This week, we’ve got new audio functionality in W&B Weave, a great RAG tutorial, and many upcoming events in SF, London, and Seattle. But as always, let’s start with a tip:
LLM tip of the week ⭐
Stochasticity in generations is the price of using LLMs. A recent evaluation we ran had the following scores: 78%, 77%, 75%, 84% and 70%. That’s why we recommend you always run multiple trials when evaluating your LLM outputs and average across them to get a better overview of performance. Also consider running evaluations at different LLM temperatures to simulate unexpected inputs in production.
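To make the tip concrete, here is a minimal sketch in plain Python (not a Weave API) that summarizes the five trial scores quoted above. The helper name `summarize_trials` is hypothetical and only for illustration; in practice the scores would come from repeated runs of your own evaluation.

```python
from statistics import mean, stdev

# Accuracy (in %) from five runs of the same evaluation, as quoted in the tip.
trial_scores = [78, 77, 75, 84, 70]


def summarize_trials(scores):
    """Summarize repeated evaluation runs (hypothetical helper, for illustration)."""
    return {
        "mean": mean(scores),                 # average across trials
        "stdev": stdev(scores),               # sample standard deviation
        "min": min(scores),                   # worst trial
        "max": max(scores),                   # best trial
        "spread": max(scores) - min(scores),  # gap between best and worst trial
    }


summary = summarize_trials(trial_scores)
print(f"mean={summary['mean']:.1f}%  stdev={summary['stdev']:.1f}  "
      f"range={summary['min']}-{summary['max']}%")
```

For these five scores the mean is 76.8%, while the best and worst trials sit 14 points apart, which is why a single run can be misleading.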
Product news 🚀
📣 Audio formats now supported in W&B Weave
We’re continuing to add modalities to help teams build with W&B Weave. Weave now supports audio file formats as inputs and outputs (as well as within dataset entries), which, alongside support for images and text, gives you the ability to understand multimodal models far better.
To try it out, simply update to the latest Weave SDK. Also, keep an eye out for an upcoming integration with OpenAI’s Realtime API.
🏷 Custom evaluation naming in W&B Weave
We’ve made it easier to organize your GenAI application evaluations with custom naming. You can now set a display name for each evaluation in your code, which makes evaluations easier to find and compare.
Popular blogs 📑
How to build a RAG system
Based on our RAG++ course, Bharat gives you the most important aspects to consider when trying to move your RAG app beyond a POC. He covers query enhancement, response validation, and making strategic 80/20 decisions about where to invest next.
Building reliable apps with GPT-4o
Brett teaches you how to enforce consistency on GPT-4o outputs and build reliable GenAI apps with three real-world examples: categorizing research papers, structuring restaurant menu data for a RAG system, and converting voice commands.
Events 🏢
🌉 AI Tinkerers - Halloween edition meetup - October 22 at 5:30pm PT in SF
Get ready for a spooktacular gathering of active builders. Come hang out with SF’s top AI engineers, hackers, researchers, and technical founders.
🌉 AI for developers user group meetup - October 23 at 6:30pm PT in SF
Don't miss this chance to network with fellow developers, learn about the latest AI trends, and boost your skills in generative AI development.
🌌 AI Tinkerers meetup - October 24 at 5pm PT in Seattle
We’re heading up the coast to Seattle for this meetup. Join a group of active practitioners who are passionate about developing and implementing AI technologies.
🏰 OpenAI DevDay London pre-party - October 29 at 6:00pm in London
Come hang out ahead of OpenAI’s London DevDay and see demos of what people have built with OpenAI's latest tools.
🌉 AI Tinkerers humans-in-the-loop agents hackathon with Google Cloud @ Weights & Biases - November 2-3 in SF
Hosted by Weights & Biases, this hackathon centers on building cutting-edge human-in-the-loop AI agents.
🌉 GenAI master class: from prototypes to production - November 4 at 5:30pm PT in SF
Join us in person in our San Francisco office for a master class that will equip you with the practical skills to build and deploy GenAI solutions.
Community 💡
SimpleBench, powered by W&B Weave, is a challenging leaderboard for LLMs. Claude 3.5 Sonnet currently tops the rankings at 27%. Check it out.
Kollektiv helps LLMs tap into up-to-date information on libraries, tools, and frameworks. Kollektiv will parse the docs of your favorite libraries, store and embed them in local vector storage, and set up an LLM chat to explore them.
Need help getting started with W&B Weave?