Introducing W&B Prompts
Learn all about W&B's new LLMOps tools and how to use them
Created on April 20 | Last edited on June 2
Introduction
We are very excited to announce the release of W&B Prompts, a new suite of LLMOps tools. You can try Prompts here.

We’re also delighted to announce other features to support the new prompt engineering use case:
- A one-line LangChain integration for effortless logging of LangChain models, inputs, and outputs
- A one-line OpenAI integration to log OpenAI model inputs and outputs (a minimal sketch of both integrations follows this list)
- Improved handling of text in W&B Tables, including Markdown rendering, long-text scrolling, and string diffs
- A new W&B JavaScript SDK to track prompt exploration as well as model generations in production
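As a rough illustration of the two one-line integrations above, here is a minimal Python sketch. The environment variable and import path reflect the integrations as documented around this release, and the project name is hypothetical, so check the current docs before relying on them.

```python
# Minimal sketch: one-line W&B integrations for LangChain and OpenAI.
# The env var and import path are assumptions based on the docs at launch.
import os

# LangChain: a single environment variable turns on W&B tracing for chain runs.
os.environ["LANGCHAIN_WANDB_TRACING"] = "true"
os.environ["WANDB_PROJECT"] = "llm-experiments"  # hypothetical project name

# OpenAI: one call autologs model inputs and outputs to W&B.
from wandb.integration.openai import autolog

autolog({"project": "llm-experiments"})
```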
Finally, to contribute to and help grow the LLM ecosystem, we’re also:
- Open-sourcing WandBot, a GPT-4-powered support bot we’ve been running in production (you can read more about how we built it here)
W&B Prompts - LLM Debugging Tools
The prompt engineering use case comes with unique needs and pain points that differ from those of traditional ML practitioners. Prompt engineers:
- Compose and interact with different model topologies
- Iterate on conceptual variables like prompts and context
- Drive up the quality and acceptance rate of responses, rather than driving down loss metrics
- Debug the program built around an LLM, rather than an out-of-control gradient in a deep neural network
To support this new prompt engineering use case, W&B records execution traces and tracks all of a user's experimentation activity; without W&B, this information is often inaccessible or even lost. This allows you to easily review past results, identify and debug errors, glean insights about model behavior, and share learnings with colleagues.
The Trace Timeline provides a graphical view of every step in your program's execution, including how internal components interact. Click into any part of the visualization to drill down into that component and understand what happened and where errors might have occurred. The Trace Table provides a holistic view of all your traces, with details on inputs, outputs, chains, and errors. Tables are easy to export, so you can collaborate and iterate intelligently in real time.
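Underneath these views, each run is a tree of spans. Here is a rough sketch of how a trace might be logged by hand with the Python SDK; the `Trace` import path, the `kind` values, and the project name are assumptions based on the Prompts quickstart, so verify them against the current docs.

```python
import wandb
from wandb.sdk.data_types.trace_tree import Trace  # assumed import path

wandb.init(project="prompts-demo")  # hypothetical project name

# A root span for the whole chain, with a nested span for the LLM call.
root_span = Trace(
    name="qa_chain",
    kind="chain",
    inputs={"question": "What is W&B Prompts?"},
    outputs={"answer": "A suite of LLMOps tools."},
)
llm_span = Trace(
    name="llm",
    kind="llm",
    inputs={"prompt": "Answer concisely: What is W&B Prompts?"},
    outputs={"completion": "A suite of LLMOps tools."},
)
root_span.add_child(llm_span)

# Logging the root span populates the Trace Timeline and Trace Table views.
root_span.log(name="trace")
wandb.finish()
```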

Prompt engineering also requires understanding how chain components were configured as part of your iteration and analysis. The Model Architecture viewer provides a detailed description of all the settings, tools, and agents within the topology of the chain.
Try Prompts
OpenAI Evals with W&B Launch
Evaluating large language models for your own use cases is one of the most challenging, ambiguous, and fast-evolving parts of these new LLM-driven workflows.
OpenAI Evals is a fast-growing repository with dozens of evaluation suites for LLMs. Using W&B Launch, users can run any evaluation from OpenAI Evals with just one click, then visualize and share the results with W&B.
Launch packages up everything you need to run the evaluation job, so there's no more worrying about cloning repos, configuring Python environments, or installing dependencies. W&B automatically logs the evaluation in W&B Tables and generates a report, with a chat-style representation of system prompts, user prompts, and responses. You can also set up triggers to automatically run evaluations when new model versions are pushed to the Model Registry.
Try the W&B OpenAI Evals integration

W&B Tables - UX Improvements
W&B Tables has been a core part of the W&B platform, giving users an easy way to visualize and analyze their machine learning model predictions and underlying datasets. To better support users working with text data, we’ve made several improvements to how we display text in Tables (a minimal logging sketch follows the list):
- Markdown: Users can now render Markdown in table cells
- Diffing between strings: Users can now display the diff between two strings, to quickly see the differences in their LLM prompts
- Long-form content: Tables now provides better support for long text fields, with scrolling in cells as well as string pop-ups on hover
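Here is a minimal sketch of logging prompt text to a Table; the column and project names are hypothetical, and the rendering, scrolling, and diffing all happen in the Tables UI rather than in code.

```python
import wandb

wandb.init(project="prompt-tables-demo")  # hypothetical project name

# Log prompts and responses as plain strings; the Tables UI handles Markdown
# rendering, long-text scrolling, and diffing between string columns.
table = wandb.Table(columns=["prompt_v1", "prompt_v2", "response"])
table.add_data(
    "Summarize the report in **one** sentence.",
    "Summarize the report in **two** sentences.",
    "The report introduces W&B Prompts, a new suite of LLMOps tools.",
)
wandb.log({"prompt_comparison": table})
wandb.finish()
```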

W&B JavaScript SDK
The Weights & Biases you love is now in JavaScript! With our new W&B JavaScript SDK, users can trace LangChain by adding just a couple of lines to a script. As the chain executes, W&B captures each step taking place in the chain and visualizes it in the W&B UI. This should greatly simplify the process for the new wave of ML developers who are more familiar with working in JavaScript.
