OpenAI GPT OSS models on W&B Inference
Get started with GPT OSS models on our free tier.
Created on August 6|Last edited on August 6
W&B Inference, powered by CoreWeave, provides API and playground access to a growing number of open-source models. As of August 5th, we're pleased to announce that OpenAI’s open-weight models, GPT OSS 120B and GPT OSS 20B, are now available on Weights & Biases.
GPT OSS lets developers rapidly build agentic AI applications with integrated observability. We're proud to offer day-zero access at some of the industry's lowest token costs, all running on CoreWeave's purpose-built AI cloud. Test the new OpenAI models, fully hosted on the CoreWeave AI Cloud platform, through an OpenAI-compatible API, and rapidly evaluate, monitor, and iterate on your agentic applications with the W&B Weave tracing built into W&B Inference.

OpenAI’s first open-weight LLMs since GPT-2
We've seen open models perform competitively with name-brand models for a while now. But OpenAI's open-weight models raise the bar: they hit o4-mini and o3-mini-level scores on public benchmarks at a fraction of the cost. Plus, you can fully customize them.
GPT OSS 120B rivals o4-mini on reasoning benchmarks and still runs smoothly on a single 80 GB GPU. GPT OSS 20B matches o3-mini, yet it can run on devices with only 16 GB of RAM, making it ideal for mobile and on-device use. Both models handle tool use well, follow short prompts with ease, and excel at step-by-step reasoning.
These models bring the true power of billions of dollars in pretraining to everyone:
- 117B parameters (120B) or 21B (20B)
- Mixture-of-experts design activates just 5B / 3.6B parameters per token, so the models run on a single H100 or a 16 GB GPU
- Native 128K context for long chain-of-thought and RAG
- Open weights, Apache 2.0 licensed, safety checked
The published evals look impressive.

W&B Inference powered by CoreWeave
Skip the hassle of spinning up another model host or managing deployments yourself. With your Weights & Biases account, you get instant access to the new GPT OSS models (plus other top open-source foundation models), fully hosted on CoreWeave's powerful infrastructure.
And the best part? Our pricing is among the best in the industry, so you can evaluate and build with these models without burning your budget.
Just sign in to Weights & Biases, pick an OpenAI model from the menu, and start running inference for free in seconds.
Want the easiest way to try them? Jump into the W&B Weave Playground: no endpoints, access keys, or setup required. Just run and go. Plus, you get all our tools to compare different models side by side.

If you want to run the new models from your code, just head to the model card, copy the starter code we've provided, paste it into your project, and you're ready to go.
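Because W&B Inference exposes an OpenAI-compatible API, the standard OpenAI Python client works with just a different base URL. The sketch below assumes the `https://api.inference.wandb.ai/v1` endpoint and `openai/gpt-oss-*` model IDs from the W&B Inference docs; check the model card for the exact values for your account.

```python
# Minimal sketch of calling a GPT OSS model on W&B Inference
# via the OpenAI-compatible API (pip install openai).
import os


def build_chat_request(question: str, model: str = "openai/gpt-oss-120b") -> dict:
    """Assemble the payload for an OpenAI-style chat completion call."""
    return {
        "model": model,  # or "openai/gpt-oss-20b"
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": question},
        ],
    }


# Only hit the hosted endpoint when credentials are available.
if os.environ.get("WANDB_API_KEY"):
    from openai import OpenAI

    client = OpenAI(
        base_url="https://api.inference.wandb.ai/v1",  # W&B Inference endpoint (see docs)
        api_key=os.environ["WANDB_API_KEY"],
        project="my-team/my-project",  # hypothetical team/project; use your own
    )
    resp = client.chat.completions.create(
        **build_chat_request("Explain mixture-of-experts in one sentence.")
    )
    print(resp.choices[0].message.content)
```

The same pattern works from any language with an OpenAI-compatible client; only the base URL and model ID change.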

Evaluating GPT OSS for your application
We’ve built a Google Colab notebook you can use to get started with W&B Inference. Take the code and tweak it for your application, then see how the new models perform. As shown in that notebook, you can use W&B Weave to log your eval results and compare them side by side while you try different prompts and model configs. Once you're satisfied with the metrics, switch to the new model to upgrade your application to state-of-the-art performance.
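The workflow above can be sketched in a few lines: decorate your inference call with `weave.op` so every input and output is traced, then run a toy metric over both model sizes. The project name and scoring function here are hypothetical stand-ins, not the Colab notebook's code; the model IDs follow W&B Inference conventions.

```python
# Sketch of comparing GPT OSS models with W&B Weave tracing
# (pip install weave openai). Project name and metric are illustrative.
import os


def score_answer(answer: str, expected_keyword: str) -> dict:
    """Toy metric: did the response mention the keyword we expected?"""
    return {"keyword_hit": expected_keyword.lower() in answer.lower()}


# Only run against the hosted endpoint when credentials are available.
if os.environ.get("WANDB_API_KEY"):
    import weave
    from openai import OpenAI

    weave.init("my-team/gpt-oss-evals")  # hypothetical Weave project

    client = OpenAI(
        base_url="https://api.inference.wandb.ai/v1",
        api_key=os.environ["WANDB_API_KEY"],
    )

    @weave.op()  # traces inputs/outputs of every call in the Weave UI
    def ask(model: str, question: str) -> str:
        resp = client.chat.completions.create(
            model=model, messages=[{"role": "user", "content": question}]
        )
        return resp.choices[0].message.content

    for model in ("openai/gpt-oss-120b", "openai/gpt-oss-20b"):
        answer = ask(model, "What context length do the GPT OSS models support?")
        print(model, score_answer(answer, "128"))
```

Each traced call appears in the Weave UI, so you can line up the two models' answers and scores side by side.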
Give it a try and let us know what you think!
Getting started
If you are just exploring the GPT OSS models, head over to the W&B Weave Playground. Every Weights & Biases plan includes a free tier of W&B Inference, so you can dive straight in without additional upfront costs. To learn more, see the W&B Inference documentation and the W&B Inference pricing page.
Recommended reading
Tutorial: Fine-tuning OpenAI GPT-OSS
Unlock the power of fine-tuning OpenAI's GPT-OSS models with Weights & Biases. Customize LLMs for your tasks, save costs, and boost performance.
W&B Inference docs
W&B Inference provides access to leading open-source foundation models via W&B Weave and an OpenAI-compliant API.