Available models

W&B Inference powered by CoreWeave provides API and playground access to leading open-source LLMs, including OpenAI GPT OSS, Qwen3, Kimi K2, Llama 4, DeepSeek, and Phi. Weights & Biases users can develop AI applications and agents without signing up for a hosting provider or hosting models themselves.

Z.AI GLM 4.5

Text

Sep 2025
$0.55 input / $2.00 output
131K
MoE model with user-controllable thinking/non-thinking modes for strong reasoning, code generation, and agent alignment.

Moonshot AI Kimi K2 Instruct 0905

Text

Sep 2025
$0.60 input / $2.50 output
262K
Latest version of Kimi K2 mixture-of-experts language model, featuring 32B activated parameters and a total of 1T parameters.

DeepSeek V3.1

Text

Aug 2025
$0.55 input / $1.65 output
128K
A large hybrid model that supports both thinking and non-thinking modes via prompt templates.

OpenAI GPT OSS 20B

Text

Aug 2025
$0.05 input / $0.20 output
131K
Lower-latency Mixture-of-Experts model with reasoning capabilities, trained on OpenAI’s Harmony response format.

OpenAI GPT OSS 120B

Text

Aug 2025
$0.15 input / $0.60 output
131K
Efficient Mixture-of-Experts model designed for high-reasoning, agentic and general-purpose use cases.

Qwen3 235B A22B-2507

Text

Jul 2025
$0.10 input / $0.10 output
262K
Efficient multilingual, Mixture-of-Experts, instruction-tuned model, optimized for logical reasoning.

Qwen3 Coder 480B A35B

Text

Jul 2025
$1.00 input / $1.50 output
262K
Mixture-of-Experts model optimized for agentic coding tasks such as function calling, tool use, and long-context reasoning.

Qwen3 235B A22B Thinking-2507

Text

Jul 2025
$0.10 input / $0.10 output
262K
High-performance Mixture-of-Experts model optimized for structured reasoning, math, and long-form generation.

Moonshot AI Kimi K2

Text

Jul 2025
$1.35 input / $4.00 output
128K
Mixture-of-Experts model optimized for complex tool use, reasoning, and code synthesis.

DeepSeek R1-0528

Text

May 2025
$1.35 input / $5.40 output
161K
Optimized for precise reasoning tasks including complex coding, math, and structured document analysis.

OpenPipe Qwen3 14B Instruct

Text

Apr 2025
$0.05 input / $0.22 output
33K
An efficient multilingual, dense, instruction-tuned model, optimized by OpenPipe for building agents with finetuning.

Meta Llama 4 Scout

Text, Vision

Apr 2025
$0.17 input / $0.66 output
64K
Multimodal model integrating text and image understanding, ideal for visual tasks and combined analysis.

DeepSeek V3-0324

Text

Mar 2025
$1.14 input / $2.75 output
161K
Robust Mixture-of-Experts model tailored for high-complexity language processing and comprehensive document analysis.

Microsoft Phi 4 Mini 3.8B

Text

Feb 2025
$0.08 input / $0.35 output
128K
Compact, efficient model ideal for fast responses in resource-constrained environments.

Meta Llama 3.3 70B

Text

Dec 2024
$0.71 input / $0.71 output
128K
Multilingual model excelling in conversational tasks, detailed instruction-following, and coding.

Qwen 2.5 14B Instruct

Text

Sep 2024
$0.06 input / $0.24 output
32768
Dense multilingual instruction-tuned model with tool-use and structured output support.

Meta Llama 3.1 70B

Text

Jul 2024
$0.80 input / $0.80 output
128K
Efficient conversational model optimized for responsive multilingual chatbot interactions.

Meta Llama 3.1 8B

Text

Jul 2024
$0.22 input / $0.22 output
128K
Efficient conversational model optimized for responsive multilingual chatbot interactions.
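
Because input and output tokens are priced separately, the cost of a workload depends on its mix of prompt and completion tokens. The sketch below shows one way to estimate it for a few of the models listed above. It assumes the listed prices are in USD per 1 million tokens, which the listing itself does not state; the `PRICES` table and the `estimate_cost` helper are hypothetical names introduced only for illustration, not part of any W&B API.

```python
from decimal import Decimal

# (input price, output price) in USD, copied from the listing above.
# ASSUMPTION: each price applies per 1 million tokens; the listing does not give the unit.
PRICES = {
    "OpenAI GPT OSS 20B": (Decimal("0.05"), Decimal("0.20")),
    "DeepSeek R1-0528": (Decimal("1.35"), Decimal("5.40")),
    "Meta Llama 3.1 8B": (Decimal("0.22"), Decimal("0.22")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)  # assumed pricing unit


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost of a workload on the given model."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Compare the same workload across models: 2M prompt tokens, 500K completion tokens.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

Decimal arithmetic is used here so that small per-token prices do not pick up floating-point rounding error when summed over large token counts.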