Inference

Instantly access top open-source LLMs on powerful AI infrastructure

W&B Inference, powered by CoreWeave, provides API and playground access to leading open-source LLMs, including Llama 4, DeepSeek, and Phi. Weights & Biases users can develop AI applications and agents without signing up for a separate hosting provider or hosting models themselves.

import openai

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url="https://api.inference.wandb.ai/v1",

    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the OPENAI_API_KEY environment variable
    # instead of hard-coding it here
    api_key="<your-api-key>",

    # Team and project are required for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
)

print(response.choices[0].message.content)

Quickly explore and switch between new models

New models with better performance and pricing pop up all the time, but each new model means another provider, another account, and another API key to deal with.

W&B Inference, powered by CoreWeave, hosts popular open-source models on powerful CoreWeave infrastructure that you can readily access with your existing Weights & Biases account via the SDK or the UI. Test and switch between models quickly without signing up for additional API keys or hosting models yourself.

Access models in playground with zero configuration

Explore open-source models instantly in the playground. No model endpoints or access keys required.

Skip the hassle of configuring model endpoints and custom providers: your Weights & Biases account gives you instant access to a wide selection of powerful open-source foundation models, fully hosted on our infrastructure. Zero configuration needed.

Easily iterate on AI applications that use open-source models

LLM-powered apps need observability tools, but most open-source model hosting providers don't offer them, forcing developers to juggle disconnected platforms for hosting and observability.

W&B Inference runs directly on CoreWeave infrastructure with observability built in through W&B Weave, so you can evaluate, monitor, and iterate on AI applications and agents with no extra instrumentation, fragmented workflows, or added complexity.
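As a sketch of what built-in observability looks like in practice: a single Weave init call auto-instruments the OpenAI client, so each completion is traced without manual logging. Placeholders are illustrative, and the third-party imports are kept inside the function so the sketch only requires weave and openai when actually run with credentials:

```python
import os

def traced_joke() -> str:
    # Imported here so the sketch only needs weave/openai when executed.
    import weave
    import openai

    # One init call; supported clients (including the OpenAI SDK) are
    # patched so their calls appear as traces in Weave.
    weave.init("<team>/<project>")

    client = openai.OpenAI(
        base_url="https://api.inference.wandb.ai/v1",
        api_key=os.environ["WANDB_API_KEY"],
        project="<team>/<project>",
    )
    response = client.chat.completions.create(
        model="deepseek-ai/DeepSeek-V3-0324",
        messages=[{"role": "user", "content": "Tell me a joke."}],
    )
    return response.choices[0].message.content

if os.environ.get("WANDB_API_KEY"):
    print(traced_joke())  # the call, inputs, and output show up in Weave
```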

Get started for free

Experimentation can quickly get expensive when every new model you test comes with a separate price plan. 

We host the latest models, ready for inference within your existing Weights & Biases subscription, keeping costs low and billing simple: one plan instead of multiple providers to manage.

See our pricing page for more information.

The Weights & Biases end-to-end AI developer platform

Weave

Traces

Debug agents and AI applications

Evaluations

Rigorous evaluations of agentic AI systems

Playground

Explore prompts and models

Agents

Observability tools for agentic systems

Guardrails

Block prompt attacks and harmful outputs

Monitors

Continuously improve in production

Models

Experiments

Track and visualize your ML experiments

Sweeps

Optimize your hyperparameters

Tables

Visualize and explore your ML data

Core

Registry

Publish and share your AI models and datasets

Artifacts

Version and manage your AI pipelines

Reports

Document and share your AI insights

SDK

Log AI experiments and artifacts at scale

Automations

Trigger workflows automatically

Inference 

Explore hosted, open-source LLMs


Get started with Inference