DeepSeek models on W&B Inference

DeepSeek V3-0324 inference overview

Price per 1M tokens

$1.14 (input)
$2.75 (output)

Parameters

37B (active)
680B (total)

Context window

161K

Release date

Mar 2025
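The per-token prices above translate directly into request costs. A quick estimator, with the prices hard-coded from the table (USD per 1M tokens):

```python
# Cost estimator for DeepSeek V3-0324 on W&B Inference,
# using the listed prices per 1M tokens.
INPUT_PRICE_PER_M = 1.14   # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 2.75  # USD per 1M output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of a single request."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a 10,000-token prompt with a 2,000-token completion:
print(round(estimate_cost(10_000, 2_000), 4))  # 0.0169
```

Actual billed token counts are available on the response object (`response.usage`) after each call.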

DeepSeek V3-0324 inference details

DeepSeek V3-0324 is a powerful Mixture-of-Experts model designed for demanding language tasks, including comprehensive information extraction, content summarization, document analysis, and handling complex, structured textual data. It excels in balancing depth and accuracy in high-complexity scenarios.
 
Created by: DeepSeek
License: MIT
🤗 model card: DeepSeek-V3-0324
 
 
import openai
import weave

# Weave autopatches OpenAI to log LLM calls to W&B
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url="https://api.inference.wandb.ai/v1",

    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="<your-apikey>",

    # Team and project are required for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
)

print(response.choices[0].message.content)

DeepSeek R1-0528 inference overview

Price per 1M tokens

$1.35 (input)
$5.40 (output)

Parameters

37B (active)
680B (total)

Context window

161K

Release date

May 2025

DeepSeek R1-0528 inference details

DeepSeek R1-0528 specializes in tasks requiring detailed reasoning, including mathematics, programming, and logical problem-solving. It’s particularly effective in scenarios like complex coding tasks, planning workflows, and analyzing structured documents with enhanced accuracy and reduced hallucinations.
 
Created by: DeepSeek
License: MIT
🤗 model card: DeepSeek-R1-0528 
 
 
import openai
import weave

# Weave autopatches OpenAI to log LLM calls to W&B
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url="https://api.inference.wandb.ai/v1",

    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="<your-apikey>",

    # Team and project are required for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
)

print(response.choices[0].message.content)
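R1-style reasoning models commonly wrap their chain of thought in `<think>...</think>` tags at the start of the completion. That tag convention is an assumption carried over from DeepSeek's R1 releases, so verify it against actual responses from this endpoint; if it holds, a small helper separates the reasoning from the final answer:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a completion into (reasoning, answer).

    Assumes the model wraps its chain of thought in <think>...</think>;
    if no tags are present, the whole text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

reasoning, answer = split_reasoning(
    "<think>The user wants a joke.</think>Why was 6 afraid of 7? Because 7 8 9."
)
print(answer)  # Why was 6 afraid of 7? Because 7 8 9.
```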

DeepSeek V3.1 inference overview

Price per 1M tokens

$0.55 (input)
$1.65 (output)

Parameters

681B

Context window

128K

Release date

Aug 2025

DeepSeek V3.1 inference details

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active per token) that supports both thinking and non-thinking modes via its chat templates. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens of context, and uses FP8 microscaling for efficient inference. The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek R1-0528 on difficult benchmarks while responding more quickly. With support for structured tool calling, code agents, and search agents, it is well suited to research, coding, and agentic workflows, and it succeeds DeepSeek V3-0324 in the V3 line.
 
Created by: DeepSeek
License: MIT
🤗 model card: DeepSeek-V3.1 
 
 
import openai
import weave

# Weave autopatches OpenAI to log LLM calls to W&B
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url="https://api.inference.wandb.ai/v1",

    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="<your-apikey>",

    # Team and project are required for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V31",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
)

print(response.choices[0].message.content)
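Since V3.1 advertises structured tool calling, requests can carry OpenAI-style `tools` definitions. A minimal sketch: the `get_weather` function and its schema are purely illustrative, and tool-call behavior on this endpoint should be confirmed against the W&B Inference documentation:

```python
# OpenAI-style function/tool definition passed via the `tools` parameter.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # illustrative tool name
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }
]

# With the client configured as above, the request would look like:
# response = client.chat.completions.create(
#     model="deepseek-ai/DeepSeek-V3.1",
#     messages=[{"role": "user", "content": "What's the weather in Paris?"}],
#     tools=tools,
# )
# Any tool call the model emits appears in
# response.choices[0].message.tool_calls.
print(tools[0]["function"]["name"])  # get_weather
```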

DeepSeek V3 resources