DeepSeek models on W&B Inference
DeepSeek V3-0324 inference overview
Price per 1M tokens: $1.14 (input) / $2.75 (output)
Parameters: 37B active / 680B total
Context window: 161K tokens
Release date: Mar 2025
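As a rough illustration of the pricing above, the dollar cost of a request can be estimated from its token counts. This is a hypothetical helper, not part of the W&B or OpenAI SDKs:

```python
# Estimate the USD cost of a DeepSeek V3-0324 request on W&B Inference,
# using the per-1M-token prices listed above ($1.14 input, $2.75 output).
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price: float = 1.14, output_price: float = 2.75) -> float:
    """Return the estimated cost in USD for one request."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Example: a 100K-token prompt with a 20K-token completion.
print(round(estimate_cost(100_000, 20_000), 3))  # 0.169
```

The same helper applies to the other models below by passing their listed prices as `input_price` and `output_price`.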
DeepSeek V3-0324 inference details
DeepSeek V3-0324 is a powerful Mixture-of-Experts model designed for demanding language tasks, including comprehensive information extraction, content summarization, document analysis, and handling complex, structured textual data. It excels in balancing depth and accuracy in high-complexity scenarios.
Created by: DeepSeek
License: MIT
🤗 model card: DeepSeek-V3-0324
import openai
import weave

# Weave autopatches the OpenAI client so LLM calls are logged to W&B
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url="https://api.inference.wandb.ai/v1",
    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="",
    # Team and project are required for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3-0324",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
)
print(response.choices[0].message.content)
DeepSeek R1-0528 inference overview
Price per 1M tokens: $1.35 (input) / $5.40 (output)
Parameters: 37B active / 680B total
Context window: 161K tokens
Release date: May 2025
DeepSeek R1-0528 inference details
DeepSeek R1-0528 specializes in tasks requiring detailed reasoning, including mathematics, programming, and logical problem-solving. It’s particularly effective in scenarios like complex coding tasks, planning workflows, and analyzing structured documents with enhanced accuracy and reduced hallucinations.
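R1-0528 is a reasoning model, and OpenAI-compatible deployments of R1 commonly return the intermediate reasoning wrapped in `<think>...</think>` tags inside the message content. Below is a hedged sketch for splitting that trace from the final answer; the tag format is an assumption about the serving convention, not something guaranteed above:

```python
import re

def split_reasoning(content: str) -> tuple[str, str]:
    """Split a response into (reasoning, answer), assuming the reasoning
    is wrapped in <think>...</think> as in many R1 deployments."""
    match = re.search(r"<think>(.*?)</think>", content, flags=re.DOTALL)
    if match is None:
        return "", content.strip()
    reasoning = match.group(1).strip()
    answer = content[match.end():].strip()
    return reasoning, answer

# Example with a mocked response body:
raw = "<think>2 + 2 is 4 because of basic addition.</think>The answer is 4."
reasoning, answer = split_reasoning(raw)
print(answer)  # The answer is 4.
```

If the endpoint returns no tags, the whole content is treated as the answer, so the helper degrades gracefully.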
Created by: DeepSeek
License: MIT
🤗 model card: DeepSeek-R1-0528
import openai
import weave

# Weave autopatches the OpenAI client so LLM calls are logged to W&B
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url="https://api.inference.wandb.ai/v1",
    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="",
    # Team and project are required for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
)
print(response.choices[0].message.content)
DeepSeek V3.1 inference overview
Price per 1M tokens: $0.55 (input) / $1.65 (output)
Parameters: 681B total
Context window: 128K tokens
Release date: Aug 2025
DeepSeek V3.1 inference details
DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context training process, reaching up to 128K tokens, and uses FP8 microscaling for efficient inference. The model improves tool use, code generation, and reasoning efficiency, achieving performance comparable to DeepSeek R1-0528 on difficult benchmarks while responding more quickly. With support for structured tool calling, code agents, and search agents, it suits research, coding, and agentic workflows, and succeeds DeepSeek V3-0324 as the general-purpose model in the series.
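Because the context window tops out at 128K tokens, long documents may warrant a pre-flight size check before sending a request. A minimal sketch using a crude characters-per-token heuristic (the ~4 chars/token ratio is a rule of thumb, not the model's actual tokenizer):

```python
CONTEXT_WINDOW = 128_000  # DeepSeek V3.1 context window, in tokens

def fits_in_context(text: str, reserve_for_output: int = 4_000,
                    chars_per_token: float = 4.0) -> bool:
    """Rough pre-flight check: estimate prompt tokens from character count
    and compare against the context window, reserving room for the completion."""
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens + reserve_for_output <= CONTEXT_WINDOW

print(fits_in_context("hello " * 1000))  # True
```

For accurate counts, use the model's own tokenizer; this heuristic only guards against grossly oversized inputs.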
Created by: DeepSeek
License: MIT
🤗 model card: DeepSeek-V3.1
import openai
import weave

# Weave autopatches the OpenAI client so LLM calls are logged to W&B
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url="https://api.inference.wandb.ai/v1",
    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="",
    # Team and project are required for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-V3.1",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."},
    ],
)
print(response.choices[0].message.content)