DeepSeek V3.1
Modality: Text

Model overview

Price: $0.55 (input) / $1.65 (output)
Parameters: 37B (active) / 671B (total)
Context window: 128K
Release date: Aug 2025
DeepSeek-V3.1 is a large hybrid reasoning model (671B total parameters, 37B active) that supports both thinking and non-thinking modes, selected via its prompt template. It extends the DeepSeek-V3 base with a two-phase long-context training process that raises the context window to 128K tokens, and it uses FP8 microscaling for efficient inference. The model improves code generation and reasoning efficiency, achieving performance comparable to DeepSeek-R1-0528 on difficult benchmarks while responding more quickly. Succeeding DeepSeek V3-0324, it is well suited to research, coding, and agentic workflows.
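Because the thinking and non-thinking modes are chosen per request, switching between them is a client-side decision. Below is a minimal sketch assuming an OpenAI-compatible endpoint that exposes the two modes as separate model IDs; the base URL and model names are illustrative placeholders, not confirmed by this page.

```python
# Sketch: calling DeepSeek-V3.1 in thinking vs. non-thinking mode.
# Assumes an OpenAI-compatible endpoint; the base URL and model IDs
# below are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",  # hypothetical endpoint
    api_key="YOUR_API_KEY",
)

def ask(prompt: str, thinking: bool) -> str:
    # Hypothetical convention: the provider exposes the two modes as
    # separate model IDs (e.g. a "-thinking" suffix for reasoning mode).
    model_id = "deepseek-v3.1-thinking" if thinking else "deepseek-v3.1"
    response = client.chat.completions.create(
        model=model_id,
        messages=[{"role": "user", "content": prompt}],
        max_tokens=1024,
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    # Non-thinking mode: faster, direct answers.
    print(ask("Summarize FP8 microscaling in one sentence.", thinking=False))
    # Thinking mode: slower, stronger on difficult reasoning tasks.
    print(ask("Prove that the sum of two even integers is even.", thinking=True))
```

In practice the mode is often worth choosing per task: non-thinking for latency-sensitive chat and simple coding edits, thinking for harder reasoning and agentic workflows.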