Qwen3 30B A3B

Qwen3 30B A3B inference overview

Price per 1M tokens

$0.10 (input)

$0.30 (output)

Parameters

3.3B (active)

30.5B (total)

Context Window

262K

Release Date

Jul 2025
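
The per-token prices listed above translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: `estimate_cost_usd` is a hypothetical helper, not part of any W&B or OpenAI library, and it assumes billing is linear in input and output tokens at the listed rates.

```python
# Prices taken from the overview above, in USD per 1M tokens.
INPUT_USD_PER_M = 0.10
OUTPUT_USD_PER_M = 0.30


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Weight each token count by its price, then scale from "per 1M" to "per token".
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000
```

For example, a request with 2,000 input tokens and 500 output tokens would come to roughly $0.00035 under these assumptions.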

Qwen3 30B A3B inference details

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts LLM from Alibaba's Qwen team, tuned for instruction following in non-thinking mode. Only about 3.3B parameters are active for each token. Compared with the earlier Qwen3-30B-A3B non-thinking release, it offers stronger general capabilities in reasoning, comprehension, coding, and tool use, along with long-context support up to 262K tokens and improved multilingual performance.
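
Because the context window is shared between the prompt and the generated output, it can help to check a request against the limit before sending it. The sketch below is a minimal illustration: `fits_context` and `max_output_budget` are hypothetical helpers, and the value 262,144 is an assumption that reads "262K" as 2^18 tokens.

```python
# Assumption: "262K" is interpreted as 2**18 = 262,144 tokens.
CONTEXT_WINDOW_TOKENS = 262_144


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the prompt plus the requested output fits in the window."""
    return prompt_tokens + max_output_tokens <= window


def max_output_budget(prompt_tokens: int,
                      window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Return how many output tokens remain after the prompt (never negative)."""
    return max(window - prompt_tokens, 0)
```

The token counts themselves would need to come from the model's tokenizer or from the usage figures returned by the API; this sketch does not compute them.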

Created by:

Alibaba

License:

apache-2.0

Model card:

Qwen3-30B-A3B-Instruct-2507

				
import openai
import weave

# Weave autopatches OpenAI to log LLM calls to W&B
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url='https://api.inference.wandb.ai/v1',

    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="<your-apikey>",

    # Optional: Team and project for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-30B-A3B-Instruct-2507",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
)

print(response.choices[0].message.content)
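
The comment in the snippet above suggests keeping the API key in the `OPENAI_API_KEY` environment variable rather than in source code. One way to do that is sketched below. `client_kwargs` is a hypothetical helper introduced here for illustration; it only assembles the constructor arguments, so it can be checked without network access.

```python
import os

# Base URL copied from the snippet above.
BASE_URL = "https://api.inference.wandb.ai/v1"


def client_kwargs(env=None, project=None) -> dict:
    """Build keyword arguments for openai.OpenAI (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError("Set OPENAI_API_KEY to your W&B API key")
    kwargs = {"base_url": BASE_URL, "api_key": api_key}
    if project:
        # Optional "<team>/<project>" string for usage tracking.
        kwargs["project"] = project
    return kwargs


# Usage, assuming the openai package is installed:
# client = openai.OpenAI(**client_kwargs(project="<team>/<project>"))
```

Raising an error when the variable is missing keeps a misconfigured environment from surfacing later as an authentication failure.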

Qwen3 30B A3B resources

Guide

Running Qwen3 Coder on W&B Inference

Course

AI engineering course: Agents
