Qwen3.5 35B A3B

Qwen3.5 35B A3B inference overview

Price per 1M tokens: $0.25 (input), $1.25 (output)

Parameters: 3B (active), 35B (total)

Context Window: 262K

Release Date: Feb 2026
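
As a rough illustration of how the per-token prices translate into a per-request cost, here is a minimal sketch. It assumes billing is strictly linear in token counts at the listed rates; the function name `estimate_cost_usd` is hypothetical and not part of any W&B API.

```python
# Listed prices in USD per 1M tokens, taken from the overview.
INPUT_PRICE_PER_M = 0.25
OUTPUT_PRICE_PER_M = 1.25


def estimate_cost_usd(input_tokens, output_tokens):
    """Estimate the cost of one request, assuming purely linear per-token pricing."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Example: 200,000 prompt tokens and 4,000 generated tokens.
print(f"${estimate_cost_usd(200_000, 4_000):.4f}")  # prints $0.0550
```

Actual charges may differ from this estimate, so treat the result as a planning figure only.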

Qwen3.5 35B A3B inference details

Qwen3.5-35B-A3B is a multimodal mixture-of-experts model built for fast, efficient inference across chat, reasoning, coding, agents, and vision-language workloads. With 35B total parameters, 3B active parameters, a 262,144-token context window, and support for 201 languages and dialects, it’s a strong choice for production applications that need long context and broad multimodal capability.

Created by: Alibaba

License: apache-2.0

Model card: Qwen3.5-35B-A3B

				
import openai
import weave

# Weave autopatches OpenAI to log LLM calls to W&B
weave.init("<team>/<project>")

client = openai.OpenAI(
    # The custom base URL points to W&B Inference
    base_url='https://api.inference.wandb.ai/v1',

    # Get your API key from https://wandb.ai/authorize
    # Consider setting it in the environment as OPENAI_API_KEY instead for safety
    api_key="<your-apikey>",

    # Optional: Team and project for usage tracking
    project="<team>/<project>",
)

response = client.chat.completions.create(
    model="Qwen/Qwen3.5-35B-A3B",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Tell me a joke."}
    ],
)

print(response.choices[0].message.content)
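
The comment in the example suggests keeping the API key in the environment rather than in source code. One way this could look is sketched below. The helper `client_kwargs` is a hypothetical name introduced for illustration; it only assembles the arguments that the example passes to `openai.OpenAI`, reading the key from `OPENAI_API_KEY`.

```python
import os

# Base URL copied from the example; it points to W&B Inference.
WANDB_INFERENCE_URL = "https://api.inference.wandb.ai/v1"


def client_kwargs(project, env=None):
    """Build keyword arguments for openai.OpenAI without hard-coding the key.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    with a plain dict for testing.
    """
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("Set OPENAI_API_KEY to your W&B API key first.")
    return {
        "base_url": WANDB_INFERENCE_URL,
        "api_key": api_key,
        "project": project,  # team and project for usage tracking
    }


# Usage (requires the openai package and network access):
#   client = openai.OpenAI(**client_kwargs("<team>/<project>"))
```

Failing early with a clear error when the variable is missing avoids sending an unauthenticated request.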

Qwen3.5 35B A3B resources

Guide: Running Qwen3 Coder on W&B Inference

Course: AI engineering course: Agents
