
Qwen unveils a 480B-parameter coding model

Qwen has introduced its most advanced coding-focused AI model yet: Qwen3-Coder-480B-A35B-Instruct
Created on July 23 | Last edited on July 23
Qwen has introduced Qwen3-Coder-480B-A35B-Instruct, its most advanced coding model to date. Built to handle complex software engineering tasks, it uses a Mixture-of-Experts architecture with 480 billion total parameters, of which 35 billion are active at a time. The model natively supports a 256,000-token context length, extendable to 1 million tokens through extrapolation. Qwen3-Coder is positioned to rival models like Claude Sonnet 4 on agentic benchmarks, setting new open-source records in areas like tool use, browser tasks, and agentic programming.

Model Architecture and Agentic Capabilities

The Mixture-of-Experts setup enables Qwen3-Coder to activate only a fraction of its parameters for each task, which boosts efficiency without sacrificing performance. This design makes the model well suited to agentic coding, where it must do more than generate code snippets: it plans, responds to environmental feedback, and operates across multiple steps to solve end-to-end software tasks. Benchmarks like SWE-Bench emphasize long, realistic workflows, and Qwen3-Coder was trained to excel in those multi-turn, tool-rich scenarios.
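The plan-act-observe cycle described above can be sketched as a simple loop. Everything here is an illustrative stand-in (the toy environment, the toy policy, and the "edit then test" task), not Qwen's actual agent interface:

```python
class ToyEnv:
    """Illustrative environment: the task succeeds after an 'edit' followed by a 'test'."""
    def __init__(self):
        self.history = []

    def reset(self):
        self.history = []
        return "failing test: test_reverse"

    def step(self, action):
        self.history.append(action)
        done = self.history[-2:] == ["edit", "test"]
        return ("tests pass" if done else f"after {action}"), done


class ToyModel:
    """Illustrative policy: edit the code, then re-run the tests."""
    def plan(self, observation):
        return "test" if observation.startswith("after edit") else "edit"


def run_agent(model, env, max_steps=10):
    """Generic agent loop: observe, plan an action, act, repeat until done."""
    observation = env.reset()
    for _ in range(max_steps):
        action = model.plan(observation)       # e.g. "edit file", "run tests"
        observation, done = env.step(action)   # environmental feedback
        if done:
            break
    return observation


print(run_agent(ToyModel(), ToyEnv()))  # prints: tests pass
```

The point of the sketch is the shape of the interaction, not the specifics: the model's output at each step depends on what the environment reported in the previous one, which is what distinguishes agentic coding from single-shot snippet generation.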

Pretraining Improvements and Data Scaling

Qwen3-Coder was pretrained on 7.5 trillion tokens with a heavy focus on source code, which made up about 70 percent of the dataset. Even so, it retains strong performance on general reasoning and math tasks. The training data pipeline used a previous model, Qwen2.5-Coder, to filter and rewrite noisy samples, dramatically improving quality. The model is optimized for very long contexts, giving it the ability to understand and generate outputs for full repositories, large documents, or evolving pull requests. That extended context capacity is critical for agentic workflows, where maintaining memory across many turns matters.

Post-Training: Reinforcement Learning and Long-Horizon Planning

Beyond standard supervised fine-tuning, Qwen3-Coder was trained using large-scale reinforcement learning focused on coding. The team embraced the idea that code is a space where outputs are difficult to write but easy to verify through execution. They built automatic pipelines that create diverse coding problems with corresponding test cases. By training with reinforcement learning over these tasks, the model gained stronger real-world code execution skills.
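The "easy to verify through execution" idea can be made concrete with a minimal reward function: run a candidate solution against test cases and reward it only if every case passes. The `solve` entry-point convention and the task below are illustrative assumptions, not details of Qwen's pipeline:

```python
def execution_reward(candidate_src: str, tests: list) -> float:
    """Return 1.0 if the candidate source passes every test case, else 0.0."""
    namespace = {}
    try:
        exec(candidate_src, namespace)   # define the candidate function
        fn = namespace["solve"]          # assumed convention: entry point named `solve`
        return float(all(fn(*args) == want for args, want in tests))
    except Exception:
        return 0.0                       # crashes and wrong signatures earn no reward


# Example: a generated solution for "reverse a string"
candidate = "def solve(s):\n    return s[::-1]"
tests = [(("abc",), "cba"), (("",), "")]
print(execution_reward(candidate, tests))  # prints 1.0
```

A binary, execution-based signal like this is what makes code such a natural domain for reinforcement learning: the grader needs no human judgment, so it scales to the automatically generated problem sets the team describes.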
For agentic planning, Qwen3-Coder also underwent long-horizon reinforcement learning, or Agent RL. This training method allows it to interact with dynamic environments, use external tools, respond to new information, and make complex decisions across many steps. To do this at scale, the team developed a system that runs 20,000 parallel environments on Alibaba Cloud, giving the model the interactive feedback needed to learn effective strategies. This infrastructure supported evaluation and iteration across realistic agent workflows and led to top performance on the SWE-Bench Verified leaderboard.

Qwen Code Command Line Interface

To make the model accessible, the team built Qwen Code, a command-line tool adapted from Gemini Code. This interface includes parser enhancements, custom prompts, and function-calling support that let developers experience the full agentic capabilities of Qwen3-Coder. Qwen Code is compatible with Node.js environments and can be installed either via npm or directly from the source. Once installed, developers can launch interactive coding sessions using just a single command.
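The install-and-launch flow looks roughly like this. The package name and launcher command are taken from the Qwen Code project and may change between releases, so treat this as a sketch and check the project README:

```shell
# Assumed install flow for Qwen Code via npm (requires Node.js)
npm install -g @qwen-code/qwen-code

# Start an interactive agentic coding session in the current directory
qwen
```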

Integration with Claude Code and Cline

Qwen3-Coder is also compatible with Claude Code. Users can connect it to the Claude interface by configuring API routes through Alibaba Cloud’s DashScope platform. Support is included for both a Claude proxy and a router customization package called claude-code-router. These tools let developers switch backend models while retaining the familiar Claude development workflow.
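Wiring Claude Code to a different backend is typically done through environment variables. `ANTHROPIC_BASE_URL` and `ANTHROPIC_AUTH_TOKEN` are standard Claude Code settings; the proxy URL below is a placeholder, so substitute the endpoint given in Alibaba Cloud's DashScope documentation:

```shell
# Placeholder proxy endpoint; use the URL from DashScope's docs
export ANTHROPIC_BASE_URL="https://<your-dashscope-claude-proxy>"
export ANTHROPIC_AUTH_TOKEN="$DASHSCOPE_API_KEY"

# Launch Claude Code; requests now route to the proxied Qwen3-Coder backend
claude
```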
The model also integrates with Cline, another coding tool that supports OpenAI-compatible APIs. Developers simply input the API key from DashScope, select OpenAI Compatible as the provider, and set the correct base URL and model name to begin using Qwen3-Coder through the Cline environment. This broad compatibility ensures that developers can try Qwen3-Coder in whichever environment they prefer.

Developer Access and API Use

Qwen3-Coder is accessible through the Alibaba Cloud Model Studio using a standard OpenAI-style API. Developers outside of China can use the international DashScope endpoint, while those inside the mainland have a local option. With just a few lines of Python and environment variables for authentication, developers can query Qwen3-Coder for tasks like webpage creation, code generation, or debugging assistance. This API-first approach makes it easy to integrate the model into existing tools or workflows.
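A minimal sketch of that OpenAI-style call using only the standard library. The international base URL and the model name are assumptions based on DashScope's OpenAI-compatible mode; confirm both in the Model Studio documentation for your region:

```python
import json
import os
import urllib.request

# Assumed international endpoint and model name; verify in Model Studio docs
BASE_URL = "https://dashscope-intl.aliyuncs.com/compatible-mode/v1"
MODEL = "qwen3-coder-plus"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Assemble an OpenAI-style /chat/completions request for Qwen3-Coder."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    key = os.environ.get("DASHSCOPE_API_KEY", "")
    req = build_request("Write a Python function that reverses a string.", key)
    if key:  # only send the request when credentials are configured
        with urllib.request.urlopen(req) as resp:
            print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the endpoint speaks the OpenAI wire format, the official `openai` Python client also works by pointing its `base_url` at the same address, which is what makes drop-in integration with existing tools straightforward.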

Looking Ahead

Qwen’s team is continuing to refine the Qwen3-Coder lineup. Smaller versions of the model are in development, aimed at offering high performance while reducing cost and infrastructure demands. The team is also exploring more advanced capabilities for the coding agent, including whether it can begin improving its own performance over time. As more software development tasks shift toward automation, Qwen3-Coder represents a step closer to intelligent agents that assist with, or even lead, complex engineering projects.