Cerebras launches Cerebras Code with Qwen3-Coder
Created on August 4|Last edited on August 4
Comment
Cerebras has introduced two new pricing tiers aimed at developers who want fast, high-quality AI code generation without being locked into a specific development environment. Cerebras Code Pro, priced at $50 per month, and Code Max at $200 per month, both give users access to Qwen3-Coder, a state-of-the-art open-weight coding model. The core promise is speed and accessibility: users can now generate code at 2,000 tokens per second with a massive 131,000-token context window. Both plans eliminate common constraints like proprietary IDE restrictions and weekly message caps.
Qwen3-Coder as the Engine
The power behind these plans is Qwen3-Coder, developed by Alibaba and positioned as one of the strongest coding-specific large language models available. With 480 billion parameters, Qwen3-Coder competes with leading frontier models like Claude Sonnet 4 and GPT-4.1, especially in agentic and programmatic tasks. Benchmarks such as Agentic Coding, Agentic Browser-Use, and BFCL show Qwen3-Coder performing at or near the top, making it a reliable backend for demanding workflows.
Speed as the Game Changer
Traditional LLM coding experiences are often hindered by latency, especially in agentic workflows that require multiple model calls in sequence. Cerebras claims that its 2,000 tokens-per-second performance helps eliminate this bottleneck. With code generation occurring nearly instantly, the system allows developers to stay in flow, even during complex, multi-step processes involving tool use, retries, or planning.
No IDE Lock-In
A key selling point for Cerebras Code is its open architecture. Any editor or tool that supports OpenAI-compatible endpoints can plug directly into Cerebras Code. That includes popular tools like Cursor, Continue.dev, Cline, and RooCode. This flexibility means developers can integrate high-speed AI code generation directly into their existing workflows with minimal friction.
Availability and Use Cases
Both plans are live and ready for use. Cerebras Code Pro targets indie developers, weekend coders, and simple agent workflows with a daily limit of 1,000 messages. Code Max offers a higher ceiling—5,000 messages per day—for full-time developers, complex integrations, and heavy refactoring tasks. The plans are available without a waitlist, allowing users to sign up immediately and connect their preferred tools.
Conclusion
Cerebras is banking on raw speed and open integration to set its coding product apart. By offering direct access to one of the top-performing open-weight models and removing artificial constraints around usage and environment, the company is making a clear push to become a serious player in the AI developer tooling market.
Add a comment
Tags: ML News
Iterate on AI agents and models faster. Try Weights & Biases today.