
Mistral AI Introduces Mistral Small 3: A High-Performance 24B Open-Source Model

Mistral AI has unveiled Mistral Small 3, a 24-billion-parameter model designed for efficiency and speed. Released under the Apache 2.0 license, it performs competitively against much larger models such as Llama 3.3 70B and Qwen 32B, delivering strong instruction following at more than three times the speed of comparable models.

Optimized for Low-Latency Performance

Mistral Small 3 is engineered to handle the majority of generative AI tasks while keeping response times significantly lower. Its streamlined architecture, with fewer layers than many competing models, allows faster inference without sacrificing accuracy. At 150 tokens per second and over 81% accuracy on the MMLU benchmark, it stands as one of the most efficient models in its category.
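
As a rough illustration of how such a throughput figure might be reproduced, the sketch below times generation with the Hugging Face transformers library. The model ID, prompt, and generation settings are illustrative assumptions, and real numbers will vary with hardware, batch size, and quantization.

```python
# Rough throughput measurement for a causal LM (a sketch, not Mistral's
# benchmark setup). Assumes the instruct checkpoint is published on
# Hugging Face as "mistralai/Mistral-Small-24B-Instruct-2501" and that a
# GPU with sufficient memory is available (device_map="auto" requires the
# accelerate package).
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Small-24B-Instruct-2501"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Explain the difference between latency and throughput."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.perf_counter()
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
elapsed = time.perf_counter() - start

# Count only newly generated tokens, not the prompt.
new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens} tokens in {elapsed:.2f}s -> {new_tokens / elapsed:.1f} tok/s")
```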

Comparison with Other Models

Benchmarking results indicate that Mistral Small 3 is on par with Llama 3.3 70B instruct while requiring a fraction of the compute, which makes it a compelling alternative to proprietary models such as GPT-4o mini. Unlike some competitors, Mistral Small 3 has not undergone reinforcement learning (RL) or training on synthetic data, leaving it an early-stage foundation model that can be fine-tuned for advanced reasoning tasks.
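
Because the released weights are a pre-RL base, one natural next step is parameter-efficient supervised fine-tuning. The sketch below uses the peft library's LoRA adapters; the checkpoint ID, rank, and target modules are illustrative assumptions rather than Mistral's recipe.

```python
# Hypothetical LoRA fine-tuning setup with peft + transformers. The base
# checkpoint ID and all hyperparameters below are assumptions for
# illustration; pair this with your own dataset and a Trainer/TRL loop.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model_id = "mistralai/Mistral-Small-24B-Base-2501"  # assumed base-model ID
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Inject low-rank adapters into the attention projections; only the
# adapter weights are trained, which keeps memory requirements modest.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of all weights
```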

Human Evaluations and Benchmarks

Mistral AI conducted extensive evaluations with third-party reviewers who compared Mistral Small 3's responses against those of other models in anonymized, side-by-side tests. The model demonstrated strong performance across coding, mathematical reasoning, general knowledge, and instruction-following benchmarks. These assessments support its ability to compete with significantly larger models while offering faster response times.

Use Cases and Industry Adoption

Mistral Small 3 is well suited for applications requiring rapid response times and high accuracy. Potential use cases include conversational AI, low-latency function calling, and domain-specific fine-tuning. Organizations in industries such as finance, healthcare, and manufacturing are already exploring its capabilities for tasks like fraud detection, automated triaging, and command-and-control systems. The model's efficiency also makes it a good fit for local inference: once quantized, it can run on hardware such as a single RTX 4090 or a MacBook with 32 GB of RAM.
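
For local experimentation, Ollama (one of the distribution channels covered in the next section) offers a particularly simple path. A minimal sketch using its Python client follows; the exact model tag is an assumption, so check the Ollama model library for the published name.

```python
# Minimal local-inference sketch via the ollama Python client. Assumes the
# Ollama server is running and the model has been pulled first, e.g.:
#   ollama pull mistral-small
# (the exact tag may differ; check the Ollama model library).
import ollama

response = ollama.chat(
    model="mistral-small",
    messages=[{"role": "user", "content": "Summarize Mistral Small 3 in two sentences."}],
)
print(response["message"]["content"])
```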

Integration with AI Platforms

Developers can access Mistral Small 3 through multiple platforms, including Hugging Face, Ollama, Kaggle, Together AI, and Fireworks AI. The model will also soon be available on NVIDIA NIM, Amazon SageMaker, Groq, Databricks, and Snowflake, expanding its accessibility for both cloud-based and on-premises deployments.
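
Several of these hosts (Together AI and Fireworks AI among them) expose OpenAI-compatible endpoints, so existing client code often works with little more than a base-URL change. The sketch below assumes such an endpoint and an illustrative model slug; consult each provider's documentation for the exact values.

```python
# Calling a hosted deployment through an OpenAI-compatible API. The base
# URL and model slug below are assumptions; substitute your provider's values.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.together.xyz/v1",  # provider-specific endpoint
    api_key="YOUR_API_KEY",
)
completion = client.chat.completions.create(
    model="mistralai/Mistral-Small-24B-Instruct-2501",  # assumed slug
    messages=[{"role": "user", "content": "Name one good use case for a fast 24B model."}],
)
print(completion.choices[0].message.content)
```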

Future Roadmap and Open-Source Commitment

Mistral AI continues to expand its lineup of open-source models, reaffirming its commitment to Apache 2.0 licensing. The company plans to enhance both small and large models with improved reasoning capabilities in the near future. As the AI community continues to innovate, Mistral Small 3 serves as a robust foundation for further advancements in efficient and transparent AI development.