
Microsoft unveils Phi-3.5

The newest additions to the Phi series of LLMs!
Created on August 21 | Last edited on August 21
Microsoft has expanded its Phi series with three new models, collectively released as Phi-3.5. The release strengthens Microsoft's position against small-model offerings from competitors like Google, Meta, and OpenAI. The new models cover a range of AI applications: general-purpose reasoning, efficient large-scale reasoning via a mixture-of-experts architecture, and multimodal tasks involving image and video analysis.

Phi-3.5 Mini Instruct for Compute-Constrained Environments

The Phi-3.5 Mini Instruct model is a lightweight AI solution optimized for scenarios where computational resources are limited. Despite its smaller size of 3.8 billion parameters, the model is capable of handling complex tasks such as code generation, mathematical problem solving, and logic-based reasoning. It supports a 128k token context length, allowing it to manage extended interactions effectively. The model’s performance in multilingual and multi-turn conversational tasks is competitive, surpassing other similarly-sized models on benchmarks like RepoQA.
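A 128k-token context window still has to be managed by the application. As a minimal sketch (not Microsoft's implementation), a chat client might drop the oldest messages once a conversation exceeds the model's context budget; the word-count tokenizer below is a stand-in for the model's real tokenizer:

```python
def trim_to_context(messages, max_tokens=128_000,
                    count_tokens=lambda m: len(m.split())):
    """Keep the most recent messages that fit the context budget.

    `count_tokens` is a placeholder (word count); a real client would
    use the model's tokenizer. Older messages are dropped first.
    """
    kept, used = [], 0
    for msg in reversed(messages):          # walk newest -> oldest
        cost = count_tokens(msg)
        if used + cost > max_tokens:
            break                           # budget exhausted
        kept.append(msg)
        used += cost
    return list(reversed(kept))             # restore original order
```

For example, with a toy budget of 5 tokens, `trim_to_context(["a b c", "d e", "f g h"], max_tokens=5)` keeps only the two most recent messages.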

Phi-3.5 MoE: Microsoft's First Mixture of Experts Model

The Phi-3.5 MoE (Mixture of Experts) model is a first for the Phi series: rather than running one monolithic network, it routes each token through a small subset of specialized expert sub-networks. With a total of 42 billion parameters, the model activates only 6.6 billion parameters for any given token, keeping inference efficient without sacrificing capability. Phi-3.5 MoE excels in areas such as code, mathematics, and multilingual language understanding, outperforming larger models like GPT-4o mini on benchmarks such as 5-shot MMLU.
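The gap between 42B total and 6.6B active parameters comes from sparse routing: a small gating network scores every expert, but only the top few actually process the token. The sketch below illustrates the idea with top-2 gating over 16 experts (the configuration reported for Phi-3.5 MoE); it is a conceptual toy, not the model's actual routing code:

```python
import math

def top2_route(gate_logits, num_active=2):
    """Pick the top-k experts for a token and renormalize their weights.

    Illustrates sparse MoE routing: only `num_active` experts run,
    so most parameters stay idle on any single forward pass.
    """
    # Numerically stable softmax over all expert logits.
    m = max(gate_logits)
    exps = [math.exp(x - m) for x in gate_logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep the top-k experts and renormalize their probabilities.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:num_active]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}

# 16 experts score the token; only 2 of them receive it.
weights = top2_route([0.1 * i for i in range(16)])
```

The two selected weights sum to 1, so the token's output is a weighted mix of just two expert outputs even though sixteen experts exist.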


Phi-3.5 Vision Instruct: Leading Multimodal Reasoning

The Phi-3.5 Vision Instruct model completes the trio, focusing on tasks that require both text and image processing capabilities. This model is particularly effective for general image understanding, optical character recognition, chart and table comprehension, and video summarization. Like its counterparts, Vision Instruct supports a 128k token context length, enabling it to handle complex, multi-frame visual tasks. The model was trained using a combination of synthetic and publicly available datasets, with a focus on high-quality, reasoning-dense data.
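Multi-frame tasks like video summarization are typically expressed by interleaving image placeholders with the text prompt; Phi-3.5 Vision Instruct uses `<|image_i|>` tags for this. A minimal sketch of assembling such a prompt (the helper name is ours, not from Microsoft's SDK):

```python
def build_video_prompt(num_frames, question):
    """Build a multi-image prompt in the <|image_i|> placeholder style.

    Each sampled video frame is referenced by a numbered placeholder,
    followed by the user's question about the clip.
    """
    placeholders = "".join(f"<|image_{i}|>\n" for i in range(1, num_frames + 1))
    return placeholders + question

prompt = build_video_prompt(4, "Summarize what happens in this clip.")
```

The frames themselves are passed to the processor separately; the placeholders tell the model where each frame sits relative to the text.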

Open-Source Availability Under MIT License

Microsoft has made all three Phi-3.5 models available under the MIT license, demonstrating its commitment to the open-source community. This license allows developers to use, modify, and distribute the models freely, fostering innovation in both commercial and research settings. The open-source nature of these models, combined with their advanced capabilities, positions Microsoft as a leader in the AI space, encouraging widespread adoption and further development in the field.
Tags: ML News