
Google Announces Gemini 1.5

Google's New LLM with a 1 million token context window!
Created on February 15 | Last edited on February 15
Google has introduced Gemini 1.5, a new version of its flagship LLM with substantial performance improvements, most notably in long-context understanding. The first 1.5 model being released, Gemini 1.5 Pro, is designed to be more efficient and versatile: it performs comparably to Gemini 1.0 Ultra, the largest model of the previous generation, while requiring significantly less compute.

1 Million Token Context

The headline feature of Gemini 1.5 is long-context understanding: the model can process up to 1 million tokens, a larger context window than any other large-scale foundation model offers today. This promises to enable new applications and make the model more useful for developers and enterprise customers. Gemini 1.5 has been optimized for a variety of tasks across modalities, including text, code, images, audio, and video, and maintains high performance even as the context window grows.
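To make that scale concrete, here is a minimal sketch of checking a prompt against the 1-million-token budget with the google-generativeai Python SDK. The model identifier and file name are illustrative assumptions, not details from the announcement.

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])

# Model name is an assumption for illustration; use whatever identifier
# your preview access actually exposes.
model = genai.GenerativeModel("gemini-1.5-pro-latest")

# Hypothetical long document, e.g. a dump of an entire codebase.
with open("large_codebase_dump.txt") as f:
    corpus = f.read()

# Check the prompt against the 1M-token budget before sending it.
n_tokens = model.count_tokens(corpus).total_tokens
print(f"{n_tokens:,} tokens (limit: 1,000,000)")

if n_tokens <= 1_000_000:
    response = model.generate_content(
        ["Summarize the main modules in this codebase:", corpus]
    )
    print(response.text)
```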


Familiar Architecture

The architecture of Gemini 1.5 utilizes a Mixture-of-Experts (MoE) approach, which divides the model into smaller, specialized neural networks (similar to the rumored GPT-4 architecture). This method allows the model to selectively activate the most relevant pathways for a given input, significantly improving efficiency in training and serving.
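Google has not published Gemini 1.5's internals beyond the MoE framing, so the following is only a toy PyTorch sketch of the general technique: a learned router sends each token through the top-k of several small expert networks, so only a fraction of the parameters run per input. All layer sizes and the top-2 routing here are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy Mixture-of-Experts layer: a router picks the top-k experts per token."""

    def __init__(self, d_model: int, d_hidden: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, num_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model); each token is routed independently
        gate_logits = self.router(x)                           # (tokens, experts)
        weights, indices = gate_logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                   # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e                   # tokens sent to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(4, 64)               # 4 tokens, d_model = 64
layer = MoELayer(d_model=64, d_hidden=256)
print(layer(tokens).shape)                # torch.Size([4, 64])
```

Only 2 of the 8 experts run for each token, which is the source of the efficiency gain the announcement describes: capacity scales with the number of experts while per-token compute stays roughly constant.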
Gemini 1.5 also excels in complex reasoning and problem-solving across vast amounts of information. For example, it can analyze extensive documents, such as large codebases, providing insights and solutions that leverage its deep understanding of the content.

Raising the Bar

In benchmarks, Gemini 1.5 Pro outperforms its predecessors, and it can process and learn from very long prompts without any additional training. This "in-context learning" ability is especially striking: given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, in its prompt, the model learned to translate English to Kalamang at a level similar to a person learning from the same material.
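A hedged sketch of what exercising that long-context in-context learning might look like with the same SDK; the file name, prompt wording, and model identifier are again assumptions for illustration.

```python
import os
import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro-latest")  # assumed identifier

# Hypothetical reference document packed into the prompt: the model has to
# learn the language in context, having seen essentially none of it in training.
grammar = open("kalamang_grammar_manual.txt").read()

prompt = (
    "Below is a grammar manual for a low-resource language.\n\n"
    f"{grammar}\n\n"
    "Using only the material above, translate into that language:\n"
    "'The children are playing by the river.'"
)
print(model.generate_content(prompt).text)
```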
Ethics and safety have been central to the development of Gemini 1.5, with extensive testing conducted to ensure the model adheres to Google's AI Principles and safety standards. The team has conducted novel research and red-teaming to address potential harms and prepare the model for broader use.

Availability

Gemini 1.5 is now available in a limited preview for developers and enterprise customers, offering a glimpse into the future capabilities of AI models. Google plans to introduce pricing tiers for different context window sizes, from the standard 128,000 tokens to the groundbreaking 1 million tokens, as further optimizations are made. This move reflects Google's commitment to responsibly bringing the advancements of Gemini models to a global audience, fostering innovation and utility in AI applications.

Tags: ML News