Allen Institute for AI Releases OLMo
AI2 introduces OLMo, an open-source framework offering unprecedented access to large language models and their training data.
The Allen Institute for AI (AI2) recently announced the release of its first Open Language Model, called OLMo. By giving the AI research community open access to data, training code, models, and evaluation tools, AI2 aims to let researchers explore a wider range of questions, such as how specific data subsets affect model performance and how new training methodologies behave.
The Models
The initial release includes a suite of models: four variants at the 7-billion-parameter (7B) scale, showcasing different architectures, optimizers, and training hardware, plus one model at the 1-billion-parameter (1B) scale. All models were trained on at least 2 trillion tokens. This release marks the start of a planned series, with larger models and more variants to follow.
Key highlights of the OLMo release include:
- Access to the full training data, including tools for generating and analyzing this data.
- Comprehensive model weights, training and inference code, logs, and metrics to ensure transparency and reproducibility.
- Over 500 checkpoints for each model, available through Hugging Face, allowing researchers to explore the training progression (see the loading sketch after this list).
- Evaluation and fine-tuning code to assist in model testing and adaptation.
- All resources are released under the Apache 2.0 License, ensuring they are free to use and modify.
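As an illustration of how those per-step checkpoints can be used, the sketch below loads an intermediate checkpoint through the standard Hugging Face `revision` argument. The repo ID and branch name here are assumptions for illustration; consult the model card for the exact identifiers.

```python
# Minimal sketch: load an intermediate OLMo training checkpoint from
# Hugging Face. REPO_ID and REVISION are assumed names for illustration;
# the actual branch names are listed on the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

REPO_ID = "allenai/OLMo-7B"      # assumed repository ID
REVISION = "step1000-tokens4B"   # assumed checkpoint branch name

tokenizer = AutoTokenizer.from_pretrained(REPO_ID, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    REPO_ID,
    revision=REVISION,        # each training checkpoint lives on its own branch
    trust_remote_code=True,   # the release shipped custom modeling code
)
```

Because every checkpoint is exposed the same way, the same few lines can be looped over different revisions to probe how a capability emerges over the course of training.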
Modern Performance
OLMo's performance has been evaluated against other prominent models in the field. The 7B variant, for instance, shows competitive or superior performance on a variety of tasks, especially generative tasks and reading comprehension.

The architecture and training process of OLMo incorporate recent advancements, such as the SwiGLU activation function, rotary positional embeddings, and a modified tokenizer designed to reduce the risk of exposing personally identifiable information.
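To make the SwiGLU mention concrete, here is a minimal PyTorch sketch of a SwiGLU feed-forward block. This illustrates the general technique, not OLMo's actual implementation; the class name and layer sizes are placeholders.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwiGLU(nn.Module):
    """Illustrative SwiGLU feed-forward block (Shazeer, 2020).

    A minimal sketch of the activation mentioned in the release,
    not OLMo's actual code. `d_model` and `d_hidden` are placeholders.
    """
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_hidden, bias=False)  # gating branch
        self.w_up = nn.Linear(d_model, d_hidden, bias=False)    # value branch
        self.w_down = nn.Linear(d_hidden, d_model, bias=False)  # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SiLU (a.k.a. Swish) gates the value branch elementwise.
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))
```

The gated formulation replaces the single linear-plus-activation layer of a classic transformer MLP with two parallel projections, which has empirically improved language-model quality at similar parameter counts.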
Open Data
Reproducibility is a cornerstone of scientific research, ensuring that results are reliable and valid across different contexts. Open access to data and models facilitates reproducibility by providing all necessary components for others to replicate studies. This openness helps to validate findings, refine theories, and build a more robust understanding of AI systems' behavior and capabilities.
Available Now
For researchers and developers eager to explore OLMo, the models are readily available for use and integration into existing projects through Hugging Face, with detailed instructions provided for installation and usage. This initiative also hints at future developments, including instruction-tuned variants and further enhancements to the OLMo framework.
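As a starting point, here is a minimal sketch of loading the released 7B model and generating text with the Hugging Face transformers library. The repo ID is assumed from the release, and at launch loading may also require AI2's companion package, so check the model card for the current instructions.

```python
# Minimal sketch: run text generation with an OLMo model via transformers.
# The repo ID is an assumption; see the Hugging Face model card for details.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "allenai/OLMo-7B"  # assumed repository ID

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)

prompt = "Language modeling is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```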