Meta AI Releases OPT-175B, a Set of Free-to-Use Pretrained Language Models
Meta AI announced in a blog post today that it has released a new set of language models under the name "Open Pretrained Transformer". These models aim to replicate GPT-3 while being freely available for local use and training.
Created on May 3 | Last edited on May 3
In a continuing commitment to open AI collaboration and discovery initiatives, Meta AI announced the release of a set of pretrained language models in a blog post today. The models are released under the name "Open Pretrained Transformer", with OPT-175B, at 175 billion parameters, the highlight of the set (sizes down to 125 million parameters are available as well).
The OPT line of models provides a freely usable set of language models comparable to currently existing models, letting machine learning engineers bypass the costly initial training phase.
A paper released alongside the announcement goes into specific detail about the creation and intention of these models. It's available for viewing here: https://arxiv.org/abs/2205.01068

The announcement itself, titled "Democratizing access to large-scale language models with OPT-175B", can be found on the Meta AI blog.
Open Pretrained Transformer models - truly open language processing
Many model sizes of OPT have been created: sizes from 125 million to 30 billion parameters (and soon, a 66 billion parameter version) are freely downloadable for local use, while access to the 175 billion parameter version must be requested manually for safety reasons. The models are all available through the GitHub page here: https://github.com/facebookresearch/metaseq/tree/main/projects/OPT
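To give a rough sense of what local use of the smaller checkpoints can look like, here is a minimal sketch in Python. It assumes the weights are mirrored on the Hugging Face Hub under the facebook/opt-125m identifier and that the transformers library is installed; neither is part of the official announcement, which distributes the checkpoints through the metaseq repository.

```python
# Minimal sketch: loading a small OPT checkpoint and generating text.
# Assumes the weights are available on the Hugging Face Hub as "facebook/opt-125m";
# the official release distributes checkpoints via the metaseq repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "facebook/opt-125m"  # smallest OPT size; larger checkpoints follow the same pattern
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Open source language models are"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a short continuation from the model.
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```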
The models are trained on a number of open-source datasets and cover a full range of standard language model tasks, including natural language generation and dialogue, as well as benchmarks for detecting biases and hate speech.
Training models as large as 175 billion parameters takes a lot of time and energy. The OPT models were trained with energy efficiency in mind: through a few optimizations and the use of current hardware, Meta AI was able to produce these pretrained models at 1/7th the carbon footprint of GPT-3. Because the models are freely available, engineers looking to use large models can skip the energy-costly initial training phase rather than building their own from scratch.
The full codebase used in the production of these models is available, along with detailed information on how they were produced, including logbooks and notes kept throughout development and tutorials on how to train and use the models yourself, all in the GitHub repository here: https://github.com/facebookresearch/metaseq
OPT-175B vs GPT-3, and NLP limitations
OPT-175B is mainly compared against GPT-3 in testing, as OPT-175B's primary goal is to replicate GPT-3 (both are 175 billion parameters at their largest). In evaluations, OPT-175B offers very similar performance on most tasks, save for a few where results were found to be quite erratic.
The benefit that OPT has over GPT-3 is that it's freely available, as mentioned before. GPT-3 unfortunately cannot be run locally even if you have the hardware to handle it. OPT offers an alternative to GPT-3 for engineers who want to take full control of their own AI solutions.
The 175 billion parameter model is only available through special request because language processing models this large come with a slew of safety concerns. From generating toxic language to exhibiting problematic biases and stereotypes, NLP models this large should be kept in the hands of responsible researchers. In addition to the problematic nature of AI language generation, OPT still encounters limitations such as getting stuck in repetitive dialogue loops and producing basic factually incorrect statements.
Thus, the 175 billion parameter model OPT-175B is locked behind manually reviewed access requests.
Find out more
Tags: ML News