BLOOMZ & MT0: BigScience Releases New Finetuned Large Language Models
BigScience has released a new lineup of fine-tuned models based on their own BLOOM model and Google's mT5 model: BLOOMZ and MT0, respectively. These models are fine-tuned specifically to follow human instructions given in natural language.
In July of this year, BigScience released the long-anticipated BLOOM model to the public, which still stands as one of the largest free-use large language models to date. Today, they announced the release of two new families of language models: BLOOMZ and MT0.
Creating BLOOMZ and MT0
BLOOMZ is a fine-tuned version of BigScience's own BLOOM model, while MT0 is a fine-tuned version of Google's mT5 model. Both were fine-tuned using BigScience's xP3 family of datasets, producing a variety of new models capable of following human instructions.
A few variations of the dataset were used to fine-tune the models, each with specific goals in mind. Here are the main three:
- xP3: The base version of the dataset, including 13 training tasks across 46 languages with English prompts.
- xP3mt: This version includes prompts in 19 additional languages, all machine-translated from the English prompts.
- P3: The original, English-only prompt dataset from the T0 project, used to produce English-only baseline models.
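If you want to see what this training data actually looks like, here's a minimal sketch that streams a few examples with the datasets library. It assumes the Hub id bigscience/xP3 with per-language configurations (e.g. "en") and inputs/targets fields; check the dataset card for the exact layout.

```python
# Peek at a few xP3 training examples without downloading the full dataset.
# Assumes the Hub id "bigscience/xP3", a per-language config ("en"), and
# "inputs"/"targets" fields -- verify these against the dataset card.
from datasets import load_dataset

xp3_en = load_dataset("bigscience/xP3", "en", split="train", streaming=True)

for example in xp3_en.take(3):
    print("Prompt:", example["inputs"])
    print("Target:", example["targets"])
```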
The main focus was the base xP3 dataset, and you can see the full range of model sizes fine-tuned with it on the BLOOMZ and MT0 Hugging Face pages, ranging from 560M to 176B parameters for BLOOMZ and 300M to 13B parameters for MT0. For the other two datasets, only the 7.1B and 176B parameter BLOOMZ models are available, and for MT0 only its largest 13B parameter model.
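As a rough sketch of how you'd load one of these checkpoints with the transformers library (using the smallest BLOOMZ model here; any checkpoint name from the model pages should work the same way):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# BLOOMZ checkpoints are decoder-only causal LMs; for MT0 checkpoints
# (e.g. "bigscience/mt0-small"), use AutoModelForSeq2SeqLM instead.
checkpoint = "bigscience/bloomz-560m"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Prompts are just plain natural-language instructions.
inputs = tokenizer("Translate to English: Je t'aime.", return_tensors="pt")
outputs = model.generate(inputs["input_ids"], max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```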
What can BLOOMZ and MT0 do?
These models were fine-tuned to respond to natural language prompts, which covers just about any question or instruction you can think of.
Some examples include:
- asking the model to translate a phrase from one language to another
- explaining a concept in another language
- providing a passage and asking a question about it
- solving a math problem where the prompt is given in four different languages and the answer must be produced in a fifth
While the majority of the models were fine-tuned using the xP3 dataset, which includes only English prompts, the models can still be prompted in any other supported language.
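For instance, here's a quick sketch of prompting one of the smaller MT0 checkpoints in Spanish, even though its fine-tuning prompts were English:

```python
from transformers import pipeline

# MT0 models are encoder-decoder, so they use the text2text-generation pipeline.
generator = pipeline("text2text-generation", model="bigscience/mt0-small")

# A Spanish instruction, despite the English-only xP3 fine-tuning prompts.
print(generator("Traduce al inglés: La vida es bella.", max_new_tokens=20))
```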
You can play with the models yourself using the hosted inference API on any of the BLOOMZ or MT0 model family Hugging Face pages.
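You can also call the same hosted models programmatically. Below is a minimal sketch against the standard Hugging Face Inference API endpoint; it assumes you have an API token (the value below is just a placeholder):

```python
import requests

# Standard Hugging Face Inference API endpoint for a hosted model.
API_URL = "https://api-inference.huggingface.co/models/bigscience/bloomz"
headers = {"Authorization": "Bearer <your-hf-token>"}  # placeholder token

payload = {"inputs": "Explain in one sentence what a neural network is."}
response = requests.post(API_URL, headers=headers, json=payload)
print(response.json())
```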
Find out more
Explore the full model lineup at any of the BLOOMZ or MT0 model family Hugging Face pages. Learn more about the xP3 dataset here.