Meta AI Releases New Open-Source Datasets For Mitigating Bias In NLP
Meta AI has released two new datasets aimed at ridding NLP models of their problematic biases, along with the method used to create them and the diverse list of identity descriptors driving it all. And it's all open-source.
Natural language processing models have a reputation for generating problematic language, which is often the reason larger models are held back from public release. Whether the output is blatantly bigoted, passively judgmental, or simply reinforcing stereotypes, machine learning models lack the cultural understanding and experience a human draws on to maintain tact.
Beyond overtly targeted language, NLP models often fail to account for less common demographic labels, such as more specific racial identities or gender identities beyond male and female. They will also associate certain identities with particular values or concepts simply because the dataset used in training contains those biases.
Today, Meta AI announced two datasets, with pre-trained models coming soon, which it hopes can be used to train NLP models and detect their biases more effectively.
Read the blog post detailing the project here: https://ai.facebook.com/blog/measure-fairness-and-mitigate-ai-bias/
Meta AI's method for bias mitigation
The method Meta AI used to generate its new datasets is called a demographic text perturber. Essentially, the perturber takes an input string, a term within the string that could be changed, and a target demographic. With that information, it generates a new sentence with the chosen term swapped out for one matching the target demographic.
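To make that interface concrete, here's a toy sketch in Python. The function, the rewrite table, and the example sentence are all illustrative assumptions, not the actual ResponsibleNLP API; the real perturber is a trained sequence-to-sequence model, not a lookup table.

```python
# Toy sketch of a demographic text perturber. The real perturber in
# facebookresearch/ResponsibleNLP is a trained seq2seq model; this
# hard-coded table only illustrates the input/output contract.
GENDER_REWRITES = {
    "non-binary": {"He": "They", "his": "their", "loves": "love"},
    "feminine": {"He": "She", "his": "her"},
}

def perturb(sentence: str, word: str, target_demographic: str) -> str:
    """Rewrite `sentence` so that `word` (and any words that must agree
    with it, like pronouns and verbs) matches `target_demographic`.
    This toy version ignores `word` and rewrites every matching token."""
    rewrites = GENDER_REWRITES[target_demographic]
    return " ".join(rewrites.get(tok, tok) for tok in sentence.split())

print(perturb("He loves visiting his grandma.", "He", "non-binary"))
# -> They love visiting their grandma.
```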

In the example above, the masculine subject of the phrase is exchanged for a gender-neutral one by swapping pronouns; the sentence could just as easily be modified to fit a feminine subject. That single sentence becomes three sentences covering a wider range of gender identities.
Next, we could pass each of the three through the perturber again, this time replacing the sentence's object, "grandma", with relatives spanning a wider range of ages, like "granddaughter" or "mom". That brings us up to nine variations of the original sentence, which we can use to build a dataset featuring a wider range of identities.
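Sketching that two-pass expansion with the toy perturb function above (the age swap is a plain substitution here, since "grandma" needs no grammatical agreement; the real perturber would handle that as well):

```python
# Expand one sentence along two demographic axes: gender, then age.
seed = "He loves visiting his grandma."

# Pass 1: three gender variants of the subject.
gender_variants = [seed] + [
    perturb(seed, "He", target) for target in ("feminine", "non-binary")
]

# Pass 2: three age variants of the object, for each gender variant.
all_variants = [
    sentence.replace("grandma", relative)
    for sentence in gender_variants
    for relative in ("grandma", "granddaughter", "mom")
]

print(len(all_variants))  # 9 variations of the original sentence
```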
Where the perturber really comes in handy is in dealing with stereotypes. Applied to the sentence "women like shopping", a classic stereotype, the same process produces "men like shopping" to sit beside it. Datasets of sentences gathered from places like web forums will naturally be filled with stereotypes like this, so a demographic text perturber can automatically extrapolate each stereotype to cover all demographics, neutralizing it in the eyes of a training model.
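In a training pipeline, that amounts to counterfactual data augmentation: each stereotyped sentence is paired with parallel versions for other demographics. A minimal sketch, with an illustrative corpus and target groups:

```python
# Counterfactual augmentation sketch: balance a gendered stereotype
# across demographics before training. Corpus and targets are
# illustrative, not drawn from Meta's released datasets.
corpus = ["Women like shopping."]

augmented = []
for sentence in corpus:
    augmented.append(sentence)
    for group in ("Men", "Nonbinary people"):
        augmented.append(sentence.replace("Women", group))

print(augmented)
# ['Women like shopping.', 'Men like shopping.', 'Nonbinary people like shopping.']
```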
Compiling the demographic terms
To support the demographic text perturber, the researchers combined algorithmic processes with participation and feedback from domain experts to create a descriptor list of nearly 600 specific demographic identities across 13 broader demographic axes, covering many facets of human identity.
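For intuition, a descriptor list like this can be pictured as a mapping from broad demographic axes to specific identity terms. The axes and terms below are illustrative samples, not the released list:

```python
# Illustrative shape of a demographic descriptor list: broad axes
# mapping to specific identity terms. The released list in
# facebookresearch/ResponsibleNLP is far larger (~600 descriptors).
DESCRIPTORS = {
    "gender_and_sex": ["male", "female", "nonbinary", "transgender"],
    "age": ["teenage", "middle-aged", "elderly"],
    "ability": ["deaf", "blind", "neurodivergent"],
}

total = sum(len(terms) for terms in DESCRIPTORS.values())
print(total)  # 10 sample terms here; the real list has ~600
```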

The open-sourced datasets and models
Everything that went into this project has been open-sourced in the hope that it will help NLP models manage bias better and produce more demographically varied datasets. The researchers are also looking for feedback on expanding the descriptor list.
The full codebase for the project is available here: https://github.com/facebookresearch/ResponsibleNLP
The datasets can be downloaded or generated within their respective project folders, and pre-trained models will be released shortly.