
Unintended Bias in Toxicity Classification


Brainstorm (Remove before publishing)

EDA

Part 1 - Embeddings

Bias benchmark - How choice of embedding affects bias

Plots

  • Confusion matrices using different embeddings on the baseline model - look for counterfactuals
  • Plots to compare performance of different embeddings

Part 2 - Model Training

Bias benchmark - How choice of model affects bias

Plots

  • Confusion matrices for different models on one embedding - look for counterfactuals
  • Explain the model. Vega plots to add: ELI5, LIME, NER, POS, attention

Sweeps - optimize model

The Goal - Find Unintended Bias in Toxic Tweets

The Dataset

Toxicity Subtypes Distribution

Toxic Subtypes and Identity Correlation

Lexical Analysis

Toxicity by Identity Tags (Frequency)

Weighted Analysis of Most Frequently Toxic Tags

Correlation between identities - which identities are mentioned together?

Time Series Analysis of Toxicity

Word Clouds

All Identities

Emoji Usage in Toxic Comments

Word Embeddings

Word embeddings take a text corpus as input and output a vector representation for each word. We use t-SNE to draw a scatter plot of similar words in the embedding space.
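As a rough sketch of this visualization step, assuming word_vectors is a gensim KeyedVectors lookup (for example, loaded GloVe vectors) and words is the list of terms to plot (both names are illustrative):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

# `word_vectors` and `words` are assumed to exist in the notebook:
# a gensim KeyedVectors lookup and the vocabulary terms to visualize.
vectors = np.array([word_vectors[w] for w in words])

# Project the high-dimensional embeddings down to 2D
tsne = TSNE(n_components=2, perplexity=30, random_state=42)
coords = tsne.fit_transform(vectors)

plt.figure(figsize=(10, 8))
plt.scatter(coords[:, 0], coords[:, 1], s=8)
for (x, y), w in zip(coords, words):
    plt.annotate(w, (x, y), fontsize=8)
plt.title("t-SNE projection of word embeddings")
plt.show()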

Effect of Embeddings on Bias

Bias Benchmarks

  • Subgroup AUC: The AUC computed only on examples that mention this identity; a low score here means the model fails to distinguish between toxic and non-toxic comments that mention the identity.

  • BPSN AUC: Background positive, subgroup negative. A low value here means the model confuses non-toxic examples that mention the identity with toxic examples that do not.

  • BNSP AUC: Background negative, subgroup positive. A low value here means that the model confuses toxic examples that mention the identity with non-toxic examples that do not.

The final score used in this competition combines the overall AUC with these per-identity bias metrics; we compute it as well.
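A minimal sketch of these metrics, assuming a DataFrame df with a boolean target column, model scores in a pred column, and one boolean column per identity (column names are illustrative); the generalized power mean with p = -5 and equal weights of 0.25 follow the competition's published formula:

import numpy as np
from sklearn.metrics import roc_auc_score

def subgroup_auc(df, identity, label='target', score='pred'):
    # Only examples that mention the identity
    sub = df[df[identity]]
    return roc_auc_score(sub[label], sub[score])

def bpsn_auc(df, identity, label='target', score='pred'):
    # Non-toxic examples that mention the identity + toxic examples that do not
    mask = (df[identity] & ~df[label]) | (~df[identity] & df[label])
    sub = df[mask]
    return roc_auc_score(sub[label], sub[score])

def bnsp_auc(df, identity, label='target', score='pred'):
    # Toxic examples that mention the identity + non-toxic examples that do not
    mask = (df[identity] & df[label]) | (~df[identity] & ~df[label])
    sub = df[mask]
    return roc_auc_score(sub[label], sub[score])

def power_mean(values, p=-5):
    # Generalized mean; p = -5 penalizes the worst-performing subgroups
    return np.power(np.mean(np.power(values, p)), 1 / p)

def final_metric(df, identity_cols, label='target', score='pred', w=0.25):
    overall = roc_auc_score(df[label], df[score])
    bias_aucs = [
        power_mean([fn(df, c, label, score) for c in identity_cols])
        for fn in (subgroup_auc, bpsn_auc, bnsp_auc)
    ]
    return w * overall + w * sum(bias_aucs)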

No Pretrained Embeddings - Final Metric: 0.90

GloVe - Final Metric: 0.9230

FastText - Final Metric: 0.9228

Concatenated GloVe and FastText - Final Metric: 0.9234
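One way such a concatenation can be implemented is to stack the two pretrained vectors for each word into a single, wider embedding matrix. A rough sketch, assuming glove and fasttext are dicts mapping word to vector and word_index comes from a fitted Keras tokenizer (all names are illustrative):

import numpy as np

glove_dim, fasttext_dim = 300, 300

# Row i holds the GloVe vector followed by the FastText vector for word i;
# words missing from either lookup keep zeros in that half.
embedding_matrix = np.zeros((len(word_index) + 1, glove_dim + fasttext_dim))

for word, i in word_index.items():
    g = glove.get(word)
    f = fasttext.get(word)
    if g is not None:
        embedding_matrix[i, :glove_dim] = g
    if f is not None:
        embedding_matrix[i, glove_dim:] = f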

Model Interpretation - Named Entity Recognition, ELI5

NER

from spacy import displacy

# Highlight the named entities in a comment inside the notebook
displacy.render(nlp(str(sentence)), jupyter=True, style='ent')

TextExplainer

Let's use ELI5 to see how the model makes its predictions.
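A minimal sketch of this, assuming doc is a single comment string and predict_proba wraps the trained model, taking a list of texts and returning class probabilities (both names are illustrative):

from eli5.lime import TextExplainer

# Fit a local, interpretable surrogate model around `doc`
te = TextExplainer(random_state=42)
te.fit(doc, predict_proba)

# Highlight which words pushed the prediction toward each class
te.show_prediction(target_names=['non-toxic', 'toxic'])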