
Disentangling Model Predictions

Created on January 23 | Last edited on December 17
We can visualize a model's performance by clustering its predictions or confidence scores. Below, I examine the predictions of toy CNNs fine-tuned to identify 10 classes of living things (birds, mammals, reptiles, etc.; more details in this report). With W&B's newest Embedding Projector panel, it's easy and fascinating to explore patterns in a model's classifications. In the live charts below, you can
  • hover your cursor over any point to see the image at that location
  • click & drag on the chart area to pan
  • scroll up/down to zoom in/out on a particular region
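Before any of these projections, each image needs a numeric representation; here that is the model's per-class confidence vector (the softmax over its logits). A minimal numpy sketch, with made-up logits standing in for real model outputs:

```python
import numpy as np

def softmax(logits, axis=-1):
    # subtract the max for numerical stability before exponentiating
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# hypothetical logits for 4 images over the 10 living-thing classes
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 10))
confidences = softmax(logits)

# each row is a probability distribution over the 10 classes
assert np.allclose(confidences.sum(axis=1), 1.0)
```

One such row per image is the high-dimensional point that the projector then maps down to 2-D.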

One Model

Embeddings shown: top left, PCA; top right, t-SNE; bottom, UMAP
  • PCA shows the least clean separation, though birds (Aves, teal) are most reliably distinct/clustered. Insects + Arachnids show substantial overlap
  • t-SNE quality varies substantially over rounds, with some reliably-separated clusters of the most canonical images
  • UMAP reliably shows confusion across Plants, Insects, and Arachnids, with Animalia as the least distinct cluster. This makes sense: Animalia is technically a higher-level category in the biological taxonomy than several of the other classes and is the easiest to mistake (e.g. in this dataset it contains many sea creatures, which closely resemble mollusks)
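The three projections above can be reproduced outside the panel with a few lines of scikit-learn. A sketch, using random confidence vectors as a stand-in for real model scores:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

rng = np.random.default_rng(42)
# stand-in for per-image confidence vectors: (n_images, n_classes)
scores = rng.random((100, 10))

# linear projection: fast and deterministic, but often the least separated
pca_2d = PCA(n_components=2).fit_transform(scores)

# nonlinear projection: emphasizes local neighborhoods; output varies by seed,
# matching the round-to-round variation noted above
tsne_2d = TSNE(n_components=2, perplexity=30, init="pca",
               random_state=42).fit_transform(scores)

# UMAP follows the same fit_transform pattern via the umap-learn package:
#   umap.UMAP(n_components=2).fit_transform(scores)
```

Each result is an (n_images, 2) array of coordinates, one point per image in the scatter plots.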



Comparing Two Models?

First row: baseline model (Inception-v3, fine-tuned for just one epoch)
Second row: better model (double FC layer, fine-tuned for 5 epochs)


Baseline


Better model?


Better model


Interesting examples

Plants, insects, and spiders often confused

Scenes containing all three look very similar, and some samples legitimately contain multiple representatives (here, plants and an insect that might be hard to spot)



Unclear what is being photographed