Image Masks for Semantic Segmentation

How to log and explore semantic segmentation masks. Made by Stacey Svetlichnaya using Weights & Biases

When working on semantic segmentation, you can interactively visualize your models' predictions in Weights & Biases. This demo explores just one use case, driving scenes, to give you a feel for the possibilities, and the API for logging image masks is simple. Below, I explain the interaction controls for this tool and include some examples for you to try from four model variants, which you can read about in detail in this report. The same approach is helpful in many domains besides self-driving: medical imaging, satellite data, microscopic slides, and more.

Try it yourself in a Colab notebook →

Interactive, stateless, precise visualization

With this tool, I can interact with all of the predictions and the ground truth as separate layers in my browser, without needing to track, save, or restore a bunch of different views of the same image. I can understand a model's behavior on different classes, relative to the ground truth, much faster and more precisely. Finally, I can share these insights much more easily with others by saving my view in a report like this one. Below, I toggled the "car" class to see that the model initially predicts that the humans in the foreground are cars but, by the end of training, correctly identifies them as "person".

Before this tool, I was analyzing results through a single, fixed composite view. It was hard to remember which colors corresponded to which class, hard to distinguish small details, and easy to confuse predictions when the colors of the label mask and the ground truth image perceptually combined into similar hues:

[Image: original fast.ai composite view]

I separated these out into side-by-side masks, which helped somewhat but still required a lot of cognitive overhead to visually diff the images. Plus, when I discovered something in this view, it was hard to save that exact visual to share it with someone else.

Interaction controls

If you click on the Settings icon in the top left corner of a media panel, you will see this pop-up menu for interacting with the images:

[Image: menu example]

Examples to try

More detailed API walkthrough

Example: Semantic segmentation for self-driving cars

I train a U-Net to identify 20 different categories relevant to driving scenes: car, road, person, bus, etc. The training data is from the Berkeley Deep Drive 100K dataset, and you can read more details in this report. After each training epoch, I test the latest version of the model on a subset of images from the validation set and log the results as follows:

wandb.log(
  {"my_image_key" : wandb.Image(original_image, masks={
    "predictions" : {
        "mask_data" : prediction_mask,
        "class_labels" : class_labels
    },
    "ground_truth" : {
        "mask_data" : ground_truth_mask,
        "class_labels" : class_labels
    }
  })})
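
As a rough sketch of how the inputs to this call might be prepared: each mask is expected to be a 2D array of integer class ids with the same height and width as the image, and class_labels maps each id to a readable name. The helper names make_class_labels, build_masks, and log_segmentation below are hypothetical, and the four class names are an illustrative subset whose integer ids are assumptions, not the ids actually used in this project.

```python
def make_class_labels(class_names):
    """Map integer class ids (0, 1, 2, ...) to human-readable names."""
    return dict(enumerate(class_names))


def build_masks(prediction_mask, ground_truth_mask, class_labels):
    """Assemble the `masks` argument: one named layer per mask."""
    return {
        "predictions": {
            "mask_data": prediction_mask,
            "class_labels": class_labels,
        },
        "ground_truth": {
            "mask_data": ground_truth_mask,
            "class_labels": class_labels,
        },
    }


def log_segmentation(original_image, prediction_mask, ground_truth_mask,
                     class_labels, key="my_image_key"):
    """Log one image with both mask layers; needs an active run (wandb.init)."""
    import wandb  # imported here so the helpers above work without wandb installed

    masks = build_masks(prediction_mask, ground_truth_mask, class_labels)
    wandb.log({key: wandb.Image(original_image, masks=masks)})


# Illustrative subset of the 20 driving classes; the ids are assumed.
class_labels = make_class_labels(["road", "car", "person", "bus"])
```

Inside a training loop, log_segmentation(image, prediction, truth, class_labels) could be called once per validation image after each epoch, once wandb.init() has started a run. Keeping the predictions and the ground truth under separate keys is what lets them appear as separate, toggleable layers in the browser.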