Custom Histograms

Usage and examples for wandb.plot.histogram(). Made by Stacey Svetlichnaya using Weights & Biases

Method: wandb.plot.histogram()

Log a custom histogram—sort a list of values into bins by count/frequency of occurrence—natively in a few lines. Let's say I have a list of prediction confidence scores (scores) and want to visualize their distribution:
data = [[s] for s in scores]
table = wandb.Table(data=data, columns=["scores"])
wandb.log({'my_histogram': wandb.plot.histogram(table, "scores", title="Prediction Score Distribution")})
You can use this to log arbitrary histograms. Note that data is a list of lists, intended to support a 2D array of rows and columns.
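As a fuller illustration of the snippet above, here is a minimal end-to-end sketch. The helper names (scores_to_rows, log_score_histogram), the project name "histogram-demo", and the offline mode are assumptions made for this example, not part of the original report. The wandb import sits inside the logging function so that the row-shaping helper can be tried without wandb installed.

```python
def scores_to_rows(scores):
    """Wrap each score in its own list so every row has exactly one column."""
    return [[float(s)] for s in scores]


def log_score_histogram(scores, project="histogram-demo"):
    """Log `scores` as a custom histogram. Requires `pip install wandb`.

    The project name and offline mode are illustrative assumptions.
    """
    import wandb  # imported lazily so scores_to_rows works without wandb

    # mode="offline" stores the run locally instead of syncing it
    run = wandb.init(project=project, mode="offline")
    table = wandb.Table(data=scores_to_rows(scores), columns=["scores"])
    run.log({
        "my_histogram": wandb.plot.histogram(
            table, "scores", title="Prediction Score Distribution"
        )
    })
    run.finish()


# The list-of-lists shape: one inner list (row) per score, one column wide
rows = scores_to_rows([0.91, 0.15, 0.78])
print(rows)  # [[0.91], [0.15], [0.78]]
```

Calling log_score_histogram([0.91, 0.15, 0.78]) would then log the chart. Each inner list is one table row, which is why a flat list of scores has to be wrapped before it is passed to wandb.Table.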
You can hover over the bars to see more information. You can use the "eye" icons to the left of the run names to toggle the display of individual runs on/off. You can also hover over the top right corner of any Custom Chart and click on the "eye" icon to see the full Vega spec which defines the chart.

Basic usage

I fine-tune a CNN to predict 10 classes of living things: plants, birds, insects, etc. I want to see a frequency count of prediction confidence scores and how they vary across classes and model variants. For example, is a model more confident on certain classes (histogram peaks at low and high scores) than on others (flat, even distribution across bins)? How does increasing the amount of training data and the number of epochs change this? I sort all the predictions on the validation data by class and pick a particular class to log (say, bird_scores):
data = [[s] for s in bird_scores]
table = wandb.Table(data=data, columns=["bird_scores"])
wandb.log({'my_histogram': wandb.plot.histogram(table, "bird_scores", title="Bird Confidence Scores")})
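The step of sorting predictions by class is not shown in the report. One way it could look is sketched below, assuming the validation output is available as (class name, confidence) pairs. That data layout and the helper names are hypothetical. The bin_counts helper is only a rough local check of whether a distribution is peaked or flat. It is not how the chart itself bins values.

```python
from collections import defaultdict


def group_scores_by_class(predictions):
    """Map each predicted class name to the list of its confidence scores."""
    grouped = defaultdict(list)
    for class_name, confidence in predictions:
        grouped[class_name].append(confidence)
    return dict(grouped)


def bin_counts(scores, num_bins=5):
    """Count scores in equal-width bins over [0, 1]; 1.0 goes in the last bin."""
    counts = [0] * num_bins
    for s in scores:
        index = min(int(s * num_bins), num_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical validation output: (predicted class, confidence score)
predictions = [("bird", 0.95), ("plant", 0.40), ("bird", 0.10), ("bird", 0.99)]
scores_by_class = group_scores_by_class(predictions)
bird_scores = scores_by_class["bird"]
print(bird_scores)              # [0.95, 0.1, 0.99]
print(bin_counts(bird_scores))  # [1, 0, 0, 0, 2]
```

Counts concentrated in the first and last bins, as in this toy data, correspond to the "peaks at low and high scores" case. Roughly equal counts across bins correspond to the flat case.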
Steps to follow:

Customized usage

There are many ways to customize the histogram using the Vega visualization grammar.
Here are some simple ones:
See a more detailed walkthrough of histogram customization →
See the full definition of wandb.plot.histogram() →

Scale your visualizations in Python

You could also log each class's scores as one column, then modify the plot key to collect metrics for each class in one plot (as opposed to for each model version). Here you can see how the score distribution for each class shifts as the model sees more examples over more epochs.
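A minimal sketch of the one-column-per-class idea follows. It assumes each validation example yields one score per class, so every table row has the same length. The helper names, the project name, and the "<class>_histogram" key scheme are illustrative assumptions. The report does not specify the exact keys it used.

```python
def build_class_table_inputs(class_names, per_example_scores):
    """Return (columns, rows): one column per class, one row per example."""
    for row in per_example_scores:
        if len(row) != len(class_names):
            raise ValueError("each row needs exactly one score per class")
    return list(class_names), [list(row) for row in per_example_scores]


def log_class_histograms(class_names, per_example_scores, project="histogram-demo"):
    """Log one histogram per class column. Requires `pip install wandb`.

    The project name and key naming are assumptions for this sketch.
    """
    import wandb  # imported lazily so the helper above works without wandb

    run = wandb.init(project=project, mode="offline")
    columns, rows = build_class_table_inputs(class_names, per_example_scores)
    table = wandb.Table(data=rows, columns=columns)
    # One key per class: runs that log the same key share a chart
    run.log({
        f"{name}_histogram": wandb.plot.histogram(
            table, name, title=f"{name} confidence scores"
        )
        for name in columns
    })
    run.finish()


# Hypothetical scores: each row is one validation example
columns, rows = build_class_table_inputs(
    ["bird", "plant"], [[0.9, 0.1], [0.2, 0.8]]
)
print(columns)  # ['bird', 'plant']
print(rows)     # [[0.9, 0.1], [0.2, 0.8]]
```

Because the key is derived from the class name rather than the model version, each chart collects one class's scores across every run that logs that key.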