Creating Custom Histograms With Weights & Biases
This article provides usage and examples for wandb.plot.histogram() to enable you to log custom histograms natively in just a few lines.
Created on October 9|Last edited on November 15
Method: wandb.plot.histogram()
Log a custom histogram—sort a list of values into bins by count/frequency of occurrence—natively in a few lines. Let's say I have a list of prediction confidence scores (scores) and want to visualize their distribution:
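import wandb

# scores: a list of prediction confidence scores; assumes wandb.init() was already called
data = [[s] for s in scores]
table = wandb.Table(data=data, columns=["scores"])
wandb.log({"my_histogram": wandb.plot.histogram(table, "scores", title="Prediction Score Distribution")})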
You can use this to log arbitrary histograms. Note that data is a list of lists, intended to support a 2D array of rows and columns.
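To make the expected shape concrete, here is a minimal sketch of how a flat list of scores becomes the list of lists that the table takes. The helper name to_rows and the sample values are made up for illustration.

```python
def to_rows(values):
    """Wrap each value in its own list so every value becomes one table row."""
    return [[v] for v in values]


# Hypothetical confidence scores, for illustration only.
scores = [0.91, 0.15, 0.78]

rows = to_rows(scores)
# Each inner list is one row; each position within a row is one column.
print(rows)  # [[0.91], [0.15], [0.78]]

# With two columns, each row carries one value per column.
other_scores = [0.40, 0.62, 0.05]
two_column_rows = [[a, b] for a, b in zip(scores, other_scores)]
print(two_column_rows)  # [[0.91, 0.4], [0.15, 0.62], [0.78, 0.05]]
```

The one-column form is what the snippet above passes as data; the two-column form needs both lists to have the same length.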
You can hover over the bars to see more information. You can use the "eye" icons to the left of the run names to toggle the display of individual runs on/off. You can also hover over the top right corner of any Custom Chart and click on the "eye" icon to see the full Vega spec which defines the chart.
[Chart: Class = bird confidence scores for toy CNN variants]
Basic Usage
I fine-tune a CNN to predict 10 classes of living things: plants, birds, insects, etc. I want to see a frequency count of prediction confidence scores and how they vary across classes and model variants. For example, is a model more confident on certain classes (histogram peaks at low and high scores) than on others (a flat, even distribution across bins)? How does increasing the amount of training data and the number of epochs change this? I sort all the predictions on the validation data by class and pick a particular class to log (say, bird_scores):
data = [[s] for s in bird_scores]
table = wandb.Table(data=data, columns=["bird_scores"])
wandb.log({'my_histogram': wandb.plot.histogram(table, "bird_scores", title="Bird Confidence Scores")})
Steps to follow:
- create a data object: put each list of values you want to see in a histogram into its own column of the 2D data array. By default, each histogram plot reads a single column (selected by its column name), but you can log multiple columns to reference in other plots or code.
- pass data to a wandb.Table() object, listing all the columns in your data object in order.
- pass the table object and the name of the column you'd like to visualize to wandb.plot.histogram(), with an optional title. This creates your custom plot under the key my_histogram. To visualize multiple runs on the same plot, keep this plot key constant. Note that the table itself is also logged in the "Media" section of your workspace, under my_histogram_table.
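The three steps, plus the advice about keeping the plot key constant across runs, can be sketched as a small helper. This is one possible arrangement, not the only one: the function names, the project name "histogram-demo", and the run names in the usage note are all hypothetical.

```python
PLOT_KEY = "my_histogram"  # kept constant so every run lands on the same plot


def build_rows(values):
    """Step 1: one row per value, with a single column per row."""
    return [[v] for v in values]


def log_score_histogram(run_name, values, column="bird_scores",
                        project="histogram-demo"):
    """Log one run's scores as a custom histogram under PLOT_KEY."""
    import wandb  # imported here so build_rows works without wandb installed

    run = wandb.init(project=project, name=run_name)
    # Step 2: name every column of the data object, in order.
    table = wandb.Table(data=build_rows(values), columns=[column])
    # Step 3: choose the column to plot and give the chart a title.
    run.log({PLOT_KEY: wandb.plot.histogram(table, column,
                                            title="Bird Confidence Scores")})
    run.finish()
```

Calling log_score_histogram("variant_a", scores_a) and then log_score_histogram("variant_b", scores_b) would create two runs that both log under my_histogram, so their histograms can be compared on one chart.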
Customized Usage
Here are some simple customizations you can make by editing the chart's Vega spec:
- rename the axis titles for clarity: add "title": "Your Title" to the x and y fields under encoding
- change the orientation of the bars by swapping x and y
- change the stacking of the bars by setting stack to center or zero (instead of overlapping bars, as in the default)
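As a rough illustration of these three edits, the sketch below applies them to a Python dict that stands in for the encoding block of the spec. The starting encoding is an assumption modeled on a typical Vega-Lite histogram; the fields in your chart's actual spec may differ, and in practice you would make the same changes directly in the JSON spec.

```python
import copy
import json

# Assumed starting point: a Vega-Lite-style histogram encoding.
# "scores" is a placeholder field name, not taken from a real spec.
base_encoding = {
    "x": {"bin": True, "type": "quantitative", "field": "scores"},
    "y": {"aggregate": "count", "type": "quantitative", "stack": None},
}


def customize(encoding, x_title, y_title, stack="zero", horizontal=False):
    """Return a copy of the encoding with the three edits applied."""
    enc = copy.deepcopy(encoding)
    enc["x"]["title"] = x_title   # rename the axis titles
    enc["y"]["title"] = y_title
    enc["y"]["stack"] = stack     # "zero" or "center" instead of overlapping
    if horizontal:                # swap x and y to flip the bar orientation
        enc["x"], enc["y"] = enc["y"], enc["x"]
    return enc


custom = customize(base_encoding, "Confidence score", "Number of predictions",
                   stack="center", horizontal=True)
# json.dumps renders True/None as the JSON true/null used in the spec.
print(json.dumps(custom, indent=2))
```

After the swap, the binned scores sit on the y axis and the counts on the x axis, each keeping the title assigned before the swap.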
[Chart: Class = bird confidence scores for toy CNN variants]
Scale Your Visualizations in Python
You could also log each class's scores as one column, then modify the plot key to collect metrics for each class in one plot (as opposed to for each model version). Here you can see how the score distribution for each class shifts as the model sees more examples over more epochs.
[Chart: Toy CNN variants]
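One way to set this up is sketched below: each class becomes a column of a single table, and each class gets its own plot key so that runs from different model versions overlay on that class's chart. The helper names and the "<class>_histogram" key pattern are hypothetical, and the sketch assumes every class has the same number of scores.

```python
def class_rows(scores_by_class):
    """Turn {class_name: [scores]} into (column_names, rows)."""
    columns = list(scores_by_class)
    lengths = {len(v) for v in scores_by_class.values()}
    if len(lengths) > 1:
        raise ValueError("every class needs the same number of scores")
    # Transpose: row i holds the i-th score of every class.
    rows = [list(row) for row in zip(*scores_by_class.values())]
    return columns, rows


def log_class_histograms(scores_by_class):
    """Log one histogram per class, each under a class-specific key."""
    import wandb  # imported lazily; assumes wandb.init() was already called

    columns, rows = class_rows(scores_by_class)
    table = wandb.Table(data=rows, columns=columns)
    wandb.log({
        f"{name}_histogram": wandb.plot.histogram(
            table, name, title=f"{name} confidence scores")
        for name in columns
    })
```

If the per-class lists differ in length, a separate one-column table per class would be the simpler choice.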