How to Extend a Preset: Histogram Bins in Weights & Biases

This article explains how to easily adapt a Weights & Biases Custom Chart preset for individual, specific use cases.
Created on November 6|Last edited on November 24
How do you modify one of the W&B Custom Charts for your particular project? In this article, I extend the Custom Histogram to:
adjust details like bin size and max height by editing the chart code in Vega
save the modified version as a custom preset
log to this new chart type programmatically from Python
Table of Contents
Custom Charts Are Fully Editable
Default Histograms
Customization Steps: Edit the Vega Spec
Reusing the New Custom Preset
Logging Directly From Python
Build Precisely the Chart You Want

Custom Charts Are Fully Editable
You can follow these steps to customize any of our presets (line plot, bar plot, scatter plot, histogram, PR curve, and ROC curve).
The default histogram appears on the left below (from Custom Histogram). On the right, I've edited the default histogram to use smaller bins and cut off at a count of 50 so I can see the fine detail. If you'd like to try this yourself in a Colab notebook, I recommend:
this great overview Colab →
this more specific & advanced case →

Powerful Interactions
In both charts, you can zoom, pan, and hover to see more information. Both charts show four different model variants (same validation data, different epoch count and numbers of training examples) on the same axes for easy comparison. You can use the "eye" icons to the left of the run names to toggle the display of individual runs on/off.
Chart: Confidence scores, class = bird for toy CNNs (4 runs)

What Does This Chart Mean?
I finetune a CNN to predict 10 classes of living things: plants, birds, insects, etc. I want to see a frequency count of prediction confidence scores and see how they vary across classes and model variants.
For example, is a model more confident on certain classes (histogram peaks at low and high scores) than others (flat, even distribution across bins)? I vary NT, the number of training examples, and E, the number of training epochs, for each run. Both numbers are tiny for illustration purposes. When these are too small, the model gives very low confidence scores (<0.1). With increasing epochs and training examples, we start to see more high confidence scores and some intermediate scores for the model's prediction confidence (in this case, that the image shows a bird).

Default Histograms
This chart lets you sort a list of values into bins by count or frequency of occurrence. Let's say I have a list of prediction confidence scores (scores) for a model, and I want to see their distribution:
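import wandb

# Assumes wandb.init() has already been called and scores is a list of floats.
data = [[s] for s in scores]  # one row per score
table = wandb.Table(data=data, columns=["scores"])
wandb.log({"my_histogram": wandb.plot.histogram(table, "scores", title=None)})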
Note that data is a list of lists, intended to support a 2D array of rows and columns.
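As a rough end-to-end sketch of the snippet above, the helper below builds the list-of-lists rows and a second function logs them with the default histogram preset. The function names, the project name "histogram-demo", and the chart title are placeholders I made up for illustration; the logging function assumes wandb is installed and you are logged in.

```python
def scores_to_rows(scores):
    """Wrap each score in its own single-column row (a list of lists)."""
    return [[float(s)] for s in scores]


def log_default_histogram(scores, project="histogram-demo"):
    """Log scores with the default histogram preset.

    Needs wandb installed and a login; the project name is a placeholder.
    """
    import wandb  # imported lazily so scores_to_rows works without wandb

    run = wandb.init(project=project)
    table = wandb.Table(data=scores_to_rows(scores), columns=["scores"])
    run.log({"my_histogram": wandb.plot.histogram(table, "scores", title="Prediction scores")})
    run.finish()


# Made-up scores, just to show the row shape that wandb.Table receives
rows = scores_to_rows([0.1, 0.9, 0.95])
print(rows)
```

Calling log_default_histogram(my_scores) inside a validation step would then produce one histogram per run under the "my_histogram" key.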

Customization Steps: Edit the Vega Spec
Taking the first histogram in this report as an example, I want to make two changes:
make the bins narrower so I can see more fine-grained detail along the x-axis
threshold the counts at a lower value so I can zoom in on relative differences between the bars
Hover over the top right corner of an existing chart and click on the "edit" pencil to open the custom chart modal. Here you can change the query fields, or how your data is loaded into the histogram, if needed.
Click "Edit" in the top left, next to the name of the W&B global preset you're currently using, to open the interactive visualization editor. The Vega spec on the left is a full definition of the chart in the Vega visualization grammar. You can find lots of Vega tutorials and examples online, and it's very easy to tinker with small details in this JSON format.
Iteratively make changes to the Vega spec and see their effect. If you're not sure how to make the changes, search for relevant Vega examples (e.g. I used this reference on binning in histograms). Our IDE is also very friendly to iterative development, and I've found Vega syntax to be reasonably intuitive. Here I change two lines:
on line 16, change "bin" : true, to "bin" : {"binned" : false, "step" : 0.025}, to hardcode the bin width along the x-axis to 0.025. You can always adjust this number for different instances of the chart.
under the definition of "y" on line 20, add "scale" : {"domain" : [0, 50]}, to limit the vertical range of the chart
optionally, if you want to set the title programmatically instead of through the UI, change "title": "${string:title}", to "title": "${field:title}",
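To get a feel for what these two settings mean for the data, here is a small pure-Python sketch that counts scores in fixed-width bins of 0.025 and reports which bins would be taller than a y-axis limit of 50. This only approximates the idea and is not how Vega computes bins internally; the function names and the sample scores are made up for illustration.

```python
import math
from collections import Counter


def bin_counts(scores, step=0.025):
    """Count scores per fixed-width bin; bin i covers [i * step, (i + 1) * step)."""
    return Counter(math.floor(s / step) for s in scores)


def bins_over_limit(counts, y_max=50):
    """Return the bin indices whose count is larger than the y-axis limit."""
    return sorted(i for i, c in counts.items() if c > y_max)


# Made-up confidence scores, chosen away from bin edges
scores = [0.01, 0.02, 0.03, 0.96]
counts = bin_counts(scores)
for i in sorted(counts):
    print(f"[{i * 0.025:.3f}, {(i + 1) * 0.025:.3f}): {counts[i]}")
```

A smaller step spreads the same scores over more bins, so each bar gets shorter; that is why narrowing the bins and lowering the vertical range work well together.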
Here is my full Vega spec: 
{
  "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
  "description": "A simple histogram",
  "data": {
    "name": "wandb"
  },
  "selection": {
    "grid": {
      "type": "interval", "bind": "scales"
    }
  },
  "title": "${field:title}",
  "mark": {"type": "bar", "tooltip": {"content": "data"}},
  "encoding": {
    "x": {
      "bin" : {"binned" : false, "step" : 0.025},
      "type": "quantitative",
      "field": "${field:value}"
    },
    "y": {
      "aggregate": "count",
      "scale" : {"domain" : [0, 50]}, 
      "stack": null
    },
    "opacity": {"value": 0.6},
    "detail": [{"field": "name"}, {"field": "color"}],
    "color": {
      "type": "nominal",
      "field": "name",
      "scale": {"range": {"field": "color"}}
    }
  }
}
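If you keep a copy of a spec outside the editor, the same two edits can be sketched as a small Python function over the parsed JSON. The starting dictionary below is a trimmed stand-in I wrote for illustration, not the exact W&B default preset, and the result would still need to be pasted into the editor by hand.

```python
import copy
import json


def customize_histogram_spec(spec, step=0.025, y_max=50):
    """Return a copy of a histogram spec with a fixed bin width and y-axis range."""
    new_spec = copy.deepcopy(spec)  # leave the caller's spec untouched
    new_spec["encoding"]["x"]["bin"] = {"binned": False, "step": step}
    new_spec["encoding"]["y"]["scale"] = {"domain": [0, y_max]}
    return new_spec


# Trimmed, hypothetical stand-in for a default histogram spec
default_like = {
    "mark": "bar",
    "encoding": {
        "x": {"bin": True, "type": "quantitative", "field": "${field:value}"},
        "y": {"aggregate": "count", "stack": None},
    },
}

custom = customize_histogram_spec(default_like)
print(json.dumps(custom, indent=2))
```

Calling customize_histogram_spec(default_like, y_max=100) would give a variant with a taller vertical range without touching the original dictionary.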
When you're happy with your changes, save the result so you can reuse it. I named my chart "histogram_small_bins" and made it publicly accessible so that everyone who reads this report can view charts of that type. I recommend this setting if you're sharing the report with anyone and want them to be able to see your charts.

Reusing the New Custom Preset
Since I've already logged some histograms using the default global "Histogram" preset, I can now change them to my improved custom "histogram_small_bins" format. Click the pencil to edit, find the new preset's name in the dropdown menu from the top left, and select the one you'd like to see. Here I show two before & after charts of scores for different classes (mollusks and reptiles). You can see how the score distribution for each class shifts as the model sees more examples over more epochs, from light green (wider distribution) to blues (more low scores) to purple (bimodal at high and low). You can toggle the individual model variants on/off using the "eye" icon to show more/fewer overlapping distributions.

Shared Presets vs. One-Off Edits
Note that you can also make further one-off changes on top of your new preset. For example, in the bottom right, I've edited the Vega spec to show a vertical range of 100. This change is local to this single chart instance and won't modify the "histogram_small_bins" preset. Each time you edit the Vega spec, you can decide whether to save your edits to the shared preset (via "Push changes") or only apply them to the current chart (via "Detach").
Chart: Toy CNN variants (4 runs)

Logging Directly From Python
Beyond editing these charts in the UI, you can now log them directly from Python. Say I'm happy with my "histogram_small_bins" preset and, in my validation step, I have the prediction scores for a particular class (say class=Plantae) as an array of floats called plantae_scores:
# Assumes wandb is imported and a run is active; one row per (index, score).
data = [[ndx, score] for ndx, score in enumerate(plantae_scores)]
table = wandb.Table(data=data, columns=["id", "score"])
fields = {"value": "score", "title": "Plantae prediction scores"}
custom_histogram = wandb.plot_table(
    vega_spec_name="wandb/histogram_small_bins",
    data_table=table,
    fields=fields,
)
wandb.log({"custom_id": custom_histogram})
The steps in more detail:
create a data object: log each list of values you want to see in the histogram as a column of the 2D data array. By default you can only send one column (via the named header) to one histogram plot, but you can log multiple columns to reference in other plots/code.
pass data to a wandb.Table() object, specifying all the columns in your data object in order.
specify which fields in your preset map to which columns of the table. The default histogram has one field for the numbers, "value", which I map to the "score" column. I also modified the "title" field to be set programmatically, so I pass in a string for that from my Python script.
pass this table and fields to wandb.plot_table, specifying the name of your preset with your username/team name as the prefix (in this case, team "wandb").
log the result and you're done! The preset will show up in the "Custom Charts" section of your workspace for an individual run. If you'd like to compare multiple runs on the same set of axes, make sure to keep "custom_id" fixed for that set of runs. Note that the table itself will also be logged in the "Media" section of your workspace, under custom_id_table.
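The steps above can be sketched as a loop that logs one custom histogram per class. This is a minimal sketch, not the code from the Colab: the helper names, the project name, and the per-class chart keys are placeholders of mine, and the field mapping simply follows the snippet earlier in this section. The logging function assumes wandb is installed, you are logged in, and the "wandb/histogram_small_bins" preset is visible to you.

```python
def build_rows(scores):
    """Build [id, score] rows for a two-column wandb.Table."""
    return [[ndx, float(s)] for ndx, s in enumerate(scores)]


def chart_key(class_name):
    """Fixed key per class, so different runs land on the same set of axes."""
    return f"{class_name}_scores"


def log_class_histograms(scores_by_class, preset="wandb/histogram_small_bins",
                         project="histogram-demo"):
    """Log one custom-preset histogram per class (project name is a placeholder)."""
    import wandb  # imported lazily so the helpers above work without wandb

    run = wandb.init(project=project)
    for class_name, scores in scores_by_class.items():
        table = wandb.Table(data=build_rows(scores), columns=["id", "score"])
        # Map the preset's fields to table columns, following the article's example
        fields = {"value": "score", "title": f"{class_name} prediction scores"}
        chart = wandb.plot_table(vega_spec_name=preset, data_table=table, fields=fields)
        run.log({chart_key(class_name): chart})
    run.finish()


# Made-up scores, just to show the row shape and the chart key
print(build_rows([0.2, 0.8]))
print(chart_key("Plantae"))
```

Keeping the key a pure function of the class name is one way to keep the chart ID fixed across runs, which is what lets multiple runs overlay on one chart.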
For a slightly fancier example, check out this Colab. I didn't want to specify each of the 10 classes by hand, so I just log all of them for each run, creating 10 versions of my new preset in a for loop (see 9 below :). All of these charts have the settings adjustments I made to "histogram_small_bins", starting from the "Histogram" global preset. The overall trend in score distributions is about the same: with more training examples and more epochs, the score distribution narrows to a bimodal one. It's interesting to explore the differences between classes and model variants (toggle the runs on & off). Keep in mind that these are small (as few as 100 training examples for only 1 epoch) and thus fairly noisy experiments.
Chart: Toy CNN variants (6 runs)

Build Precisely the Chart You Want
These presets and our accompanying APIs aim to be simple and general, but for most machine learning work, a "standard" approach is insufficient. This is why we recently launched an interactive dev environment for machine learning data visualization. It greatly expands the set of possible charts folks can log natively to W&B (check out the presets in our gallery and the types of queries they can make, e.g. different ways to aggregate, filter, combine, and otherwise parse logged data). Most importantly, it lets you interactively adjust your visualizations to fit your exact requirements and then share them with teammates & the whole field.
We're very excited for folks to try this feature. Please let us know how it goes in the Comments section below.
Tags: Beginner, Domain Agnostic, Tutorial, W&B Meta, Custom Charts, Panels, Plots, iNaturalist