
Create Your Own Preset: Composite Histogram

Log to a custom chart from Python with W&B
Created on October 14|Last edited on April 15

Create a chart interactively; plot data from a script

In this report I describe how to create a custom chart, such as the composite histogram below, with the W&B UI and then log to this custom chart directly from a script (Python training code). Click on the gear icon in the top right to see a slider that adjusts the bin size, giving a slightly different perspective on the data. This chart shows confidence scores of an image classification CNN for two different labels on the same images, specifically for identifying the creature in each image as an amphibian versus a reptile. This toy model is a bit more certain about amphibians (narrower, higher peak) than reptiles (broader range of scores).




[Chart: Toy CNN example]


Why use custom charts?

If you're already convinced, skip to the next section for the instructions. Some advantages of this approach:

  • fully control how you group & visualize experiment data: change how you select relevant data from across your runs/projects and precisely how you visualize it. No need to constrain your analysis to the list of plots available in W&B or even follow our data model (e.g. now you can make your own run sets :)
  • log any plot natively and directly in W&B: no need to export your data elsewhere, import external dependencies, log charts as static images of the output of other plotting libraries, or debug in some loop outside of W&B
  • interactive, visual chart development: customizing a visualization via code usually means rerunning your code many times to check whether each change is correct. With our IDE for custom visualizations, you can load your data once and see the chart re-render live for every change you make to its definition.
  • decouple training and analysis/visualization to save time & compute: log the results of your model training/testing code just once, then change how you query these results and how you visualize them as much as you like, all in W&B, without rerunning your experiment. Moreover, never forget which specific training code made which specific results, because W&B organizes everything in one place.
  • share custom charts with your team and beyond: if you figure out a great way to visualize something (like this overlay of two model score distributions), folks from your team or your whole field can easily reuse your chart

Step 1: Log your data as a wandb.Table()

To read data from your runs into a chart, first log that data to a wandb.Table() object, which is a 2D array where each row is a data point and each column is a dimension/feature of that data point. For example, I want to look at the distribution of prediction confidence scores across all 10 classes of my model (a CNN finetuned on photos of 10 classes of living things: birds, insects, plants, etc.). Each row in my data is an image from my validation set, and each column is a confidence score for a particular class. Let's say I start with a dictionary of class prediction scores (class_preds) where the keys are class ids and the values are lists of confidence scores, one per validation image, in a fixed order. Once the table is logged, all my prediction scores will be available as arrays under the corresponding column names.

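import wandb

# assumes wandb.init() has already been called and class_preds maps
# class id -> list of confidence scores, one per validation image
class_names = ["Amphibia", "Animalia", "Arachnida", "Aves", "Fungi", "Insecta", "Mammalia", "Mollusca", "Plantae", "Reptilia"]
data = []
for val_item in range(len(class_preds[0])):
    # one row per image, one column per class
    row = [class_preds[label][val_item] for label in range(len(class_names))]
    data.append(row)

table = wandb.Table(data=data, columns=class_names)
wandb.log({"custom_table_id": table})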
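
To make the row/column layout concrete, here is a minimal sketch of the same transposition on toy data, without any W&B calls. The helper name preds_to_rows and the toy scores are hypothetical and only illustrate the shape of class_preds assumed above (class id mapped to a list of per-image scores).

```python
def preds_to_rows(class_preds, num_classes):
    """Transpose {class_id: [score per image]} into one row per image."""
    columns = [class_preds[label] for label in range(num_classes)]
    # every class must have scored the same images, in the same order
    if len({len(col) for col in columns}) > 1:
        raise ValueError("all classes need the same number of scores")
    return [list(row) for row in zip(*columns)]


# hypothetical scores for 3 classes on 2 validation images
toy_preds = {0: [0.7, 0.1], 1: [0.2, 0.3], 2: [0.1, 0.6]}
rows = preds_to_rows(toy_preds, num_classes=3)
print(rows)  # [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]]
```

Each inner list is one image, and position i within it is the score for class i, which is the layout wandb.Table expects for data when columns lists the class names in the same order.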

Step 2: Create your custom preset in the UI

W&B custom charts are written in Vega, a powerful and flexible visualization language. You can find many examples and walkthroughs online, and it can help to start with an existing preset that is most similar to your desired custom visualization. You can iterate with small changes in our IDE, which re-renders the plot as you change its definition. You can see the final Vega spec for my "composite_histogram" at the end of this report.

[Screenshot: the custom chart editor]

  • from your project workspace or report, click on "Add a visualization" and select "Custom chart".
  • pick an existing preset that is closest in spirit to what you'd like to build: e.g., the "Bar chart" may be a good foundation if you want to show some form of relative-proportion chart. If you're not sure, start with something simple like "Scatter plot" to get a sense of the overall structure. As in the screenshot above, you will see a modal with the Vega spec on the left, a preview of your chart in the top right, and your run query in the bottom right, which maps the run data into the Vega spec.

Step 3: Map relevant data fields from logged runs into your chart

Modify the run query to feed your run data from Step 1 into the chart: change summary to summaryTable in the query and enter your custom_table_id (the key to which you logged the table in the previous section) as the first entry in keys. In the screenshot below, I logged a table under the key hs_0_table, and having logged a column for each of my 10 class names, I chose the "Animalia" column to show up as red_bins and "Plantae" to show up as blue_bins. The title can be hardcoded directly in the Vega spec, or you can use the special placeholder ${field:title:Default Chart Title Here} to allow a user-defined input string for each instance of the custom chart.

[Screenshot: run query mapping table columns to red_bins and blue_bins]

Iterate on the Vega spec until you are happy; the panel on the right will re-render interactively as you change the spec or the query. When you are done, make sure to save the Vega spec as a preset (hit "Save As") with a descriptive name (e.g. composite_histogram for me) so you can reuse it within the current project. If you'd like other people to be able to see your report, make sure the preset is publicly accessible.
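
One way to picture the ${field:...} placeholders from this step is as named slots in the spec text that get filled from a field mapping. The sketch below is purely illustrative and is not W&B's implementation: fill_fields and FIELD_PATTERN are hypothetical names, and treating the optional third part as a fallback value is my assumption based on the "Default Chart Title Here" example above.

```python
import re

# Matches ${field:name} and ${field:name:default text}
FIELD_PATTERN = re.compile(r"\$\{field:([^:}]+)(?::([^}]*))?\}")


def fill_fields(spec_text, fields):
    """Replace each placeholder with its mapped value, else its default."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in fields:
            return fields[name]
        if default is not None:
            return default
        return match.group(0)  # leave unmapped placeholders untouched

    return FIELD_PATTERN.sub(substitute, spec_text)


snippet = '{"field": "${field:red_bins}", "text": "${field:title:Default Chart Title Here}"}'
filled = fill_fields(snippet, {"red_bins": "Animalia"})
print(filled)
```

Here red_bins is mapped to the "Animalia" column, while title has no mapping and falls back to the text after the second colon.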

Step 4: Log to your saved preset from your script!

For any future runs, you can log to your custom preset directly from a Python script with the wandb.plot_table() method:

# set up field mapping for each two-way class comparison (TODO: use fstrings)
animal_plant = {"red_bins": "Animalia", "blue_bins": "Plantae", "title": "Animals (red) vs Plants (blue)"}
rep_amph = {"red_bins": "Amphibia", "blue_bins": "Reptilia", "title": "Amphibians (red) vs Reptiles (blue)"}
ara_ins = {"red_bins": "Arachnida", "blue_bins": "Insecta", "title": "Arachnids (red) vs Insects (blue)"}
bird_fun = {"red_bins": "Aves", "blue_bins": "Fungi", "title": "Birds (red) vs Mushrooms (blue)"}

for i, fields in enumerate([animal_plant, rep_amph, ara_ins, bird_fun]):
    # create a composite histogram for each two-way comparison
    composite_histogram = wandb.plot_table(vega_spec_name="wandb/composite_histogram", data_table=table, fields=fields)
    # log under a different key to create a separate plot
    wandb.log({"hs_" + str(i): composite_histogram})
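
The four field mappings above share one shape, so they can also be generated from a small helper. This is a minimal sketch: make_fields and the comparisons list are hypothetical names, and the dictionaries it builds match the hand-written ones in this step.

```python
def make_fields(red_column, blue_column, red_label, blue_label):
    """Build the field mapping for one two-way class comparison."""
    return {
        "red_bins": red_column,
        "blue_bins": blue_column,
        "title": f"{red_label} (red) vs {blue_label} (blue)",
    }


# (red column, blue column, red label, blue label)
comparisons = [
    ("Animalia", "Plantae", "Animals", "Plants"),
    ("Amphibia", "Reptilia", "Amphibians", "Reptiles"),
    ("Arachnida", "Insecta", "Arachnids", "Insects"),
    ("Aves", "Fungi", "Birds", "Mushrooms"),
]
all_fields = [make_fields(*comparison) for comparison in comparisons]
print(all_fields[1]["title"])  # Amphibians (red) vs Reptiles (blue)
```

Each entry of all_fields can be passed as the fields argument of wandb.plot_table() in the loop above.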

Examples: 100 training samples, 1 epoch

2-way comparisons of class confidence score distributions after very minimal training. Since we're looking at so few training examples, these results are fairly stochastic. You can click on the gear icon in the top right of each plot to adjust the bin size. In this particular experiment, the model returns a much wider range of confidence scores for Amphibians than for Reptiles/Animals/Plants, which are virtually indistinguishable in this view.




[Charts: 100 examples, 1 epoch]


More examples: 1000 examples, 1 epoch

Some class pairings converge to more similar scores (Animals vs Plants keep a matching distribution) and some diverge (Birds vs Mushrooms go from a near-identical histogram to complete opposites).




[Charts: 1000 examples, 1 epoch]


P.S. Full Vega Spec for the Composite Histogram

Here is the full Vega spec for my custom composite histogram preset. It is intended as an MVP, and we welcome suggestions for improving or extending it (3 layers? N layers? more custom colors?). Try it and let us know how it goes! We especially welcome any feedback via the Report comments at the bottom of this page.

{
  "$schema": "https://vega.github.io/schema/vega/v5.json",
  "padding": 5,
  "signals": [
    {
      "name": "binOffset",
      "value": 0
    },
    {
      "name": "binStep",
      "value": 0.025,
      "bind": {
        "input": "range",
        "min": 0.01,
        "max": 0.5,
        "step": 0.001
      }
    }
  ],
  "data": [
    {"name" : "wandb"},
    {
      "name": "red_bins",
      "source": "wandb",
      "transform": [
        {
          "type": "bin",
          "field": "${field:red_bins}",
          "extent": [
            -1,
            1
          ],
          "anchor": {
            "signal": "binOffset"
          },
          "step": {
            "signal": "binStep"
          },
          "nice": false
        },
        {
          "type": "aggregate",
          "key": "bin0",
          "groupby": [
            "bin0",
            "bin1"
          ],
          "fields": [
            "bin0"
          ],
          "ops": [
            "count"
          ],
          "as": [
            "count"
          ]
        }
      ]
    },
    {
      "name": "blue_bins",
      "source": "wandb",
      "transform": [
        {
          "type": "bin",
          "field": "${field:blue_bins}",
          "extent": [
            -1,
            1
          ],
          "anchor": {
            "signal": "binOffset"
          },
          "step": {
            "signal": "binStep"
          },
          "nice": false
        },
        {
          "type": "aggregate",
          "key": "bin0",
          "groupby": [
            "bin0",
            "bin1"
          ],
          "fields": [
            "bin0"
          ],
          "ops": [
            "count"
          ],
          "as": [
            "count2"
          ]
        }
      ]
    }
  ],
  "title": {
    "text": "${field:title}"
  },
  "scales": [
    {
      "name": "xscale",
      "type": "linear",
      "range": "width",
      "domain": [0.0, 1.0]
    },
    {
      "name": "yscale",
      "type": "linear",
      "range": "height",
      "round": true,
      "domain": {
        "fields" : [
          {"data" : "red_bins", "field" : "count"},
          {"data" : "blue_bins", "field" : "count2"}
        ]
      },
      "zero": true,
      "nice": true
    }
  ],
  "axes": [
    {
      "orient": "bottom",
      "scale": "xscale",
      "zindex": 1,
      "title": "Frequency count"
    },
    {
      "orient": "left",
      "scale": "yscale",
      "tickCount": 5,
      "zindex": 1,
      "title" : "Number of examples"
    }
  ],
  "marks": [
    {
      "type": "rect",
      "from": {
        "data": "red_bins"
      },
      "encode": {
        "update": {
          "x": {
            "scale": "xscale",
            "field": "bin0"
          },
          "x2": {
            "scale": "xscale",
            "field": "bin1",
            "offset": {
              "signal": "binStep > 0.02 ? -0.1 : 0"
            }
          },
          "y": {
            "scale": "yscale",
            "field": "count"
          },
          "y2": {
            "scale": "yscale",
            "value": 0
          },
          "fill": {
            "value": "firebrick"
          },
          "fillOpacity": {
            "value" : 0.5
          }
        },
        "hover": {
          "fill": {
            "value": "purple"
          }
        }
      }
    },
    {
      "type": "rect",
      "from": {
        "data": "blue_bins"
      },
      "encode": {
        "update": {
          "x": {
            "scale": "xscale",
            "field": "bin0"
          },
          "x2": {
            "scale": "xscale",
            "field": "bin1",
            "offset": {
              "signal": "binStep > 0.02 ? -0.1 : 0"
            }
          },
          "y": {
            "scale": "yscale",
            "field": "count2"
          },
          "y2": {
            "scale": "yscale",
            "value": 0
          },
          "fill": {
            "value": "steelblue"
          },
          "fillOpacity": {
            "value" : 0.5
          }
        },
        "hover": {
          "fill": {
            "value": "purple"
          }
        }
      }
    }
  ]
}
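
For readers less familiar with Vega, the "bin" and "aggregate" transforms in the spec roughly amount to the following: assign each score to a bin of width binStep, anchored at binOffset, then count the scores per bin. The Python below is only an approximate sketch of that idea under my own assumptions (it ignores the "extent" setting and any Vega-specific rounding), and bin_counts is a hypothetical name.

```python
import math
from collections import Counter


def bin_counts(values, step=0.025, anchor=0.0):
    """Count values per bin; bin k covers [anchor + k*step, anchor + (k+1)*step)."""
    counts = Counter()
    for v in values:
        k = math.floor((v - anchor) / step)  # index of the bin containing v
        counts[k] += 1
    # key each bin by its left edge, like the "bin0" field in the spec
    return {anchor + k * step: n for k, n in sorted(counts.items())}


# hypothetical confidence scores for one class
scores = [0.1, 0.3, 0.4, 0.8]
histogram = bin_counts(scores, step=0.25)
print(histogram)  # {0.0: 1, 0.25: 2, 0.75: 1}
```

Empty bins do not appear in the result, which mirrors how the aggregate step only emits groups that contain at least one data point. Moving the binStep slider corresponds to calling bin_counts again with a different step.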