Creating Custom Scatter Plots With Weights & Biases

This article covers usage and examples for wandb.plot.scatter() and demonstrates how to log a custom scatter plot natively in just a few lines.

Stacey Svetlichnaya

Created on October 8 | Last edited on November 16

Method: wandb.plot.scatter()

Log a custom scatter plot, that is, a list of (x, y) points on a pair of arbitrary axes x and y, natively in a few lines:
import wandb

# Pair up the x and y values, one row per point
data = [[x, y] for (x, y) in zip(x_values, y_values)]
table = wandb.Table(data=data, columns=["x", "y"])
wandb.log({"my_custom_plot_id": wandb.plot.scatter(
    table, "x", "y", title="Custom Y vs X Scatter Plot")})
You can use this to log scatter points on any two dimensions. Note that if you're plotting two lists of values against each other, the number of values in the lists must match exactly (i.e. each point must have an x and a y).
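As a minimal sketch of the length requirement above, the helper below pairs two lists into rows and fails early if their lengths differ. The names build_scatter_rows and log_scatter are hypothetical helpers, not part of wandb, and the wandb import is deferred so the pairing logic can be tried without an active run.

```python
def build_scatter_rows(x_values, y_values):
    """Pair two equal-length sequences into [x, y] rows for a wandb.Table."""
    if len(x_values) != len(y_values):
        raise ValueError(
            f"x has {len(x_values)} values but y has {len(y_values)}; "
            "each point needs both an x and a y"
        )
    return [[x, y] for x, y in zip(x_values, y_values)]


def log_scatter(x_values, y_values, plot_id="my_custom_plot_id"):
    """Log the points as a scatter plot; assumes wandb.init() was already called."""
    import wandb  # deferred so build_scatter_rows can be used on its own

    rows = build_scatter_rows(x_values, y_values)
    table = wandb.Table(data=rows, columns=["x", "y"])
    wandb.log({plot_id: wandb.plot.scatter(
        table, "x", "y", title="Custom Y vs X Scatter Plot")})


rows = build_scatter_rows([0.1, 0.7, 0.3], [0.8, 0.2, 0.5])
print(rows)
```

Checking lengths explicitly matters because zip() silently truncates to the shorter list, which would drop points without any warning.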
Here is the default scatter plot exploring correlations between confidence scores for two class labels X and Y, across different model variants. The more points appear towards the center of the triangle, as opposed to along the x and y axes, the less certain the model's identification of these classes. You can toggle the point set display on/off by run name/color via the "eye" icon to the left of each run name (click the gray run set tab below the chart to see the full list).
You can pan with click+drag, zoom by scrolling over the chart, and hover over a point to see more information about it. Try hiding some of the runs below and zooming into the dense region near the origin. You can also hover over the top right corner of the chart and click the "eye" icon to see the full Vega spec which defines the chart.

[Run set: Toy CNN prediction correlations (8)]

Basic Usage Example

I fine-tune a CNN to predict 10 classes of living things: plants, birds, insects, etc. I want to plot the prediction scores for two of my ten classes (on the same validation set of examples) against each other to see if there are any patterns (more or less correlated scores). For example, how often does the model confuse Reptiles and Amphibians (frogs) versus Insects and Arachnids (bugs)? How does this change as we add more examples or train for more epochs?
I pick two lists of scores (of the same length) and in my validation step I call:
data = [[x, y] for (x, y) in zip(class_x_prediction_scores, class_y_prediction_scores)]
table = wandb.Table(data=data, columns = ["class_x", "class_y"])
wandb.log({"my_custom_id" : wandb.plot.scatter(table, "class_x", "class_y")})
Steps to follow:
1. Create a data object: collect your points as a 2D list/array, where each row is a point and each column is a dimension. This scatter plot assumes two dimensions / two columns, but you could pass in more data and customize the plot further if you wish, e.g. use a third dimension to color the 2D points.
2. Pass the data to a wandb.Table() object in which you name the columns in order, so you can refer to them later.
3. Pass the table object and the same x and y column names, in order, to wandb.plot.scatter() with an optional title. This creates your custom plot under the key my_custom_id. To visualize multiple runs on the same plot, keep this plot key constant. Note that the table itself will also be logged in the "Media" section of your workspace, under my_custom_id_table.
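The three steps can be sketched end to end as follows. This is an illustration under stated assumptions, not the code behind the charts in this article: simulate_scores is a hypothetical stand-in for real model outputs (it draws toy scores with x + y <= 1, which produces the triangular shape described earlier), the third column other_classes is an invented extra dimension, and log_validation_scatter assumes wandb.init() has already been called.

```python
import random


def simulate_scores(n, seed=0):
    """Toy stand-in for model outputs: two class scores per example, x + y <= 1."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    ys = [rng.random() * (1 - x) for x in xs]
    return xs, ys


def make_rows(class_x_scores, class_y_scores):
    """Step 1: one row per point; the third column is an optional extra dimension."""
    return [[x, y, 1 - x - y] for x, y in zip(class_x_scores, class_y_scores)]


def log_validation_scatter(rows, key="my_custom_id"):
    """Steps 2 and 3: name the columns, then log the plot under a constant key."""
    import wandb  # deferred import; assumes wandb.init() has been called

    table = wandb.Table(data=rows, columns=["class_x", "class_y", "other_classes"])
    wandb.log({key: wandb.plot.scatter(
        table, "class_x", "class_y", title="Class X vs Class Y scores")})


xs, ys = simulate_scores(5)
rows = make_rows(xs, ys)
print(len(rows), "rows with", len(rows[0]), "columns each")
```

Only the class_x and class_y columns are plotted here; the extra column is logged with the table and is available if you later customize the chart.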
The wandb.plot.scatter API is defined here.
Customized Usage

There are many ways to customize the scatter plot using the Vega visualization grammar.
Here are some simple ones:
- change the appearance and opacity of the point markers: change "mark": {"type": "circle", to "mark": {"type": "point", "opacity" : 0.3,
- rename the axis titles for clarity: add "title" : "Your Title" to the x and y fields under encoding
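To make the two simple edits concrete, here is a sketch that applies them to a spec held as a Python dictionary. In practice you would make these edits directly in the Vega editor in the W&B UI; the base_spec below is a hypothetical, heavily trimmed stand-in for the real chart spec, and the customize helper and axis titles are invented for illustration.

```python
import copy
import json

# Hypothetical, trimmed stand-in for the scatter chart spec;
# the real spec shown in the editor has more fields.
base_spec = {
    "mark": {"type": "circle"},
    "encoding": {
        "x": {"field": "x", "type": "quantitative"},
        "y": {"field": "y", "type": "quantitative"},
    },
}


def customize(spec, x_title, y_title, opacity=0.3):
    """Return a copy with point markers, reduced opacity, and axis titles."""
    new_spec = copy.deepcopy(spec)  # leave the original spec untouched
    new_spec["mark"] = {**new_spec["mark"], "type": "point", "opacity": opacity}
    new_spec["encoding"]["x"]["title"] = x_title
    new_spec["encoding"]["y"]["title"] = y_title
    return new_spec


custom_spec = customize(base_spec, "Class X score", "Class Y score")
print(json.dumps(custom_spec, indent=2))
```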
Here is a more advanced one: set opacity conditionally based on your run names (the test below matches any run whose name contains a given string, such as a shared prefix), so you can display two different run sets in one panel section (previously impossible in W&B). You will need to create two copies of the panel and the Vega spec, then modify the test condition in one of the copies to make the inverse set of runs invisible (i.e. replace "frog" with "bug").
"opacity": {
  "condition": {
    "test": "indexof(datum.name, 'frog') > -1",
    "value": 0
  }
}
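The sketch below shows how the two panel copies differ and which runs each one would hide. It mirrors the Vega test in Python (str.find, like indexof, returns -1 when the substring is absent). The run names and helper functions are hypothetical and only illustrate the matching logic.

```python
def opacity_encoding(hidden_substring):
    """Build the opacity block that hides runs whose name contains the substring."""
    return {
        "condition": {
            "test": f"indexof(datum.name, '{hidden_substring}') > -1",
            "value": 0,
        }
    }


def is_hidden(run_name, hidden_substring):
    """Python equivalent of the Vega test: True when the name contains the substring."""
    return run_name.find(hidden_substring) > -1


# Hypothetical run names following a "frog" / "bug" prefix convention
run_names = ["frog_100ex_1ep", "frog_1000ex_5ep", "bug_100ex_1ep", "bug_1000ex_5ep"]

# Each panel copy hides the opposite run set
hidden_by_panel = {"bugs panel": "frog", "frogs panel": "bug"}
visible_by_panel = {}
for panel, hidden in hidden_by_panel.items():
    visible_by_panel[panel] = [n for n in run_names if not is_hidden(n, hidden)]
    print(panel, "|", opacity_encoding(hidden)["condition"]["test"])
    print("  visible runs:", visible_by_panel[panel])
```

Because the test matches the substring anywhere in the name, a naming convention in which "frog" and "bug" appear only as prefixes keeps the two run sets cleanly separated.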
As the number of examples increases from 100 to 1000 and the number of epochs from 1 to 5, the model's confidence in distinguishing these visually similar classes increases. The "bugs" on the left are slightly less confusing (fewer points in the center of the triangle) than the "frogs" on the right.

[Run set: Toy CNN prediction correlations (6)]