Plot Precision Recall Curves With Weights & Biases

This article provides usage and examples for plotting Precision-Recall curves with Weights & Biases using wandb.plot.pr_curve().

Stacey Svetlichnaya

Created on October 7|Last edited on November 8


With Weights & Biases you can log a Precision-Recall curve in one line of code:
wandb.log({"pr" : wandb.plot.pr_curve(ground_truth, predictions,
                     labels=None, classes_to_plot=None)})
You can log this whenever your code has access to:
- a model's predicted scores (predictions) on a set of examples
- the corresponding ground truth labels (ground_truth) for those examples
- (optionally) a list of the label/class names (labels=["cat", "dog", "bird"...] if label index 0 means cat, 1 means dog, 2 means bird, etc.)
- (optionally) a subset of those labels, still in list format, to visualize in the plot (classes_to_plot)
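As a rough illustration of the input shapes listed above, here is a minimal sketch using synthetic data. It assumes ground_truth is a flat list of integer class ids and predictions holds one row of per-class scores per example. The helpers make_toy_inputs, check_pr_inputs, and log_pr_curve are hypothetical names for this example and not part of wandb; only wandb.log and wandb.plot.pr_curve come from the library.

```python
import random


def make_toy_inputs(num_examples, num_classes, seed=0):
    """Build synthetic inputs (toy data, not real model output)."""
    rng = random.Random(seed)
    # ground_truth: one integer class id per example
    ground_truth = [rng.randrange(num_classes) for _ in range(num_examples)]
    # predictions: one row of per-class scores per example, normalized to sum to 1
    predictions = []
    for _ in range(num_examples):
        raw = [rng.random() for _ in range(num_classes)]
        total = sum(raw)
        predictions.append([r / total for r in raw])
    return ground_truth, predictions


def check_pr_inputs(ground_truth, predictions, labels=None):
    """Hypothetical helper: sanity-check shapes before logging."""
    if not predictions or len(ground_truth) != len(predictions):
        raise ValueError("need one non-empty row of scores per ground-truth label")
    num_classes = len(predictions[0])
    if any(len(row) != num_classes for row in predictions):
        raise ValueError("every row of scores must have the same length")
    if any(not 0 <= y < num_classes for y in ground_truth):
        raise ValueError("ground-truth class ids must index into the score rows")
    if labels is not None and len(labels) != num_classes:
        raise ValueError("labels must contain one name per class")
    return num_classes


def log_pr_curve(ground_truth, predictions, labels=None,
                 classes_to_plot=None, key="pr"):
    """Log the chart; assumes wandb.init() has already started a run."""
    import wandb  # imported lazily so the helpers above work without wandb

    wandb.log({key: wandb.plot.pr_curve(
        ground_truth, predictions,
        labels=labels, classes_to_plot=classes_to_plot)})
```

With a run active, a call might look like log_pr_curve(gt, preds, labels=["cat", "dog", "bird"], classes_to_plot=["cat", "dog"]).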
Basic Usage

I finetune a CNN to predict 10 classes of living things: plants, birds, insects, etc. In my validation step, I call
wandb.log({"my_custom_plot_id" : wandb.plot.pr_curve(ground_truth, 
                         predictions, labels=["Amphibia", "Animalia",..."Reptilia"])})
to produce the following curve for each run of my model (where each run logs to the same plot key, my_custom_plot_id). Scroll over the chart area to zoom in, click+drag to pan, and hover to see more detail about a line.
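To make concrete what each line in such a chart represents, here is a minimal pure-Python sketch of precision-recall points for a single class, treating that class as positive and all others as negative. It is an illustration only: pr_points is a hypothetical helper, and the computation inside wandb.plot.pr_curve may differ (for example in how it samples or interpolates the curve).

```python
def pr_points(ground_truth, predictions, class_id):
    """Return (recall, precision) pairs for one class, one-vs-rest."""
    # Pair each example's score for this class with whether it is a true positive,
    # then sort from the highest score to the lowest.
    scored = sorted(
        ((row[class_id], y == class_id)
         for y, row in zip(ground_truth, predictions)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    total_pos = sum(1 for _, is_pos in scored if is_pos)
    if total_pos == 0:
        return []  # recall is undefined without any positive examples

    points = []
    tp = fp = 0
    for i, (score, is_pos) in enumerate(scored):
        if is_pos:
            tp += 1
        else:
            fp += 1
        # Examples with tied scores share one threshold, so emit a single point
        # after the last example of each tie group.
        if i + 1 < len(scored) and scored[i + 1][0] == score:
            continue
        points.append((tp / total_pos, tp / (tp + fp)))
    return points


ground_truth = [0, 1, 0, 1]
predictions = [[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.3, 0.7]]
curve = pr_points(ground_truth, predictions, class_id=0)
```

Lowering the score threshold moves along the list: recall can only rise, while precision may rise or fall, which is why the plotted lines are not monotonic.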
[Interactive chart: precision-recall curves for the "Toy CNNs" run set]
Customized Usage

To make this chart more legible, I can edit the built-in wandb chart definition (or Vega spec), following the Vega visualization grammar, and produce this new chart, where the differences within a single class are much easier to spot across runs.
Now each line's color represents one of my 10 classes, and each line's stroke type (solid, dash, or dot) represents one of my three experiments with different numbers of epochs/training examples. You can hover over the top right corner of the chart and click on the "eye" icon to see the full Vega spec.
See the full definition of wandb.plot.pr_curve().
[Interactive chart: customized precision-recall curves for the "Toy CNNs" run set]


Hendrik Schreiber • 4 years ago

Hey Stacey, it seems to me that the ground truth needs to be sparsely encoded, i.e. with class ids. If that's the case, this function is not suitable for multi-label prediction (where you'd use a multi-hot ground truth representation). But it would be *really* useful! Any chance you could a) specify the exact data format(s) pr_curve understands and b) make it so that it understands either class ids, one-hot, or multi-hot encoding? It would make it so much more useful. Thanks!!


Ayush Thakur • 5 years ago

Hey Stacey, thank you for this report. I have used this report and a bunch of other reports to understand W&B custom charts and use them in my ML work. I have a few questions:
- I have noticed that wandb logs a maximum of 10,000 points, which leads to a cropped curve. I use Python list slicing, data[::sample_rate], where sample_rate is len(data)/10000, to log the points. Is there a better way to do this?
- When I save a new copy of the custom chart, will it be available to everyone?
Thanks in advance. :)
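
The slicing approach described in this comment can be sketched as follows. This is only one possible way to do it, and the 10,000-point limit is taken from the comment rather than verified here. Note that in Python 3, len(data)/10000 is a float, while a slice step must be an integer, so the sketch rounds the step up. The function name downsample is hypothetical.

```python
import math


def downsample(points, max_points=10000):
    """Keep every step-th point so that at most max_points remain."""
    points = list(points)
    if len(points) <= max_points:
        return points  # already small enough, nothing to drop
    # Round the step up so the result never exceeds max_points.
    step = math.ceil(len(points) / max_points)
    return points[::step]


data = list(range(25000))
sampled = downsample(data)
```

Uniform striding like this always keeps the first point but may drop the last one, which can matter at the ends of a curve.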


