Visualizing Prodigy Datasets Using W&B Tables
Use the W&B/Prodigy integration to upload your Prodigy annotated datasets to W&B for easier visualization
Created on August 17|Last edited on September 3

What is Prodigy?
Prodigy is an annotation tool made by Explosion for creating training and evaluation data for machine learning models, as well as for error analysis, data inspection, and data cleaning.
The W&B x Prodigy integration lets you upload a Prodigy-annotated dataset directly to W&B for visualization. The upload takes a single line of code and converts the entire dataset to a W&B Table.
Usage
Requirements
Apart from Prodigy, this integration also uses the following libraries:
Code
To use the integration, simply call upload_dataset and pass in the name of the annotated dataset that's in the local Prodigy database.
from wandb.integration.prodigy import upload_dataset
upload_dataset("name_of_dataset_in_database")
W&B will automatically try to convert certain images and text fields, such as image URLs and named entity spans, to actual images and spaCy HTML objects. Extra columns may be added to the resulting table to include these visualizations.
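The call above logs to W&B, so it is usually made while a run is active. The sketch below wraps it in an explicit run. It assumes that `upload_dataset` expects an active run created by `wandb.init`. The helper name `upload_prodigy_dataset` and the project name `"prodigy"` are hypothetical, chosen only for illustration.

```python
def upload_prodigy_dataset(dataset_name, project="prodigy"):
    """Upload one Prodigy dataset inside its own W&B run.

    Hypothetical convenience wrapper, not part of the integration.
    Imports are deferred so this module can be loaded without
    wandb or Prodigy installed.
    """
    import wandb
    from wandb.integration.prodigy import upload_dataset

    # Assumption: upload_dataset logs the table to the currently active run.
    # The `with` block finishes the run once the upload is done.
    with wandb.init(project=project):
        upload_dataset(dataset_name)
```

Calling `upload_prodigy_dataset("name_of_dataset_in_database")` should then produce a run whose logged table holds the annotated examples.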
Examples
Here are two examples of Prodigy-annotated datasets uploaded to W&B. All data fields, including Prodigy metadata fields such as the input hash and task hash, are preserved.
Text with Named Entity Recognition
The spans_visual column added by the integration contains the output of spaCy's NER visualizer, applied automatically to each item's spans field.
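To make the span-to-visual step more concrete, here is a minimal sketch of how a Prodigy-style task with character-offset spans could be reshaped into the dictionary that spaCy's displaCy accepts in manual mode. This illustrates the idea and is not the integration's actual code. The function name `spans_to_displacy` and the sample task are hypothetical.

```python
def spans_to_displacy(task):
    """Reshape a Prodigy-style task into displaCy's manual 'ent' input."""
    ents = [
        {"start": span["start"], "end": span["end"], "label": span["label"]}
        for span in task.get("spans", [])
    ]
    # displaCy expects entities ordered by their start offset.
    ents.sort(key=lambda ent: ent["start"])
    return {"text": task["text"], "ents": ents, "title": None}


# Hypothetical annotated example with character offsets into the text.
task = {
    "text": "Apple opens a store in Berlin",
    "spans": [
        {"start": 23, "end": 29, "label": "GPE"},
        {"start": 0, "end": 5, "label": "ORG"},
    ],
}
doc = spans_to_displacy(task)

# With spaCy installed, the HTML could then be rendered with:
#   from spacy import displacy
#   html = displacy.render(doc, style="ent", manual=True, page=False, jupyter=False)
```

Manual mode is convenient here because it renders straight from offsets and labels, so no spaCy pipeline has to be loaded.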
Images
The following table shows a dataset containing images as base64 data URIs.
The integration adds a new image_visual column containing the rendered image for each image field.
It can create image visuals from file paths, URLs, bucket links, and base64-encoded data URIs.
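The four kinds of image source listed above can be told apart with the standard library alone. The sketch below is an illustration of that idea, not the integration's own logic. The function names, the set of bucket schemes (`s3`, `gs`), and the sample values are assumptions.

```python
import base64
from urllib.parse import urlparse


def classify_image_source(value):
    """Return 'data_uri', 'url', 'bucket', or 'file_path' for an image field."""
    if value.startswith("data:"):
        return "data_uri"
    scheme = urlparse(value).scheme
    if scheme in ("http", "https"):
        return "url"
    if scheme in ("s3", "gs"):  # assumed bucket schemes
        return "bucket"
    return "file_path"


def decode_data_uri(value):
    """Return the raw bytes stored in a base64-encoded data URI."""
    header, _, payload = value.partition(",")
    if not header.startswith("data:") or ";base64" not in header:
        raise ValueError("not a base64 data URI")
    return base64.b64decode(payload)


# Hypothetical field value: a data URI wrapping four placeholder bytes.
sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
kind = classify_image_source(sample)
raw = decode_data_uri(sample)
```

Once decoded, the bytes can be handed to an image library to build the visual. File paths and URLs would instead be opened or fetched first.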
Summary
Hopefully, this short walkthrough gives you a good starting point for visualizing Prodigy datasets with W&B. In the future, we plan to add more visualizations, such as audio, bounding boxes, and masks, and to expand the number of fields that can be converted to HTML and images. We'd love to see any experiments you're excited about and hear any feedback you have. Thanks!