Visualize Audio Data in W&B

Interactively explore ML data and predictions in the audio domain. Made by Stacey Svetlichnaya using Weights & Biases
W&B Tables—our latest feature for dataset and prediction visualization—enable interactive exploration and analysis of audio data. In this short example, I render whale song as human music: I synthesize melodies from the vocalizations of whales and other marine mammals as they would sound on a violin, trumpet, etc. I use Differentiable Digital Signal Processing (DDSP) from TensorFlow's Magenta (resources, Colab demo) to generate the music from original recordings in the Watkins Marine Mammal Sound Database.

0. Upload data

For this project, I store my toy dataset in a remote bucket and version it in a reference artifact. The change from logging a regular W&B Artifact is minimal: instead of adding a local path with artifact.add_file([your local file path]), add a remote path (generally a URI) with artifact.add_reference([your remote path]). You can read more about reference artifacts here.
import wandbrun = wandb.init(project="whale-songs", job_type="upload")# path to my remote data directory in Google Cloud Storagebucket = "gs://wandb-artifact-refs-public-test/whalesong"# create a regular artifactdataset_at = wandb.Artifact('sample_songs',type="raw_data")# creates a checksum for each file and adds a reference to the bucket# instead of uploading all of the contentsdataset_at.add_reference(bucket)run.log_artifact(dataset_at)
List of file paths and sizes in this reference bucket. Note that these are merely references to the contents, not actual files stored in W&B, so they are not available for download from this view

1. Visualize your data

In this example, I've manually uploaded some marine mammal vocalizations to a public storage bucket on GCP. The full dataset is available from the Watkins Marine Mammal Sound Database, and you can play the sample songs directly from W&B. With a dataset visualization table, you can see and interact with your data directly: listen to audio samples, play videos, see images, and more. This means you don't need to fill up local storage, wait for files to download, open media in a different app, or navigate multiple windows/browser tabs of file directories.
Interact with this example →
Press play/pause on any song and view additional metadata
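For reference, here is a minimal sketch of how a playable table like this could be logged. The live example stores its table as song_samples in a playable_songs artifact (referenced again in the join section below); the local folder path, the filename convention encoding the species, and the artifact type are my assumptions here, not the exact code behind the demo.

import os
import wandb

run = wandb.init(project="whale-songs", job_type="log_samples")

# hypothetical local copy of the sample songs, e.g. fetched via
# run.use_artifact("sample_songs:latest").download()
songs_dir = "whalesong/samples"

columns = ["song_id", "species", "audio"]
table = wandb.Table(columns=columns)

for filename in sorted(os.listdir(songs_dir)):
    if not filename.endswith(".wav"):
        continue
    # assumed filename convention: [species]_[id].wav
    species, song_id = os.path.splitext(filename)[0].split("_")
    # wrap the file in wandb.Audio so it renders as a playable sample
    table.add_data(song_id, species, wandb.Audio(os.path.join(songs_dir, filename)))

# version the table in an artifact so it appears as a dataset visualization
songs_at = wandb.Artifact("playable_songs", type="raw_data")
songs_at.add(table, "song_samples")
run.log_artifact(songs_at)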

Filter and organize the data table

You can group by any column: say, group by "species" to listen to different samples from the same marine mammal in one row.
Group by "species" (I also removed the id column which isn't relevant to this view)

Download data from the cloud

Of course you can still fetch the files from the reference artifact and use the data locally:
import wandbrun = wandb.init(project="whale-songs", job_type="show_samples")dataset_at = run.use_artifact("sample_songs:latest")songs_dir = dataset_at.download()# all files available locally in songs_dir

P.S. Visualization by reference in beta

Instead of uploading your content to W&B and versioning with regular Artifacts, you can visualize data and predictions via reference paths (URIs) to remote storage using Artifacts by Reference as I do in this example. This mode is currently in beta.

2. Interactively analyze results

Synthesized samples grouped by instrument. Scroll down inside each panel for more.
Beyond raw training data, you may want to visualize training results: model predictions over the course of training, examples generated with different hyperparameters, etc. You can join these to existing data tables to set up powerful interactive visualizations and analysis. In this case, I have synthesized a few renditions of the marine mammal melodies in different human instruments like violin, flute, and tenor sax, via the amazing DDSP library and Colab Notebook from Magenta for timbre transfer (with a WIP W&B Colab here). These synthetic songs are local .wav files created in a Colab or my local dev environment. Each file is associated with the original song_id and the target instrument.

View generated samples

Play some synthesized samples in a live project
Play and pause the songs and optionally download the files
To see and interact with audio files, log them directly into a wandb.Table associated with an artifact. To visualize a piece of media such as an image, video, or song (audio file) in the browser, we need to wrap it in a wandb object of the matching type—in this case, wandb.Audio(). The wandb object takes in a file path to render the contents of the file. Sample code, assuming my songs live in a local folder called whalesong/synth:
import osimport wandbrun = wandb.init(project="whale-songs", job_type="log_synth")# full path to the specific folder of synthetic songs:synth_songs_dir = "whalesong/synth"# track all the files in the specific folder of synth songsdataset_at = wandb.Artifact('synth_songs',type="generated_data")dataset_at.add_dir(synth_songs_dir)# create a table to hold audio samples and metadata in columnscolumns = ["song_id", "song_name", "audio", "instrument"]table = wandb.Table(columns=columns)# iterate over all the songs and add them to the data tablefor synth_song in os.listdir(synth_songs_dir) # song filenames have the form [string id]_[instrument].wav song_name = synth_song.split("/")[-1] song_path = os.path.join(synth_songs_dir, song_name) # create a wandb.Audio object to show the audio file audio = wandb.Audio(song_path, sample_rate=32) # extract instrument from the filename orig_song_id, instrument = song_name.split("_") table.add_data(orig_song_id, song_name, audio, instrument.split(".")[0])# log the table via a new artifactsongs_at = wandb.Artifact("synth_samples", type="synth_ddsp")songs_at.add(table, "synth_song_samples")run.log_artifact(songs_at)

Group by column names to organize

Group by song_id to see all the transformations of a given song in one row (the same melody played on a flute, violin, trumpet, or tenor sax). You can also group by instrument to compare timbre across melodies.
Find the header of the column you'd like to group by, click on the three dot menu on the right of the column name, and select "Group by" from the dropdown. Try it here.
Compare melodies across different instruments/timbres
Compare timbre across different melodies

Compare original and synthetic songs

Live example →
To listen to both song versions side-by-side, I can join the table of original songs to the table of generated songs:
Query across existing tables to create a new wandb.JoinedTable without duplicating data

Join flexibly across data tables

Join across tables you've logged in earlier artifacts to efficiently create new views for analysis—without duplicating your data. I've logged all the information about the original marine mammal songs in a song_samples table of my playable_songs artifact, and about the synthesized songs in a synth_song_samples table of my synth_samples artifact. To compare the original and synthesized versions, I can join these tables on a single key (or a list of two keys) and even change the join type for the sub-tables (inner, outer, etc.) from the browser:
run = wandb.init(project="whale-songs", job_type="explore")# original songs tableorig_songs_at = run.use_artifact('playable_songs:latest') orig_table = orig_songs_at.get("song_samples")# synth songs tablesynth_songs_at = run.use_artifact('synth_samples:latest')synth_table = synth_songs_at.get("synth_song_samples")# join the tables on song_idjoin_table = wandb.JoinedTable(synth_table, orig_table, "song_id")join_at = wandb.Artifact("synth_summary", "analysis")join_at.add(join_table, "synth_explore")run.log_artifact(join_at)

Next steps

This is an early proof of concept to illustrate the power of W&B Dataset and Prediction Visualization for the audio space. I hope to explore timbre transfer, DDSP, and the TensorFlow Magenta toolset in more depth—and more serious applications like identifying and tracking different marine mammal species based on underwater recordings—in future reports.