
Visualize Audio Data in W&B

Interactively explore ML data and predictions in the audio domain
W&B Tables—our latest feature for dataset and prediction visualization—enable interactive exploration and analysis of audio data. In this short example, I render whale song as human music: I synthesize melodies from the vocalizations of whales and other marine mammals as they would sound on a violin, trumpet, etc. I use Differentiable Digital Signal Processing (DDSP) from TensorFlow's Magenta (resources, Colab demo) to generate the music from original recordings in the Watkins Marine Mammal Sound Database.

0. Upload data

For this project, I store my toy dataset in a remote bucket and version it in a reference artifact. The change from logging a regular W&B Artifact is minimal: instead of adding a local path with artifact.add_file([your local file path]), add a remote path (generally a URI) with artifact.add_reference([your remote path]). You can read more about reference artifacts here.
import wandb
run = wandb.init(project="whale-songs", job_type="upload")
# path to my remote data directory in Google Cloud Storage
bucket = "gs://wandb-artifact-refs-public-test/whalesong"
# create a regular artifact
dataset_at = wandb.Artifact('sample_songs', type="raw_data")
# creates a checksum for each file and adds a reference to the bucket
# instead of uploading all of the contents
dataset_at.add_reference(bucket)
run.log_artifact(dataset_at)
List of file paths and sizes in this reference bucket. Note that these are merely references to the contents, not actual files stored in W&B, so they are not available for download from this view.

1. Visualize your data




In this example, I've manually uploaded some marine mammal vocalizations to a public storage bucket on GCP. The full dataset is available from the Watkins Marine Mammal Sound Database, and you can play the sample songs directly from W&B. With a dataset visualization table, you can see and interact with your data directly: listen to audio samples, play videos, see images, and more. This means you don't need to fill up local storage, wait for files to download, open media in a different app, or navigate multiple windows/browser tabs of file directories.
Press play/pause on any song and view additional metadata

Filter and organize the data table

You can group by any column: say, group by "species" to listen to different samples from the same marine mammal in one row.
Group by "species" (I also removed the id column which isn't relevant to this view)

Download data from the cloud

Of course, you can still fetch the files from the reference artifact and use the data locally:
import wandb
run = wandb.init(project="whale-songs", job_type="show_samples")
dataset_at = run.use_artifact("sample_songs:latest")
songs_dir = dataset_at.download()
# all files available locally in songs_dir
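
With the files local, you can log a playable table like the one shown above. Here is a minimal sketch of how the song_samples table in the playable_songs artifact (referenced later in this report) might be built. The job type, the artifact type "explore_data", and the [species]_[id].wav naming scheme are assumptions for illustration; it also assumes the .wav files sit at the top level of the downloaded directory:

import os
import wandb

run = wandb.init(project="whale-songs", job_type="log_samples")

# fetch the reference artifact and pull the files down locally
dataset_at = run.use_artifact("sample_songs:latest")
songs_dir = dataset_at.download()

# table of playable samples plus metadata columns
columns = ["song_id", "species", "audio"]
table = wandb.Table(columns=columns)

for song_file in os.listdir(songs_dir):
    if not song_file.endswith(".wav"):
        continue
    song_path = os.path.join(songs_dir, song_file)
    # hypothetical naming scheme: [species]_[id].wav
    species, song_id = os.path.splitext(song_file)[0].rsplit("_", 1)
    table.add_data(song_id, species, wandb.Audio(song_path))

# log the table under the artifact and table names referenced later in this report
songs_at = wandb.Artifact("playable_songs", type="explore_data")
songs_at.add(table, "song_samples")
run.log_artifact(songs_at)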

P.S. Visualization by reference in beta

Instead of uploading your content to W&B and versioning with regular Artifacts, you can visualize data and predictions via reference paths (URIs) to remote storage using Artifacts by Reference as I do in this example. This mode is currently in beta.

2. Interactively analyze results

Synthesized samples grouped by instrument. Scroll down inside each panel for more.


Beyond raw training data, you may want to visualize training results: model predictions over the course of training, examples generated with different hyperparameters, etc. You can join these to existing data tables to set up powerful interactive visualizations and analysis. In this case, I have synthesized a few renditions of the marine mammal melodies on different human instruments like violin, flute, and tenor sax, via the amazing DDSP library and Colab Notebook from Magenta for timbre transfer (with a WIP W&B Colab here). These synthetic songs are local .wav files created in a Colab or my local dev environment. Each file is associated with the original song_id and the target instrument.

View generated samples

Play and pause the songs and optionally download the files
To see and interact with audio files, log them directly into a wandb.Table associated with an artifact. To visualize a piece of media such as an image, video, or song (audio file) in the browser, we need to wrap it in a wandb object of the matching type—in this case, wandb.Audio(). The wandb object takes in a file path to render the contents of the file. Sample code, assuming my songs live in a local folder called whalesong/synth:
import os
import wandb

run = wandb.init(project="whale-songs", job_type="log_synth")
# full path to the specific folder of synthetic songs:
synth_songs_dir = "whalesong/synth"

# track all the files in the specific folder of synth songs
dataset_at = wandb.Artifact('synth_songs', type="generated_data")
dataset_at.add_dir(synth_songs_dir)

# create a table to hold audio samples and metadata in columns
columns = ["song_id", "song_name", "audio", "instrument"]
table = wandb.Table(columns=columns)

# iterate over all the songs and add them to the data table
for synth_song in os.listdir(synth_songs_dir):
    # song filenames have the form [string id]_[instrument].wav
    song_name = os.path.basename(synth_song)
    song_path = os.path.join(synth_songs_dir, song_name)

    # create a wandb.Audio object to show the audio file
    audio = wandb.Audio(song_path, sample_rate=32)

    # extract the original song id and instrument from the filename
    orig_song_id, instrument = song_name.rsplit("_", 1)
    table.add_data(orig_song_id, song_name, audio, instrument.split(".")[0])

# log the table via a new artifact
songs_at = wandb.Artifact("synth_samples", type="synth_ddsp")
songs_at.add(table, "synth_song_samples")
run.log_artifact(songs_at)

Group by column names to organize

Group by song_id to see all the transformations of a given song in one row (the same melody played on a flute, violin, trumpet, or tenor sax). You can also group by instrument to compare timbre across melodies.
Find the header of the column you'd like to group by, click on the three dot menu on the right of the column name, and select "Group by" from the dropdown. Try it here.
Compare melodies across different instruments/timbres
Compare timbre across different melodies

Compare original and synthetic songs

To listen to both song versions side-by-side, I can join the table of original songs to the table of generated songs:
Query across existing tables to create a new wandb.JoinedTable without duplicating data

Join flexibly across data tables

Join across tables you've logged in earlier artifacts to efficiently create new views for analysis—without duplicating your data. I've logged all the information about the original marine mammal songs in a song_samples table of my playable_songs artifact, and about the synthesized songs in a synth_song_samples table of my synth_samples artifact. To compare the original and synthesized versions, I can join these tables on a single key (or a list of two keys) and even change the join type for the sub-tables (inner, outer, etc.) from the browser:
import wandb

run = wandb.init(project="whale-songs", job_type="explore")

# original songs table
orig_songs_at = run.use_artifact('playable_songs:latest')
orig_table = orig_songs_at.get("song_samples")

# synth songs table
synth_songs_at = run.use_artifact('synth_samples:latest')
synth_table = synth_songs_at.get("synth_song_samples")

# join the tables on song_id
join_table = wandb.JoinedTable(synth_table, orig_table, "song_id")
join_at = wandb.Artifact("synth_summary", type="analysis")
join_at.add(join_table, "synth_explore")
run.log_artifact(join_at)
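
The join key can also be a list of two column names when a single column isn't enough to pair rows. A quick sketch of what that could look like, purely illustrative and assuming both tables also carried a matching species column:

# hypothetical: join on two shared columns instead of one,
# assuming both tables also have a "species" column
join_table = wandb.JoinedTable(synth_table, orig_table, ["song_id", "species"])
join_at = wandb.Artifact("synth_summary_two_keys", type="analysis")
join_at.add(join_table, "synth_explore_two_keys")
run.log_artifact(join_at)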


Next steps

This is an early proof of concept to illustrate the power of W&B Dataset and Prediction Visualization for the audio space. I hope to explore timbre transfer, DDSP, and the TensorFlow Magenta toolset in more depth—and more serious applications like identifying and tracking different marine mammal species based on underwater recordings—in future reports.