Community Spotlight: OpenSoundscape
OpenSoundscape is an open-source repo that aims to make the analysis of bioacoustic data accessible to anyone. Here's what you need to know.
This post is a piece in our series highlighting some of our favorite community repos that have instrumented W&B. If you'd like your repo to be featured, please reach out to editor@wandb.com
Introduction
For many ecologists, whose careers are built on studying the natural world and the organisms that inhabit an environment, collecting data in the field is part of the day-to-day. But it's nearly impossible to take note of every little detail all the time: some observations take a long time to collect, and others are challenging to make in nocturnal settings.
This is all changing now, and rapidly. Access to a whole range of remote sensing technologies has exploded over the past decade. One particular method, bioacoustics, relies on recording and storing the sounds of nature on a massive scale. But how can one decode what nature is saying? That's precisely the impetus behind the project OpenSoundscape, a repo built for analyzing bioacoustic data.

The white-spotted glass frog (Sachatamia albomaculata), seen during fieldwork in Panama - listen here
What is OpenSoundscape?
Today, bioacoustic research can mean collecting upwards of 50,000 hours of audio in a single season. Manually reviewing that much data is time-consuming and resource-intensive. Machine learning could help automate the extraction of useful information from such a large volume of recordings, but many of the people gathering data in the field don't have the background to harness ML in that way.
Here is where OpenSoundscape comes into play. The intention behind the repo is to provide powerful tools for converting audio data into valuable ecological insight. Among the tasks the project automates are species detection and estimating the spatial location of sounds. "The goal of OpenSoundscape is to bridge the gap between users who have audio data and the cutting-edge methods made possible through ML," said Sam Lapp, Conservation Researcher at Kitzes Lab, University of Pittsburgh.
And as with many open-source repos, the motivation for developing OpenSoundscape started on a personal level. "A lot of the packages, structure, and content really came out of our own data analysis needs," said Sam. Once the project took off and Sam's team saw how impactful it was to their work, they knew others studying bioacoustics would benefit too. From there, OpenSoundscape became a community tool for ecologists around the world.

Preparing an automated acoustic recorder for deployment at a stream in Panama
How does OpenSoundscape work?
Surprisingly, one of the most powerful ways to apply ML to audio recognition problems is through image recognition. The approach starts by creating a visual representation of the audio, commonly known as a spectrogram, and then runs image recognition on that spectrogram. Because spectrograms map frequency over time, distinct sound elements in a recording and their harmonic structure are easy to pick out, which makes the representation extremely helpful for species classification.
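To make that idea concrete, here's a minimal Python sketch of the spectrogram step using SciPy and Matplotlib rather than OpenSoundscape's own classes. The recording path is a placeholder, and the clip is assumed to be mono:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import spectrogram

# Load a field recording ("recording.wav" is a placeholder path;
# a mono clip is assumed)
rate, samples = wavfile.read("recording.wav")

# Compute the spectrogram: rows are frequency bins, columns are time steps
freqs, times, power = spectrogram(samples, fs=rate, nperseg=512)

# Log-scale the power so quiet harmonics stay visible, then render the
# result as an image -- the same kind of input a CNN classifier consumes
plt.pcolormesh(times, freqs, 10 * np.log10(power + 1e-10))
plt.xlabel("Time (s)")
plt.ylabel("Frequency (Hz)")
plt.savefig("spectrogram.png")
```

OpenSoundscape wraps this pipeline, from audio loading through spectrogram creation to CNN training, in its own higher-level classes, so in practice users don't need to write this step by hand.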

A spectrogram of a recording of bird songs from these species: Eastern Towhee, Ovenbird, Black-and-white Warbler, and Blue-headed Vireo - listen here
Why does OpenSoundscape use Weights & Biases?
Before W&B, Sam and his team would run into the black-box problem with many of their ML models. This lack of explainability made it challenging to see what changes were needed to improve a model. "We didn't know if we had to adjust the learning rate or preprocessing parameter or a number of other things," said Sam.
By using W&B Tables, the team could easily visualize and query their datasets, giving them deep insight into where a model was struggling. Instead of spending hours debugging, analyzing model performance and identifying problem areas can now happen in real time.
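As a rough illustration of that workflow, the sketch below logs per-clip predictions to a W&B Table so that low-scoring or misclassified clips can be filtered and inspected in the UI. The project name, species, scores, and file paths are all hypothetical, not taken from the Kitzes Lab's actual setup:

```python
import wandb

# Hypothetical evaluation results; in practice these would come from
# running a trained classifier over held-out clips
evaluated_clips = [
    {"image": "clip_001.png", "species": "Eastern Towhee", "score": 0.91, "label": 1},
    {"image": "clip_002.png", "species": "Ovenbird", "score": 0.12, "label": 1},
]

run = wandb.init(project="birdsong-cnn")  # placeholder project name

# One row per clip: the spectrogram image, the model's score, and the
# true label, so problem areas can be queried directly in the W&B UI
table = wandb.Table(columns=["spectrogram", "species", "score", "label"])
for clip in evaluated_clips:
    table.add_data(
        wandb.Image(clip["image"]),
        clip["species"],
        clip["score"],
        clip["label"],
    )

run.log({"eval_predictions": table})
run.finish()
```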
Beyond that, W&B centralized all of the team's training regardless of what system they were using. Whether they were training on the Pittsburgh Supercomputing Center's Bridges-2 cluster at the University of Pittsburgh Center for Research Computing or somewhere else, all of their metrics, datasets, logs, code, and system stats lived in a single location.

Using W&B dashboards for training a multi-species CNN for birdsongs
What's next for OpenSoundscape?
To continue the mission of giving ecologists the power to analyze their data, the OpenSoundscape team wants to add more functionality. Localizing the calls of different species in space is one of their next tasks. And it doesn't stop there: the team also wants to provide more interpretive tools for understanding model results in a digestible way. "One of the main challenges of a new community like ecology adopting ML is learning to interpret what it means to have a score coming out of the CNN or distribution of scores," said Sam.
How can you get involved with OpenSoundscape?
If you want to learn more about OpenSoundscape, check out their documentation. To try out the project for yourself, go to their GitHub page or follow the team on Twitter @KitzesLab to get all the latest details.