Musiio Interview Transcript
Created on January 19 | Last edited on January 20
This is supplemental material for this post. Transcription has been lightly cleaned up for readability.
Scott Condron, Weights & Biases:
So maybe you can give us a bit of context about how ML is used in Musiio and maybe how people can leverage audio data generally.
Hazel Savage, Musiio:
Yeah, absolutely. So when we started Musiio, it's worth mentioning, I was the business part of the partnership and my co-founder was the CTO. And really, the two of us together have my expertise in the audio industry and his technical expertise as the AI developer. And I think why that's important in audio is you often see a lot of people that are a solution looking for a problem, you know, a really amazing academic achievement, but then no real-world application that can either be monetized or really add anything to the benefit of the industry. So Musiio was sort of directly set up to challenge that. And it kind of goes all the way back to when I first started in music at the record store, you know, putting the new CDs on the shelves Sunday night ready for Monday morning, five or six CD singles a week, very manageable, a manageable data set we might say. But you know, you jump forward to 100,000 songs uploaded a day, and even if you hired a thousand people, you couldn't listen to all of them. So it becomes a problem of scale and it becomes a problem of volume. And I think that's where AI can be most assistive in audio. You take something that is no longer humanly possible, or desirable in some cases, because manually listening to a lot of music, not through choice but almost as a data task, is also quite taxing. And taking those things and being able to add a level of automation, that's where I think the value for AI in music is.
Scott:
So when you see data explode generally, that's when you think, okay, when it's no longer feasible for humans to be doing the task manually, that's where AI can really assist.
Hazel:
I think it is, yeah. And I think it probably applies to a couple of different industries, because if the answer becomes, oh, well, we just don't know what to do, therefore we do nothing, therein lies some opportunity. And I think especially with audio, you saw, you know, musicians are very creative. They were coming up with ways to utilize new technology, like mobile phones as recording equipment. Even, isn't it Taylor Swift, she sort of jots down her ideas in voice notes. So the creative people always find their way through. But what I'd noticed was that the industry was a long way behind. And that was the opportunity gap to me.
Scott:
So when you say that there's a solution in search of a problem, how do you see something like generative AI in that world? Do you see that as an opportunity or do you see that also as a solution in search of a problem?
Hazel:
I always say I'm a bit of a skeptic, but with caveats. You know, I think I've seen a lot of very impressive AI models that write music, and there's been some cool kind of parlor trick type stuff, where it's like, oh, can you guess, is it human or is it generated by AI? And I think it follows the visual art world in a similar way. You know, we all love to know that an AI-generated artwork sold at Christie's, and how much did it sell for? We're all obsessed. We want to watch it. But, you know, where is that painting now? Where is that music now?
Scott:
You know, it's like it doesn't last. It's a hype at the moment because that's novel.
Hazel:
Exactly. Novelty always has an interest, but it's not necessarily a sustainable business. So I've always been a little bit skeptical, but with the idea that I want to retain the right to change my mind. You know, I think if somebody can really solve some of the core issues around not feeling like it's taking something away from human beings, not feeling like it's a substandard version of what a human would do. And also, how does the technology become more assistive rather than a sort of takeover? You know, even at Musiio, the UK Daily Mirror ran an article about us. And the headline was something like robots write music. And I was like, okay, well, A, that isn't what we do at all, but also B, I'm excited. I was like, I can't wait to go to the office and find these robots. What it actually was, was we'd analysed a lot of music on TikTok. And we'd said, you know, the average TikTok song is 30 seconds in length, because of the clipping. It's these genres more commonly, it's more often instrumental than vocal. It tends to be in this key, or, you know, it has this energy level. And we analysed a large set of that data and found the commonalities. And then we had one of the interns on our music team write something that also fit that description. And that was just kind of a fun puff piece for us to go, oh, you know, can we take those insights and have a real musician turn them into something? And of course, one of the tabloids said that we had robots, and I was just quite sad that we didn't.
Scott:
And I guess, putting your business hat on, the world has changed a bit since four years ago, when you started Musiio, and AI has progressed a lot. If you were to start a new thing with what exists now, where do you think the opportunity is, especially when it comes to music tech?
Hazel:
That is a good one. I think there is more opportunity in collaborative generative AI than there was four and a half years ago when I started Musiio. I do think there are probably more applications, you know, sort of newer and smarter ways to do things with a green and more sustainable cloud. And also, you know, ultimately, what we do at Musiio is still visual ML. And the idea that, you know, my co-founder was experimenting with actual audio ML. So instead of transferring audio into visual and then doing the work from there, what happens if we use the raw audio? And I think there could be some interesting insights to come out of that. It's much harder to scale because the files are much harder to deal with. But I think there's going to be some sort of interesting developments. I mean, even there's another great audio AI company called AudioShake. Full disclosure, I'm an investor. And, you know, they seem to have solved the source separation challenge with AI. And I've been tracking that situation for years. And I've tried every tech there is going and none of them are good enough. There's too much bleed, you know, by the time you remove the vocal and it's on its own, it doesn't sound anything like the original. And I got convinced to give AudioShake a shot because someone told me that it sounds like they've solved it. And I tried it and it genuinely sounds like they have. So we see these leaps forward every few years. And I like the slightly unpredictable nature of that.
Scott:
I guess then leading on from that question, if you consider that basically solved, and music understanding, let's say, largely solved with Musiio, what unsolved problems are there that you're still tracking now? Like maybe if a company came out in four more years' time, you'd be an investor in that?
Hazel:
You know, I was having a chat with someone about this: I think one of the biggest unsolved challenges in the music industry is the way that we remunerate and pay musicians. And I think that, you know, licensing, royalties, how it's managed, it's complicated to the point that I don't have a great understanding of it, and I've worked in this industry my whole career. So I think there's opportunity there. But I'm also not sure that it necessarily has to be a tech-driven solution. And I'm also not sure it needs to be a for-profit solution. And there are some really interesting companies working in this space, like Verified Media. And so, you know, that stays unsolved until someone figures out how we just make it super easy to get the right billions in the right hands. And I think there are some interesting things coming up. SoundCloud have a user-centric model of payments, which I think is really interesting, because I think if you were to ask the average person on the street, how do musicians get paid for their streaming, most people would describe a user-centric model to you. You know, if I spend 10 pounds a month and I just listen to your music, you get my 10 pounds, as opposed to you probably getting maybe a penny because my 10 pounds goes into the pot. And, you know, most of it goes to the people who've been listened to the most.
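To make the two payout models Hazel contrasts concrete, here is a minimal sketch in Python. The function names, listener names, and play counts are hypothetical, and the platform's own cut is ignored for simplicity; this is not SoundCloud's actual accounting, only an illustration of pooled ("pro-rata") versus user-centric splitting.

```python
from collections import defaultdict

def pro_rata(subscriptions, plays):
    """Pool all subscription money, then split it by share of total plays."""
    pool = sum(subscriptions.values())
    totals = defaultdict(int)
    for user_plays in plays.values():
        for artist, count in user_plays.items():
            totals[artist] += count
    all_plays = sum(totals.values())
    if all_plays == 0:
        return {}
    return {artist: pool * count / all_plays for artist, count in totals.items()}

def user_centric(subscriptions, plays):
    """Split each listener's own fee among only the artists they played."""
    payouts = defaultdict(float)
    for user, user_plays in plays.items():
        user_total = sum(user_plays.values())
        if user_total == 0:
            continue  # a listener with no plays contributes nothing here
        for artist, count in user_plays.items():
            payouts[artist] += subscriptions[user] * count / user_total
    return dict(payouts)

# Hypothetical data: two listeners paying 10 pounds each per month.
subscriptions = {"fan": 10.0, "heavy_listener": 10.0}
plays = {
    "fan": {"indie_artist": 10},             # only ever plays one small artist
    "heavy_listener": {"chart_artist": 990},
}

pooled = pro_rata(subscriptions, plays)
per_user = user_centric(subscriptions, plays)
print(pooled)
print(per_user)
```

With these made-up numbers the small artist receives the fan's full 10 pounds under the user-centric split, but only a small fraction of the pot under the pooled split, because the heavy listener's plays dominate the total.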
Scott:
Cool. So actually, maybe that's a good time to transition to SoundCloud. As the VP of music intelligence at SoundCloud, what new, interesting problems does SoundCloud face when it comes to understanding their catalog, and which of those has Musiio solved for them? But also, maybe you could speak to how it changes when you now have 300 million songs and all this user data as well that you can leverage?
Hazel:
The great thing is SoundCloud were a client before they acquired Musiio. So we already had that sort of great working relationship, and it was almost like an extended due diligence, and they knew that our tech worked. But really, you know, the challenge that they have, which I think is an opportunity as well, is that they probably have the biggest catalog of music in the world. So it's getting up to like 350 million songs, whereas even if you look at the main DSPs, your Amazon, your Apple, your Spotify, they all say about 90 million. But because SoundCloud has those user-generated submissions, you know, I haven't blown my tosh onto there if anyone wanted to listen. So, you know, what they have is this huge volume. And what AI can do better than most is solve the volume challenge. Back to the original idea, which is, you know, you go from five or six songs a week to 100,000 a day. What do you do when, you know, I remember when I was at Shazam 15 years ago, we used to brag that we could recognize 14 million tracks. Well, that would be a laughably small amount these days. And that number is accelerating sort of exponentially. So one of the best benefits of Musiio is to be able to handle that scale. And then once you can handle it, what you do with it then. So, you know, I always look back to some of the great examples of artists who got their start on SoundCloud. You know, Billie Eilish, technically. Lewis Capaldi was just uploading music there.
Scott:
Drill rappers.
Hazel:
Yeah, a lot of drill rappers. A lot of rappers coming out of the Middle East as well. Yeah, there's a lot of really cool underground scenes. So that's what I'm excited to dig in on next year with the data. But if you don't know you have it, you might as well not have it. So what I say is, you know, when I sold my company to SoundCloud, one of the reasons was because they have most of the music in the world before anyone else has it. That's the opportunity. But it's only an opportunity if you know you have it. You know, Lewis Capaldi was uploading music to SoundCloud for years. I think it was his manager that ultimately discovered him; he was on SoundCloud just looking for anything that had like less than 20 plays, and spent like seven months just clicking around and found him, reached out, connected. Now we have Lewis Capaldi. But that was sat there for seven months. That raw talent. Not that nobody cared, but that nobody could find him. How do we do that? That's the challenge.
Scott:
Yeah. I mean, I think I know SoundCloud's next advertisement campaign and it should be that. They should be leading with it. Cool. So I guess leading on a bit from there, that's like a search problem, where you might be searching from SoundCloud's perspective, but also maybe from the user's perspective. Like, how can you connect a user with the exact type of music they want? Maybe you can speak a little bit about that. How can Musiio help the user experience on SoundCloud, just for your average listener?
Hazel:
Well, so I'm working on a few things in the background at the minute. I'm not sure which of them I'm able to talk about.
Scott:
New ML stuff?
Hazel:
Well, so there's an element of new ML to it, but it's also new applications of ML. So as you sort of correctly identified before, and just to finish out the last story, it's not just the challenge of search and finding the music. The challenge is when your data sets are as large as they are at SoundCloud. And people are often surprised about this: at Musiio, AI is half the challenge, the other half is scale. So what you really need are engineers who understand how to manage incredibly large volumes of data and backend engineering, plus the AI models. We had a competitor once, similar to our tech, I won't name them. But at Musiio, we can run a search: drop in an audio file, we'll fingerprint it, we'll search a million tracks and we can return the results in milliseconds. And the only delay is the web speed for uploading the audio file, but the actual search is milliseconds. And we had a competitor that may or may not have had a similar tech, but it took them seven minutes. Now no user is going to sit in front of the interface for seven minutes, just like no customer is going to sit and wait seven minutes for a playlist to load when you get in a car. And I think of that because usually when you want music, especially if you're driving, you get in, you just want to switch it on, a couple of songs, go. So it's really about building the scale and getting the right music in front of the right people. So there are playlisting challenges, there are search and recommendation challenges, and then there are talent identification challenges.
Scott:
Interesting. Okay, so the recommendation stuff, maybe how has that evolved over time and how has Musiio made it better?
Hazel:
Well, so, it's still very new, I will caveat that. And what I will say as well is, you know, the sort of predominant thought even from outside of SoundCloud is that SoundCloud performs exceptionally well on a few communities. Like if you're a DJ, if you're super into EDM in the UK, or if you're super into American hip hop, and you get yourself into that bubble, that community, then literally it feels like SoundCloud knows you better than you know yourself. Right. But if you are in a sort of sub-community, if you're into metal, or if you're into indie rock, or if you're into country and western, it doesn't handle those genres as well as it handles the ones it's most known for. And that's a data issue again.
Scott:
It's a bias problem.
Hazel:
Yeah, it's a volume issue in the data. We know enough about the EDM and the hip hop scenes, and we know enough about those users from data and behavior, that we know what they're going to do and what they're going to like. So this is where AI and ML can come in, because you can replicate that feeling of expertise, and you can almost solve the cold start issue of what you do when you do have a group of country and western fans, and you do have a body of country and western music. But if you're only using collaborative filtering, there's just no way that you can create that experience.
Scott:
Cool. I guess, do you use super users? I've heard of a company somewhere, maybe a competitor, that would find these users who know artists are going to go big before they go big. How do you leverage those types of users? Or do you? Or is it mostly about the actual audio itself?
Hazel:
Well, at Musiio, it's specifically about the audio. Because what I found is, there were already a ton of other companies doing that other part really well. You've got your Chartmetrics, you've got your tastemakers. Love Chartmetric as well. But no one else was using only the signals from the audio. But where I think it's really powerful is where you combine those two things. So having, you know, the collaborative filtering, understanding what a user does, but then also understanding at its core the data. It gives you the advantage over both.
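One way to picture combining the two signals is a blended score that leans on audio similarity when a track has little listening history and shifts toward collaborative filtering as plays accumulate. The sketch below is an assumption for illustration, not Musiio's or SoundCloud's method: the feature vectors, the `hybrid_score` function, and the smoothing constant `k` are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def hybrid_score(audio_sim, collab_score, n_interactions, k=50):
    """Blend audio and collaborative signals (k is a hypothetical constant).

    With zero interactions the weight on the collaborative score is 0, so a
    brand-new upload is ranked purely on how it sounds (the cold start case).
    """
    weight = n_interactions / (n_interactions + k)
    return weight * collab_score + (1 - weight) * audio_sim

# Hypothetical audio feature vectors (e.g. energy, vocal presence, tempo).
liked_track = [0.9, 0.1, 0.3]
new_upload = [0.8, 0.2, 0.3]

audio_sim = cosine_similarity(liked_track, new_upload)
cold = hybrid_score(audio_sim, collab_score=0.0, n_interactions=0)
warm = hybrid_score(audio_sim, collab_score=0.4, n_interactions=950)
print(audio_sim, cold, warm)
```

The design choice here is only that the blend weight depends on how much behavioral data exists for the track; any real system would tune or learn that trade-off rather than fix it by hand.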
Scott:
Okay, so let's round it out. What are the exciting challenges that you're looking forward to seeing solved in the next year, now that you've joined SoundCloud?
Hazel:
Well, you know, for me, I always said that Musiio is a third of a billion dollar company. I think to be a big, holistically successful music company in 2022, 2023, you need three things. You need legal access to the data, the music, and a lot of it. You need the technology to know what you have, that's Musiio. And then you need the third part, which is like a label or a publisher, the ability to monetize once you know what you have. And so my idea for Musiio was always that if we're just that middle part, we need either a partner that has the other two, or someone needs to buy us who has those other two. Because if we create that full value chain, sky's the limit.
Scott:
Great. Well, I think we can end now. Thank you so much. Yeah, you're a seasoned, seasoned professional.