AI in Music: The Google vs. UMG Deal
One of the things I love about “generative AI”, a new breed of AI that produces pseudo-novel content, is its potential to automate a lot of tedious work. When blogging about a subject, one necessary but frankly not-too-fun task is summarizing the news, and one of the most frightening things about language models like Google’s “Bard” is just how good they actually are at summarizing content. Rather than write an extensive background section on this topic, I asked Bard to summarize the “Google and UMG negotiations around AI in music”. Here are the results:
From Google Bard: “Google and Universal Music Group (UMG) are in talks to license artists' melodies and voices for artificial intelligence (AI)-generated songs, according to a report by the Financial Times. The goal of the talks is to develop a tool that would allow fans to create AI-generated tracks using artists' likenesses in a legal way.
The negotiations come amid growing concerns about the use of AI to create deepfakes, which are audio or video recordings that have been manipulated to make it appear as if someone is saying or doing something they never said or did. In the music industry, deepfakes have been used to create songs that feature the voices of deceased artists or artists who have not given their consent.
The deal between Google and UMG would address some of the concerns about deepfakes by requiring artists to give their consent before their voices and melodies can be used in AI-generated songs. It would also ensure that copyright holders are compensated for the use of their music.
The negotiations are still in early stages, and it is not yet clear when a deal might be reached. However, if the deal is successful, it could pave the way for a new era of AI-generated music that is both legal and ethical.”
Now, we will talk much more about language models in future posts, but if you are at all familiar with them, then you have likely encountered the concept of “hallucination”, where the language model comes up with false yet plausible-sounding replies to prompts. Because of this phenomenon, it is important to fact-check language model output. Accordingly, I asked Bard to produce references for its answer. Here is the list if you would like to do your own homework (a short sketch of scripting this summarize-then-verify loop follows the references):
From Google Bard:
Google and Universal Music in talks over deal to combat AI deep fakes: Report: https://cointelegraph.com/news/universal-music-google-negotiate-ai-deep-fakes
Google And Universal Music Group Negotiating AI-Generated Music Tool, Report Says: https://www.forbes.com/sites/tylerroush/2023/08/08/google-and-universal-music-group-negotiating-ai-generated-music-tool-report-says/
Google and Universal Music working on licensing voices for AI-generated songs: https://www.theguardian.com/technology/2023/aug/09/google-and-universal-music-working-on-licensing-voices-for-ai-generated-songs
Spotify Removes 'Tens Of Thousands' Of AI-Generated Songs: Here's Why: https://www.forbes.com/sites/ariannajohnson/2023/05/09/spotify-removes-tens-of-thousands-of-ai-generated-songs-heres-why/
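As promised, here is what that summarize-then-verify loop might look like in code. This is a minimal sketch, assuming Google’s `google-generativeai` Python SDK; the model name, prompts, and API key below are placeholders of my own, not the exact system behind Bard’s chat interface.

```python
# A minimal summarize-then-verify sketch using Google's generative AI SDK.
# Assumptions: `pip install google-generativeai`, a valid API key, and a
# PaLM-family text model; none of this is the exact setup behind Bard.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

MODEL = "models/text-bison-001"  # an assumed PaLM-family text model

# Step 1: ask the model for a summary of the news topic.
summary = genai.generate_text(
    model=MODEL,
    prompt="Summarize the Google and UMG negotiations around AI in music.",
).result

# Step 2: because language models hallucinate, ask for sources and then
# check each one by hand before trusting the summary.
references = genai.generate_text(
    model=MODEL,
    prompt=f"List news articles (title and URL) that support this summary:\n\n{summary}",
).result

print(summary)
print(references)
```

The key design point is that step 2 produces leads, not proof: the model can hallucinate citations just as easily as facts, so the URLs still need to be opened and read.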
Professional Opinion
As someone who works in the AI research community and is also a practicing musician, I find the intersection of music and artificial intelligence deeply interesting. As the ability of generative modeling improves, there will no doubt be increasing use of generative models to create music. We have already seen a similar trend in the image space. Tools such as Midjourney, which use a technique called diffusion (we will cover diffusion models in the future) to generate images, have quickly become a way to generate simple graphics, and in the case of some overly ambitious and misguided studios, they have even been used to create poor CGI (looking at you, Disney).
Audio, and music in particular, are likely headed in a similar direction, though not an identical one. As many people examining the problem of copyright in training data have pointed out, one of the key factors giving music a huge leg up on the visual arts is the long-standing, robust copyright-protection machinery the music industry has built. Graphic artists, painters, illustrators, and other visual artists have never had the benefit of the music industry’s ecosystem, in which large record labels with extensive resources and hordes of elite lawyers ensure that artists do not have their work plagiarized.
The metaphorical muscle of the music industry has been flexed before, sometimes to such ridiculous extents that music lawsuits become quite famous. Think of, say, Vanilla Ice v. Queen and David Bowie (1990) or Robin Thicke and Pharrell Williams v. the Marvin Gaye estate (decided 2015). The songs in question were “Ice Ice Baby” and “Blurred Lines”, respectively. In music, there is a long-standing body of legal precedent and rules around what is and is not eligible for reuse, or sampling. Accordingly, with the emergence of generative AI, a very well-equipped music industry is prepared and eager to negotiate with the biggest players in tech, and it brings quite a lot of leverage to the table. Without legal access to the massive catalogs that the major music labels control, it may be decades before the tech industry can produce high-quality AI music generation; with legal access to the data, we may see it within a year or so.
In the next post, we will discuss the possibilities of high-quality music generation given access to large, high-quality datasets. Until then, here is a recent paper from Meta AI Research reporting on their work on music generation:
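That paper is, in all likelihood, Meta’s MusicGen work (“Simple and Controllable Music Generation”), released alongside their open-source Audiocraft library. As a teaser for the next post, here is a minimal sketch of generating music from a text prompt with a small pretrained checkpoint; it assumes `pip install audiocraft`, a working PyTorch install, and that the checkpoint name below is still current.

```python
# Minimal text-to-music sketch with Meta's open-source Audiocraft library,
# which implements the MusicGen models. Assumes `pip install audiocraft`
# and a working PyTorch setup; checkpoint names are as published in 2023.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

# Load a small pretrained checkpoint (medium and large variants also exist).
model = MusicGen.get_pretrained("facebook/musicgen-small")
model.set_generation_params(duration=8)  # seconds of audio per sample

# Generate audio conditioned on free-text descriptions.
descriptions = ["lo-fi hip hop beat with warm piano chords"]
wav = model.generate(descriptions)  # tensor: [batch, channels, samples]

# Write each sample to disk with loudness normalization.
for i, one_wav in enumerate(wav):
    audio_write(f"musicgen_sample_{i}", one_wav.cpu(), model.sample_rate,
                strategy="loudness")
```

Even the small checkpoint makes the point of this post concrete: the quality of what comes out tracks the data that went in, which is exactly why the major labels’ catalogs are such valuable bargaining chips.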