
The RIAA’s lawsuit against generative music startups will be the bloodbath AI needs | TechCrunch

Like many AI companies, Udio and Suno relied on massive piracy to create their generative AI models. They have all but admitted as much, even before the music industry’s new lawsuits against them go before a judge. If the case does go before a jury, the trial could be both a damaging exposé and a highly useful precedent for similarly unethical AI companies facing legal threats of their own.

The cases were filed with great fanfare on Monday by the Recording Industry Association of America, which has haunted digital media for decades and has now put us all in the uncomfortable position of rooting for the RIAA. I’ve received nasty letters from them myself! The case is simply that clear-cut.

The gist of both lawsuits, which are very similar in subject matter, is that Suno and Udio (strictly speaking, Uncharted Labs, doing business as Udio) indiscriminately plundered nearly the entire history of recorded music to create datasets that they used to train music-generating AI.

And here let us quickly note that these AIs do not so much “generate” as match the user’s prompt to patterns in their training data and then attempt to complete that pattern. In a sense, all these models do is perform covers or mashups of the songs they have absorbed.

That Suno and Udio have ingested the aforementioned data is, for all intents and purposes (including legal ones), indisputably true. The companies’ leadership and investors have been unwisely talkative about the copyright challenges of the space.

They have admitted that the only way to create a good music-generation model is to ingest a large amount of high-quality music, much of which will be copyrighted. It is simply a necessary step in creating this type of machine learning model.

Then they admitted that they had done so without the permission of the copyright owners. As one investor told Rolling Stone’s Brian Hiatt just a few months ago:

Honestly, if we had a deal with the label when we started this company, I probably wouldn’t have invested in it. I think they needed to create this product without any hassles.

Tell me you stole a century’s worth of music without telling me you stole a century’s worth of music, got it. To be clear, by “hassles” he means copyright law.

Finally, the companies told the RIAA’s lawyers that they believe ingesting all this media falls under the fair use doctrine – which, fundamentally, only comes into play for unauthorized use of a work. Now, fair use is admittedly a complex and nebulous concept, both in theory and in practice. But a company with $100 million in its pocket scraping essentially every song ever recorded so it can replicate them en masse and sell the results: I’m not a lawyer, but that seems to sit somewhat outside the intended safe harbor of, say, a seventh grader using a Pearl Jam song in the background of their video on global warming.

To put it bluntly, it seems these companies have their work cut out for them. They clearly hoped they could follow OpenAI’s playbook: quietly use copyrighted works, then deploy evasive language and misdirection to hold off their less well-funded critics, such as authors and journalists. If, by the time the AI companies’ misconduct is exposed, they are the only option for distribution, it no longer matters.

In other words: deny, deflect, delay. Ideally you keep this up until the tables turn and you strike deals with your critics – for LLMs, that means news outlets and the like; in this case it would be the record labels, which the music generators clearly hope to eventually approach from a position of power. “Sure, we stole your stuff, but it’s a big business now; wouldn’t you rather play with us than against us?” It’s a common tactic in Silicon Valley, and a winning one, because it mainly just costs money.

But that’s hard to pull off when there is a smoking gun in your hand. And unfortunately for Udio and Suno, the RIAA included a few thousand smoking guns in the lawsuit: songs it owns that are clearly being regurgitated by the music models. Whether it’s the Jackson 5 or Maroon 5, the “generated” songs are lightly distorted versions of the originals – something that would be impossible if the originals were not included in the training data.

The nature of LLMs – in particular, their tendency to hallucinate and lose the plot the longer they write – means they cannot, for example, regurgitate entire books. That has arguably blunted the lawsuit brought by authors against OpenAI, because the latter can plausibly claim that the snippets its model quotes were taken from reviews, first pages available online, and so on. (The latest goalpost move is that they did use copyrighted works at first but have since stopped, which is funny, like saying you only juiced the oranges once but have since stopped.)

What you can’t claim is that your music generator heard only a few bars of “Great Balls of Fire” and somehow managed to reproduce the rest word for word and melody for melody. Any judge or jury would laugh in your face, and with any luck a courtroom artist would get to sketch the moment.

This is not only intuitively obvious but also legally significant, because it is clear that the models are recreating entire works – sometimes poorly, but entire songs nonetheless. That lets the RIAA claim that Udio and Suno are doing real and major harm to the business of the copyright holders and artists, which in turn lets it ask the judge to shut down the AI companies’ entire operations with an injunction at the very start of the case.

Are the opening paragraphs of your book coming out of an LLM? That’s an intellectual issue worth discussing at length. A dollar-store “Call Me Maybe” generated on demand? Shut it down. (I’m not saying that’s right, but it’s likely.)

The companies’ expected response is that the system is not intended to replicate copyrighted works: a desperate, naked attempt to offload liability onto users under the Section 230 safe harbor – that is, the same way Instagram isn’t liable if you use a copyrighted song to back your Reel. Here, the argument seems unlikely to gain traction, partly because of the aforementioned admissions that the company itself ignored copyright from the start.

What will be the outcome of these lawsuits? As with all things AI, it is impossible to say in advance, since there is little precedent or applicable, settled doctrine.

My prediction, again offered without any real legal expertise, is that the companies will be forced to expose their training data and methods, since those are of obvious evidentiary interest. Given that evidence, their clear misuse of copyrighted material and (quite likely) communications showing they knew they were breaking the law, Udio and Suno will probably try to settle or face an early judgment against them. They will also be forced to cease whatever operations rely on the piracy-built models. At least one of the two will attempt to carry on using legal (or at least legally adjacent) sources of music, but the resulting model will be a huge step down in quality, and users will flee.

The investors? Ideally, they will lose their shirts, having bet on something that was clearly and provably illegal and unethical, and not just in the eyes of writers’ associations but according to the legal minds at the notoriously and ruthlessly litigious RIAA. Whether the damages amount to the cash on hand or the promised financing is anyone’s guess.

The consequences could be far-reaching: if investors in a hot new generative media startup suddenly see a hundred million dollars vaporized because of the fundamental nature of generative media, a different level of diligence suddenly seems appropriate. Companies will learn from the trial (if there is one) or from settlement documents and the like what could have been said, or perhaps more importantly what should not have been said, to avoid liability and keep copyright holders guessing.

And while this particular lawsuit seems like a foregone conclusion, not every AI company leaves its fingerprints around the crime scene quite so liberally. It will serve not as a playbook for suing or squeezing settlements out of other generative AI companies, but as an object lesson in hubris. It’s good to have one of those from time to time, even if the teacher is the RIAA.
