Sarah Silverman and other authors are suing OpenAI and Meta for copyright infringement, alleging that they're training their LLMs on books via Library Genesis and Z-Library

Arthur Besse@lemmy.ml · 2 years ago


Dominic@beehaw.org · 2 years ago

There are a few reasons why music models haven’t exploded the way that large language models and generative image models have. Maybe the strength of the copyright holders is part of it, but I think that the technical issues are a bigger obstacle right now.

Generative models are extremely data-inefficient. The Internet is loaded with text and images, but there isn’t as much music.
Language and vision are the two problems that machine learning researchers have been obsessed with for decades. They built up “good” datasets for these problems and “good” benchmarks for models. They also did a lot of work on figuring out how to encode these types of data to make them easier for machine learning models to work with. (I’m particularly thinking of all of the research done on word embeddings, which are still pivotal to large language models.)

Even so, there are fairly impressive models for generative music.


Sarah Silverman Sues ChatGPT Creator for Copyright Infringement