PC Gamer
Harvey Randall

Stand-up comedian and actor Sarah Silverman joins trio suing OpenAI and Meta over claims their AI models 'ingested and used' copyrighted work without permission

OpenAI logo on some cash.

On July 7 Sarah Silverman, a stand-up comedian—also known for her acting work as the voice of Vanellope in the Wreck-It Ralph movies—joined authors Christopher Golden and Richard Kadrey in twin lawsuits against OpenAI and Meta.

As reported by The Verge earlier this week, the suit concerns Silverman's written work, with all three claiming that both ChatGPT and LLaMA (Meta's own large language model program) had been trained on data harvested from "shadow library" sites such as "Bibliotik, Library Genesis, Z-Library, and others."

The OpenAI suit offers a trio of exhibits, which demonstrate the model's ability to summarise copyrighted books with very few mistakes. These include The Bedwetter, a memoir by Silverman, Ararat, a horror-thriller by Christopher Golden, and Sandman Slim, a supernatural fantasy noir thriller by Richard Kadrey. 

In short, they'd been caught in the program's net at some point, which the suit claims is an infringement of copyright: "Defendants, by and through the use of ChatGPT, benefit commercially and profit richly from the use of Plaintiffs’ and Class members’ copyrighted materials."

Meanwhile, the suit against Meta alleges that those same books, as well as several others, were found in the datasets used to train LLaMA. The complaint mentions The Pile in particular, a dataset created by a research group named EleutherAI.

The suit quotes EleutherAI's own description of its dataset as using Bibliotik, one of several "shadow libraries" the suit condemns: "Bibliotik consists of a mix of fiction and nonfiction books [...] We included Bibliotik because books are invaluable for long-range context modelling research and coherent storytelling."

The suit then explains: "These shadow libraries have long been of interest to the AI-training community because of the large quantity of copyrighted material they host. For that reason, these shadow libraries are also flagrantly illegal."

The authors' representatives, lawyers Matthew Butterick and Joseph Saveri, write on their litigation website: "Much of the material in the training datasets used by OpenAI and Meta comes from copyrighted works—including books written by Plaintiffs—that were copied by OpenAI and Meta without consent, without credit, and without compensation."

These three authors join a growing furore around the use of AI. Earlier this year, a class-action lawsuit was filed against Stability AI, Midjourney, and DeviantArt. Just this month, I've reported on the growing concerns in the voice acting and modding communities about the use of AI in pornographic voice mods, as well as Unity's unpopular new AI tools.

While this tech might have a use in game development, it's clear that the law's scrambling to catch up. It'll be interesting to see the result of this suit—as well as the others that are sure to follow—as AI becomes more and more of a large language elephant in the room across multiple industries.
