International Business Times UK
Chelsie Napiza

Researchers Challenge OpenAI Defence After Claiming ChatGPT Can Output Near-Verbatim Copies of Published Books

A new peer-reviewed study claims that finetuning GPT-4o, Google's Gemini-2.5-Pro, and DeepSeek-V3.1 allows researchers to extract up to 90% of copyrighted books in near-verbatim form. The findings directly challenge the legal defences that OpenAI and other AI companies have used in dozens of active copyright lawsuits.

The paper, titled 'Alignment Whack-a-Mole: Finetuning Activates Verbatim Recall of Copyrighted Books in Large Language Models,' was submitted to arXiv on 21 March 2026 and revised on 25 March 2026. Its authors are Xinyue Liu, Niloofar Mireshghallah, Jane C. Ginsburg and Tuhin Chakrabarty, researchers whose combined backgrounds span computer science, machine learning and copyright law.

The timing is significant. As Norton Rose Fulbright noted in a 2026 litigation update, OpenAI has consistently asserted in court filings that its outputs do not substantially reproduce plaintiffs' works, and has argued that safety alignment measures prevent verbatim regurgitation. The new research alleges that such protections can be bypassed with minimal effort.

Finetuning as a Back Door Through Alignment

The research team's method is both technically elegant and commercially plausible. Rather than prompting a model to reproduce a book directly (something alignment guardrails are specifically designed to block), the researchers finetuned each model to expand plot summaries into full prose.
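To make the setup concrete, here is a minimal sketch of what a summary-to-prose finetuning record could look like. This is illustrative only: the field names follow the common chat-finetuning JSONL convention, not the paper's actual data format, and the summary and prose below are invented placeholders rather than real book text.

```python
import json

def make_record(summary: str, prose: str) -> str:
    """Build one JSONL training line pairing a plot summary (prompt)
    with the full-prose passage the model should learn to produce."""
    record = {
        "messages": [
            {"role": "user",
             "content": f"Expand this plot summary into full prose:\n{summary}"},
            {"role": "assistant", "content": prose},
        ]
    }
    return json.dumps(record)

# Hypothetical example record; contents are placeholders.
line = make_record(
    "A man searches a quiet city for a cat that has gone missing.",
    "The streets were empty when he began to walk.",
)
```

The key point the paper makes is that the prose targets need not even come from the books being extracted: the finetuning step merely teaches the model the summary-to-prose task, and memorised pretraining text surfaces in the outputs.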

Writing-assistant tools that perform exactly this kind of task are already widely sold commercially. Using only semantic descriptions of a book's plot, without providing any actual book text as input, the researchers caused the finetuned models to reproduce 85 to 90% of copyrighted books in near-verbatim form, with individual verbatim spans exceeding 460 words.

The paper makes clear that the key mechanism is not prompt engineering but the finetuning process. According to the paper's abstract on arXiv, 'finetuning bypasses these protections' because the training weight adjustments reactivate memories of copyrighted text already embedded in the model from pretraining.

The authors describe this dynamic as a 'Whack-a-Mole' problem. Suppressing verbatim output in one context does not remove the underlying data from the model's weights. It simply makes retrieval harder in one direction while leaving it wide open in others.

The effect is not limited to a single author. The researchers finetuned models exclusively on the novels of Haruki Murakami and then tested extraction on books by more than 30 unrelated authors. The cross-author generalisation remained robust.

As the paper explains, finetuning on Virginia Woolf's widely digitised public-domain novels produced extraction rates comparable to the Murakami-trained condition. Finetuning on purely synthetic stories that were never part of any pretraining corpus produced 'virtually no long verbatim spans.' The implication is clear: finetuning unlocks material already memorised during pretraining, not content injected later.

What OpenAI Has Told Courts About Memorisation

The study directly targets a position that AI companies have staked out repeatedly in litigation. According to the paper's own framing, 'frontier LLM companies have repeatedly assured courts and regulators that their models do not store copies of training data.' They have further 'cited the efficacy' of safety alignment measures 'in their legal defenses against copyright infringement claims.'

That legal position is well documented. As Ropes & Gray's litigation tracker noted, 'defendants including OpenAI, Microsoft, Bloomberg and GitHub have asserted that their use of copyrighted materials is permissible because the AI model outputs merely build upon copyrighted works, rather than replicating protected expressions.' OpenAI has also argued in court that some material used is not protected by copyright, citing fair use defences including transformativeness and de minimis copying.

Courts have not settled these questions yet. As the Copyright Alliance documented in January 2026, fair use rulings in AI training cases remain split: two judges have found in defendants' favour, one against. No further summary judgment decisions on AI fair use are expected until summer 2026 at the earliest. The paper arrives in that vacuum, furnishing evidence that safety filters may not constitute a meaningful technical barrier to verbatim reproduction.

Discovery proceedings have already been squeezing OpenAI. On 5 January 2026, US District Judge Sidney Stein affirmed an order compelling OpenAI to produce 20 million anonymised ChatGPT conversation logs in the consolidated multidistrict litigation in the Southern District of New York, a case that combines 16 separate copyright lawsuits from news organisations and authors. OpenAI had attempted to limit its disclosure to logs specifically mentioning the plaintiffs' works. The court rejected that approach.

Copyright Scholars Behind the Paper

The authorship of this paper sets it apart from purely technical memorisation research. Jane C. Ginsburg, based at Columbia Law School, is one of the most cited copyright scholars in the United States. Her co-authors include Niloofar Mireshghallah, a machine learning researcher whose University of Washington profile lists extensive prior work on LLM memorisation and privacy, and Tuhin Chakrabarty, an assistant professor of computer science at Stony Brook University whose website lists New Yorker and Literary Hub coverage of related work. That combination makes the paper simultaneously a technical intervention and a legal argument.

The paper's own legal analysis explicitly frames the findings in terms of copyright territoriality. It notes that in Getty Images v. Stability AI [2025] EWHC 2863 (Ch), the English High Court found no infringing acts in the United Kingdom because Stable Diffusion was found not to 'store the data on which it was trained.' The paper argues that had the evidence shown model weights retain copies, rather than merely having 'learned the statistics of patterns,' the court would likely have found otherwise.

The implication for the UK is clear. If this methodology shows that GPT-4o retains verbatim copies of works accessible in the country, British courts may have grounds to hear infringement claims under domestic copyright law.

The paper's authors also note that nearly every frontier language model was trained on copyrighted books obtained from pirated sources, citing materials obtained from shadow libraries including LibGen and Books3, which comprised more than 190,000 copyrighted titles. This sits alongside broader evidence of what the Copyright Alliance describes as 'more than 70 infringement lawsuits' now pending against AI companies in US courts, with a new wave of class certification battles anticipated throughout 2026.

If courts accept the argument that safety filters are a legal shield rather than a technical reality, this paper may be the clearest evidence yet that the shield was always paper-thin.
