Anonymous perps behind 86 million files scraped from Spotify hit with $322 million court judgement — Anna's Archive case presents intriguing precedent for AI training

A U.S. federal judge on Tuesday awarded Spotify and the three major labels $322 million in a default judgment against Anna's Archive, but only $22.2 million of that figure came from copyright infringement. The remaining $300 million was awarded to Spotify alone under the Digital Millennium Copyright Act's anti-circumvention provisions, a claim that doesn’t require the plaintiff to own the underlying works.

Judge Jed S. Rakoff of the Southern District of New York entered the judgment after the anonymous operators of Anna's Archive failed to appear. The site had announced in December that it scraped 86 million files from Spotify and intended to distribute them via BitTorrent, prompting a January lawsuit from Spotify, Universal Music Group, Sony Music Entertainment, and Warner Music Group.

The labels' direct copyright claim covered 148 identified works at the statutory maximum of $150,000 per work, totaling $22.2 million split among Sony, UMG, and Atlantic. By major-label standards, that is a small infringement case.
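For readers who want to check the arithmetic, here is a minimal Python sketch using only the figures reported above. The function and variable names are illustrative and do not come from the court filing.

```python
# Figures as reported in the article; names are hypothetical, for illustration only.
WORKS_IDENTIFIED = 148            # works named in the labels' copyright claim
MAX_STATUTORY_PER_WORK = 150_000  # USD, the per-work maximum cited in the article


def copyright_award(works: int, per_work: int) -> int:
    """Return total statutory damages as (number of works) x (per-work amount)."""
    return works * per_work


label_award = copyright_award(WORKS_IDENTIFIED, MAX_STATUTORY_PER_WORK)
print(f"${label_award:,}")  # $22,200,000
```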

Spotify's award was calculated differently because the service doesn’t own the recordings on its platform, so it couldn’t bring a direct infringement claim. Instead, it argued that Anna’s Archive bypassed its technological protection measures, the authentication and anti-scraping systems that gate access to its audio files, in violation of DMCA §1201. Judge Rakoff applied the statutory maximum of $2,500 per circumvention to the 120,000 files Spotify's lawyers downloaded as evidence, producing the $300 million figure. Notably, these damages don’t depend on what Anna's Archive subsequently did with the files, but on the act of bypassing access controls.
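The same kind of back-of-the-envelope check works for the circumvention award and the combined total. The Python sketch below assumes only the numbers reported in this article; the names are hypothetical, and it makes no attempt to model how a court would actually count acts of circumvention.

```python
# Figures as reported in the article; names are hypothetical, for illustration only.
FILES_IN_EVIDENCE = 120_000     # files Spotify's lawyers downloaded as evidence
MAX_PER_CIRCUMVENTION = 2_500   # USD, the per-circumvention maximum cited in the article
COPYRIGHT_AWARD = 22_200_000    # USD, the labels' direct infringement award


def circumvention_award(acts: int, per_act: int) -> int:
    """Return total statutory damages as (acts of circumvention) x (per-act amount)."""
    return acts * per_act


dmca_award = circumvention_award(FILES_IN_EVIDENCE, MAX_PER_CIRCUMVENTION)
total_award = dmca_award + COPYRIGHT_AWARD
dmca_share = dmca_award / total_award  # fraction of the judgment owed to Spotify alone

print(f"${dmca_award:,}")    # $300,000,000
print(f"${total_award:,}")   # $322,200,000
print(f"{dmca_share:.1%}")   # 93.1%
```

On these numbers, roughly 93% of the judgment rests on the circumvention claim rather than on copyright ownership.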

This could set an interesting precedent: any platform that gates content behind authentication can now argue that scraping constitutes circumvention under §1201, with statutory damages of up to $2,500 per file. Neither ownership of the underlying content nor demonstrable harm or loss is required.

That may well have applications to AI training datasets. Anna’s Archive has previously called its data scrape a “preservation archive,” language that closely mirrors the justifications offered by AI labs for retaining scraped content. Nvidia is already defending itself in Nazemian v. Nvidia against claims that it trained models on books sourced from Anna's Archive, with the plaintiffs' amended complaint citing internal correspondence in which the company's data strategy team allegedly negotiated for high-speed access to roughly 500TB of material.

That case is currently pleaded as direct infringement, but the Spotify ruling adds §1201 to the toolkit for any plaintiff whose source content sat behind authentication, a description that covers most of the commercial web that AI labs have scraped.

It’s unlikely that Spotify will ever be able to collect from Anna’s Archive, given its anonymity and how it has previously relaunched on new domains following enforcement actions, but that’s beside the point; the judgment’s value is in the precedent it may well have set for the next defendant, who won’t be anonymous.
