YouTube creator David Millette has sued AI giant Nvidia for using his videos to train its AI models, Legal Dive reports. The move comes weeks after he sued OpenAI for the same reason. However, Millette does not allege copyright infringement against the AI companies, a claim that many publications, including the New York Times, leveled against OpenAI and Microsoft last year.
Instead, Millette charged Nvidia with “unjust enrichment and competition,” saying that its practice of scraping the internet for data to train its AI was “unfair, immoral, unethical, oppressive, unscrupulous, or injurious to consumers,” according to the lawsuit. This lawsuit was filed after Nvidia was accused of scraping over 400,000 hours of video per day to train its own AI model, with one leaked email purportedly showing that the company plans to use the gathered data as an accelerated pipeline for clients that want to build and train their own AI models.
Nvidia responded to the lawsuit: “Anyone is free to learn facts and ideas from publicly available sources. Creating new and transformative works is not only fair and just, but exactly what our legal system encourages.” This is why cases charging AI companies with copyright infringement often must jump through many hoops, especially as Google asserts that AI scraping is ‘Fair Use.’
On the other hand, Millette claims ‘unjust enrichment’ against Nvidia and OpenAI, which is different from copyright infringement. Mandarin Trading Ltd. v. Wildenstein (2011) states, “The doctrine of unjust enrichment allows a plaintiff to recover from a defendant, without the benefit of enforceable contractual obligation, where the defendant has unfairly benefited from the plaintiff’s efforts without compensation.” The case further adds, “The elements of an unjust enrichment claim are ‘that (1) the other party was enriched, (2) at that party’s expense, and (3) that it is against equity and good conscience to permit the other party to retain what is sought to be recovered.’”
Data scraping has often been contentious, whether for AI or other uses. Now that scraped data is being used to train AI large language models (LLMs) that could potentially supplant human creativity, many creators are up in arms against the unauthorized use of their work as training data.
Unfortunately, the law about scraping data online for use in AI training is still unclear. As long as the practice is not against the law, companies will exploit this legal gray area to gain an edge.