Creative Bloq
Daniel John

Controversial Nvidia AI leak prompts calls for new laws


Generative AI has been plagued by controversy over the last few years, with questions around copyright and ethics surrounding every model from Adobe Firefly to ChatGPT. And perhaps the most contentious issue is the unauthorised scraping of data in order to train AI. 

A new leak suggests that Nvidia has flagrantly scraped data from YouTube videos and more, with internal Slack and email conversations showing concerned employees being assured that the practice had clearance from the "highest levels" of the company. For more AI content, take a look at our coverage from Creative Bloq's first-ever AI week.

(Image credit: SOPA Images via Getty)

According to the report by 404 Media, Nvidia employees were "attempting to download full-length videos from a variety of sources including Netflix, but were focused on YouTube videos. Emails viewed by 404 Media show project managers discussing using 20 to 30 virtual machines in Amazon Web Services to download 80 years-worth of videos per day." What's more, the internal messages 

It certainly sounds like industrial-scale content scraping, and it's already drawing the ire of notable YouTubers. The leaked messages show employees directly referencing training data from videos by tech vlogger Marques Brownlee, who took to X to share his displeasure.

Of particular note here seems to be the scraping of Netflix videos. As Reddit users have pointed out, this isn't exactly free and open content. "A big corporation engaging in a piracy scheme is somehow perfectly fine, probably aided by a fancy scraping mechanisms and who knows how DRM avoidance mechanism," one Redditor comments. "Meanwhile, the common folk have to endure the full extent of the law with severe punishment if you ever dare to add 1 minute worth of Shrek footage in a youtube video essay."

Nvidia has defended its practice as being "in full compliance with the letter and spirit of copyright law" – but many are calling for laws to be changed in response to the advent of AI. "They should legally require your consent to train on your videos," one X user responds to Brownlee's post, while another adds, "We really need laws to catch up the the times".

This isn't the first time a leak has revealed the true extent of the data scraping behind AI models. An internal Midjourney document containing the names of over 16,000 artists emerged a few months ago, while Google's own scraping of YouTube recently caused controversy. Indeed, even those who have claimed their models to be ethically managed have fallen foul of artists, with Adobe facing accusations of copyright infringement last year.
