Google accused of using ‘hundreds of millions’ of people’s data to train AI bot

A class action lawsuit filed in the US claims Google has used the data of “hundreds of millions” of people to train its AI technology, including the Bard chatbot.

The lawsuit was filed by eight claimants “on behalf” of the rest of the US population.

It claims: “Google has been secretly stealing everything ever created and shared on the internet by hundreds of millions of Americans” including “personal and professional information, our creative and copywritten works, our photographs, and even our emails – virtually the entirety of our digital footprint”.

The eight people who filed the suit are identified only by their initials, but details in the filing, which has been rehosted by the Register, reveal that one of the plaintiffs is six years old and another is 13.

One of the claimants is an “actor and professor”, one is an author, and another regularly posts on YouTube and TikTok.

The documents detail how the author’s work can be reproduced when using Google Bard. “On demand, Bard will offer not only to summarize the book in detail, chapter by chapter, but it also offers to regenerate the text of her book verbatim,” the case document reads.

The group is suing Google parent company Alphabet for $5 billion (£3.8 billion).

Their lead claim is that “publicly available” has never meant free to use for any purpose, and that “Google must understand, once and for all: it does not own the internet, it does not own our creative works, it does not own our expressions of our personhood, pictures of our families and children, or anything else simply because we share it online.”

Google is attacked on 10 fronts in the case, including violations of the Digital Millennium Copyright Act, the 1998 US law that provides the mechanisms by which copyright-infringing content is removed from platforms like YouTube.

Earlier this month, Google updated the wording of its privacy policy to state more clearly that it does use user data in the development of its products.

“Google uses information to improve our services and to develop new products, features and technologies that benefit our users and the public. For example, we use publicly available information to help train Google’s AI models and build products and features like Google Translate, Bard, and Cloud AI capabilities,” its policy read in early July, according to Search Engine Journal, although the wording appears to have changed again since.

Google’s privacy policy still says your data is used in “developing new products and features that are useful for our users”.

This is not the first case to be levelled at the companies developing generative AI technologies.

In January, Getty Images sued Stable Diffusion creator Stability AI in London, claiming it had used Getty’s copyrighted pictures to train its generative AI models. This was followed in February by a similar filing in the US.
