Australian media companies could seek compensation from Meta for its use of online news sources in training generative AI technology, researchers have said.
When Meta announced last week that it would not sign new deals to pay for news in Australia for use on Facebook, it downplayed the value of news to its services, stating that just 3% of Facebook usage in Australia was related to news.
There are now calls to designate Meta under the news media bargaining code, which would force the company to negotiate with news media publishers and pay for news content on its platforms, or face fines of 10% of its annual Australian revenue.
Meanwhile, tech companies including Meta have been training their generative AI models on massive amounts of online information. In Meta’s discussion paper for its own Llama 2 model, the company said it had used “publicly available online sources”.
When asked whether those sources included online news, a Meta spokesperson declined to comment. Guardian Australia understands the company has used online sources on the basis of fair use allowances under US copyright law.
The New York Times sued the world’s leading AI company, OpenAI, in December, accusing it of using millions of its articles without permission to train chatbots, which then provided the information to users. OpenAI last month sought to have parts of the case dismissed, claiming the newspaper “hacked” ChatGPT and other artificial intelligence systems to generate misleading evidence for the case.
Prof Monica Attard and Dr Michael Davis from the University of Technology Sydney’s Centre for Media Transition said they saw some parallels between the justification for the news media bargaining code in 2021 and how media were grappling with the rise of AI technologies. They said the code could become an avenue to facilitate payments for the use of news to train AI models.
“In the case of AI, the news businesses have the option of turning off AI crawlers, in OpenAI’s case at least,” they told Guardian Australia. “Still, the code doesn’t specify which platforms might be designated, or even which services. So an AI service could feasibly be brought within the auspices of the code.”
The pair noted that companies such as News Corp had publicly referred to negotiations taking place over payment for AI training data, while others had blocked AI crawlers from scanning their sites. Attard and Davis argued it was valuable for generative AI to be trained on news information.
“Because news archives are such a rich, reliable database for AI training, they have extra value as a training source for AI companies and platforms. And in terms of the known risks to the information environment, it’s clearly important that AI is trained on high-quality data sources,” they said.
Reset Australia, a technology research organisation, argued that a new scheme might be needed rather than trying to use the code to account for AI.
The CEO of the Public Interest Journalism Initiative, Anna Draffin, said AI was having a profound impact on all sectors including public interest journalism, and that the group was still considering how public interest journalism could evolve with the rapid growth of AI technology.
Attard and Davis said the bargaining code might be one way to pay news publishers, but noted the question was also being examined under copyright law.
In December, the attorney general, Mark Dreyfus, announced the establishment of a copyright and AI references group to examine the topic.
At an industry roundtable late last year, there were “very different views” on policy, with some wanting exceptions in copyright for text and data mining for AI, while others said the content should be licensed and rights holders should be compensated.
The group has yet to be named or to meet on the issue.