Crikey
Entertainment
Sophie Cunningham

When I discovered AI had co-opted my books, I felt relief — then rage

Call me perverse, but when I found my name on the list of books that have been fed into Books3 — a database used to train generative AI programs — I felt relief. Finally, some transparency on the origins of AI! Finally, some understanding of the importance of copyright! Maybe now we could push back on claims, made by big tech and others, that Australia needs to loosen up its copyright laws if entrepreneurial tech is to flourish here. Maybe now calls for writers’ work, their copyright, and the pitiful state of their incomes can be taken seriously! Writers are also entrepreneurs and their copyright is one of the very few ways they have of making an income. As it is, most writers earn well below the poverty line, yet the flourishing of our culture depends on them. 

This newfound transparency is thanks to The Atlantic, which published a search tool that makes it possible for authors globally to check whether their work has been included in AI training data sets. Over the past few days, hundreds, if not thousands, of Australian writers have discovered that our country’s relatively robust copyright laws are rendered meaningless: in this case permission was neither sought, nor given. You don’t have to have been a best-selling writer and published in the USA. Two of my titles, City of Trees and Warning: The Story of Cyclone Tracy, turned up in the search and neither is available outside Australia.

My relief at what Booker Prize-winner Richard Flanagan described as “the biggest act of copyright theft in history” has slowly transformed into a simmering rage. This theft has, for the most part, happened twice over because illegally pirated books form the bulk of the data set that makes up Books3. So far some 190,000 books have been identified within Books3. That theft is only possible because of the contempt most of the world has for writers’ copyrights and (as a consequence, whether people admit it or not) for writers themselves.

Some of the companies that have trained their developing AI systems using Books3 argue that the use of this material constitutes “fair use”, the legal doctrine that permits the limited use of copyrighted material under certain circumstances (for example, in schools and universities). This argument can’t be automatically dismissed. However, the definition of fair use is stronger, indeed fairer, in Australia than in the US, and the guidelines of what constitutes fair use are clearer. Whatever its merits, the fair use position becomes weaker once you realise that the bulk of the data comes from pirated texts. To quote The Atlantic, “The future promised by AI is written with stolen words.”

It’s also argued that this data is being used for training purposes, not to produce “original” work. That argument too is problematic. If a work is “original” but reads as if it could have been written by, say, John Grisham, where does that leave John Grisham? He is obviously wondering the same thing, as one of the 17 authors who, with the support of the Authors Guild in America, have filed a class-action suit against OpenAI. The Australian Society of Authors (the ASA), an organisation of which I’m the chair, supports the action of the Authors Guild in the US and is seeking legal advice, as well as talking to government and other agencies, to get a better understanding of what our next steps might be.

The argument of originality also makes a mockery of what most authors and musicians do as a matter of course — apply for permission if they want to use other people’s words in their original works. Yes, getting permission is cumbersome, confusing (with different copyright rules across the globe), time-consuming and expensive, eating into an already precarious outcome. But artists do this work — I do this work — both because the law requires it of us and because we understand the need to acknowledge our colleagues’ contributions. 

My rage kicked up from a simmer to a boil when I read this sentence (again from The Atlantic): “High-quality generative AI requires higher-quality input than is usually found on the internet — that is, it requires the kind found in books.” This means that our work is being used, without permission, BECAUSE IT’S GOOD.

Writers spend years of their lives learning how to research, argue, compress information, communicate that information with clarity, and entertain and move people. They are, in other words, highly skilled at what they do. Their work informs the broader culture including theatre, television and film. Writers — and all artists — deserve a lot more than having their work stolen and then used to generate the extraordinary profits AI companies stand to make.
