The ethics of generative AI has been in the news this week. AI companies have been accused of taking copyrighted creative works without permission to train their models, and there's been documentation of those models producing outputs that plagiarize from that training data. Today, I’m going to make the case that generative AI can never be ethical as long as three issues currently inherent to the technology remain unresolved. First, there’s the fact that generative AI was created using stolen data. Second, it's built on exploitative labor. And third, it's dramatically worsening the energy crisis at a pivotal time when we need to be scaling back, not accelerating, our energy demands and environmental impact.
The first unethical truth of generative AI has probably gotten the most attention. Over the past few years, there’s been a constant stream of lawsuits, open letters, and labor disputes over the technology from artists, writers, publishers, musicians, and other creatives whose work was used to train large language models. This issue and its impact on the publishing and music industries have been playing out this week after generative AI search engine Perplexity was caught scraping publishers’ content against their wishes (after it promised it wouldn’t) and the Recording Industry Association of America filed copyright infringement lawsuits against two well-funded AI companies, as my Eye on AI cowriter Jeremy Kahn covered thoroughly in Tuesday’s newsletter (I have updates on the music story later). If you haven’t already, give his essay a read.
But what’s important to know, beyond why these data practices are so unethical, is just how deeply tied they are to generative AI as it exists today. AI companies hoovered up content en masse, including copyrighted works, to train their models because it was the only way to build the technology they were so invested in and so determined to create. The New York Times recently referred to this as “the original sin of AI” and reported that OpenAI, for example, even built a tool specifically to transcribe the audio from podcasts, audiobooks, and YouTube videos into text in order to train ChatGPT.
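To make concrete what that kind of pipeline involves, here’s a minimal sketch of a bulk audio-to-text workflow using the open-source Whisper library that OpenAI later released. The directory names are placeholders I’ve invented for illustration; this shows the general technique, not OpenAI’s actual internal tooling.

```python
# A minimal sketch of a speech-to-text training-data pipeline, using the
# open-source Whisper library (pip install openai-whisper). The paths are
# placeholders, not anyone's real setup.
import pathlib

import whisper

model = whisper.load_model("base")  # smallest general-purpose model

audio_dir = pathlib.Path("downloaded_audio")  # e.g. podcast/YouTube audio
out_dir = pathlib.Path("training_text")
out_dir.mkdir(exist_ok=True)

for audio_file in audio_dir.glob("*.mp3"):
    # Transcribe each audio file into plain text suitable for LLM training.
    result = model.transcribe(str(audio_file))
    (out_dir / f"{audio_file.stem}.txt").write_text(result["text"])
```

A dozen lines like these, pointed at enough audio, turn spoken content into text training data, which is part of why the practice scaled so quickly.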
AI companies are increasingly seeking out deals to license data to train models. It’s still not clear whether this can solve the problem, or what new ones it might create. But agreeing to pay for training data now that they’re under fire certainly feels like an admission that they didn’t have the right to take it all along.
Now onto generative AI’s labor problem. Again, these models would not exist without the labor of workers, many of them in the Global South, who are paid little more than $1 per hour to make all the data that was scraped usable for AI models. This work sometimes involves labeling images, video footage, and other data, including content that is horrifically violent. In other cases, these data workers refine large language models directly, evaluating their outputs and guiding the models to be helpful and to avoid racist, sexist, or otherwise harmful responses through a process known as “reinforcement learning from human feedback,” or RLHF. It is this process that takes a raw base model, like GPT-4, and turns it into ChatGPT.
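For readers curious what RLHF actually does with all those human judgments, here’s a toy sketch of its reward-modeling step. It’s a deliberate simplification under assumptions of my own: responses are stand-in numeric feature vectors rather than text, and the reward model is linear rather than a large neural network. The core mechanic is the same, though: fit a reward function to human preference judgments, the very judgments produced by the labor described above.

```python
# Toy sketch of RLHF's reward-modeling step. Real systems use large neural
# reward models over text; here responses are stand-in feature vectors and
# the reward model is linear, purely to illustrate the idea.
import numpy as np

rng = np.random.default_rng(0)

# Each pair: (features of the response a human preferred,
#             features of the response that same human rejected).
preference_pairs = [(rng.normal(1.0, 1.0, 4), rng.normal(0.0, 1.0, 4))
                    for _ in range(200)]

w = np.zeros(4)  # parameters of the linear reward model
lr = 0.05

for _ in range(100):
    for preferred, rejected in preference_pairs:
        # Bradley-Terry style pairwise loss:
        #   loss = -log sigmoid(reward(preferred) - reward(rejected))
        margin = w @ preferred - w @ rejected
        grad = -(1 - 1 / (1 + np.exp(-margin))) * (preferred - rejected)
        w -= lr * grad

# The fitted reward model now scores new outputs; in full RLHF these scores
# become the reward signal for fine-tuning the LLM with an RL algorithm
# such as PPO.
print("learned reward weights:", w)
```

Every one of those preference pairs represents a human being who read, watched, or labeled something, which is the connection between this math and the working conditions described in the letter below.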
In an open letter to President Joe Biden I reported on last month, Kenyan AI data labelers who work for companies including Meta, OpenAI, and Scale AI said U.S. tech companies are systematically exploiting African workers and accused them of undermining local labor laws, comparing their working conditions to “modern day slavery.”
“Our work involves watching murder and beheadings, child abuse and rape, pornography and bestiality, often for more than 8 hours a day. Many of us do this work for less than $2 per hour,” they wrote.
Of course, this isn’t news. Reports of the exploitation of low-paid AI workers go back years, such as one in Rest of World (an excerpt from Phil Jones’s book Work Without the Worker) that in 2021 detailed how Big Tech companies have targeted populations devastated by war and economic collapse to perform the work: “Their nights are spent labeling footage of urban areas—‘house,’ ‘shop,’ ‘car’—labels that, in a grim twist of fate, map the streets where the labelers once lived, perhaps for automated drone systems that will later drop their payloads on those very same streets.” It also closely mirrors the content moderation practices of social media platforms, which have similarly come under fire for their minuscule pay and hostile work conditions.
When AI executives and investors compare the technology to magic, they never mention these workers and the intense labor they perform to make it possible. If the taking of data without consent is generative AI’s original sin, the widespread labor exploitation is its dirty secret.
Lastly, there’s the energy usage, water consumption, and sustainability problems of generative AI.
You’ve undoubtedly heard the dire climate warnings and seen coverage of record-breaking heat and countless climate disasters over the past few years. If you haven’t read about how AI’s growth-at-all-costs demands are wreaking havoc on the global energy supply, pushing grids to the brink, guzzling massive amounts of water, and threatening to upend the energy transition plans needed to ward off the worst effects of climate change, Bloomberg and Vox both had great stories on the topic this past weekend.
Datacenters are already using more energy than entire countries, and AI is behind much of the recent growth in their power consumption, a trend that’s only expected to worsen as companies race to create larger AI models trained on ever more powerful chips. AI datacenters are ballooning in size, with one energy executive telling Bloomberg they use 10 to 15 times the electricity of traditional datacenters. Nvidia’s next-generation GPU for generative AI, the B100, can consume nearly twice as much power as the H100 used for today’s leading models. OpenAI CEO Sam Altman has said we need an energy breakthrough for AI, and even Microsoft has admitted AI is jeopardizing its long-term goal of being carbon-negative by 2030. Projections for the future of AI’s energy usage are bleak: in the U.K., for example, AI is expected to increase energy demands by 500% over the next decade.
“I'm sure you've read about these 100,000 GPU supercomputers that people want to build, all of the money that's going into them, the power constraints, the environmental and water constraints. I've not been able to stop thinking about how potentially unsustainable [it is],” Chris Kauffman, a principal investor at VC firm General Catalyst, which is invested in AI companies including Adept and Mistral, told me.
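To put that 100,000-GPU figure in rough perspective, here’s a back-of-envelope calculation under assumptions I’m supplying, not figures from the reporting above: roughly 700 watts per H100-class GPU (Nvidia’s published spec for the SXM version) and a facility overhead factor (PUE) of 1.3 to account for cooling and power delivery.

```python
# Back-of-envelope math for a hypothetical 100,000-GPU cluster. The wattage
# and PUE figures are my illustrative assumptions, not reported numbers.
NUM_GPUS = 100_000
WATTS_PER_GPU = 700        # approx. H100 SXM TDP
PUE = 1.3                  # power usage effectiveness (facility overhead)
HOURS_PER_YEAR = 24 * 365

facility_mw = NUM_GPUS * WATTS_PER_GPU * PUE / 1e6
annual_gwh = facility_mw * HOURS_PER_YEAR / 1e3

print(f"Continuous draw: ~{facility_mw:.0f} MW")      # ~91 MW
print(f"Annual energy:  ~{annual_gwh:.0f} GWh/year")  # ~800 GWh
```

Under those assumptions, a single such cluster draws on the order of 90 megawatts continuously, before a single model is ever served to users.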
These issues don’t apply to all types of AI models, and this is also not a comprehensive list of AI’s ethical problems. Bias in facial recognition systems and in models used to convict people of crimes, deny them loans and insurance, and decide who gets job interviews and college admissions, for instance, is another dimension of how the way we build and use AI models is failing to live up to our ethical ideals.
AI companies say they’re committed to developing AI responsibly. They’ve signed a bunch of (non-binding) pledges to do so. These practices, however, suggest otherwise.
And with that, here’s more AI news.
Sage Lazzaro
sage.lazzaro@consultant.fortune.com
sagelazzaro.com