When David Holz, founder and CEO of A.I. art generator Midjourney, thinks about the future of his industry’s technology, he likens it to water.
Water can be dangerous. We can easily drown in water. But it has no intent. And the challenge that humans face when it comes to water is learning to swim, building boats and dams, and finding ways to wield its power.
“You can make two images, and it’s cool, but you make 100,000 images, and you have an actual physical sensation of drowning,” says Holz in an interview with Fortune. “So we are trying to figure out how do you teach people to swim? And how do you build these boats that let them navigate and be empowered and sort of sail the ocean of imagination, instead of just drowning?”
A.I. image generators have proliferated across Silicon Valley and gone viral on social media. Just a few weeks ago, it became nearly impossible to scroll through Instagram without seeing Lensa AI’s “magical avatars,” colorful digital selfies made with the A.I.-powered editing app.
“In the last 12 months, the development of these technologies has been quite immense,” says Mhairi Aitken, an ethics fellow at The Alan Turing Institute, the U.K.’s national institute for data science and A.I. “Users are using [A.I. image generators] to generate a particular output without needing to necessarily understand what is the process for which that’s been created, or the technology behind it.”
The models behind these A.I. image generators are spreading to smartphones because recent advances have deepened their ability to understand language and produce more realistic images. “You’re teaching the system to become familiar with lots of elements of the world,” explains Holz.
As a result, practically any user can design, process, and rework their own facial features in images uploaded to apps like Lensa AI, which launched late last year and already has more than a million subscribers. Lensa says it is looking to evolve the model into a one-stop shop that can address all of its users’ needs around visual content creation and photography.
A.I.-generated art initially surfaced in the 1960s, but many of the models used today are in their infancy. Midjourney, DALL-E 2, and Imagen—some of the better-known players in the space—all debuted in 2022. Some of the world’s largest tech giants are paying close attention. Imagen is Google’s text-to-image A.I. model, currently in beta, while there are reports that Microsoft is mulling a $10 billion investment in OpenAI, whose models include the chatbot ChatGPT and DALL-E 2.
“These are some of the largest, most complicated A.I. models ever deployed in a consumer way,” says Holz. “It’s the first time a regular person is coming into contact with these huge and complex new A.I. models, which are going to define the next decade.”
But the new tech is also raising ethical questions about potential online harassment, deepfakes, consent, the hypersexualization of women, and the copyright and job security of visual artists.
Holz acknowledges that A.I. image generators, like most new technologies, carry a lot of male bias. The humans behind these models still have work to do to figure out the rules of A.I. image generation, he says, and more women should have a deciding role in how the technology evolves.
At Midjourney, there was a discussion about whether the lab should allow users to upload sexualized images. Take the example of a woman wearing a bikini on the beach: Should that be allowed? Midjourney brought together a group of women, who ultimately decided that yes, the community could create images with bikinis, but that those images would remain private to the user rather than being shared across the entire system.
“I didn’t want to hear a single dude’s opinion on this,” says Holz. Midjourney also blocks specific phrases to prevent harmful images from proliferating within the system.
“Midjourney is trying to be a safe space for a wide variety of ages, and all genders,” Holz says. “We are definitely more the Disney of the space.”
On one hand, a bleak argument could be made that A.I. image generators—which again, don’t have human intent—are merely reflecting our society back to us. But Aitken says that isn’t good enough. “It shouldn’t just be a matter of taking the data that is available and saying, ‘That’s how it is,’” says Aitken. “We’re making choices about the data and whose experiences are being represented.”
Aitken adds that "we need to think more about the representation within the tech industry. And can we ensure greater diversity within those processes, because it is often the case that when biases emerge in datasets, it’s because they just haven’t been anticipated in the design process or development process.”
Concerns about how these models can be used for harassment, the promotion of bias, or the creation of harmful images have led to calls for greater guardrails. Google’s own research shows mixed views about the societal impact of text-to-image generation. Those concerns were significant enough that the tech giant opted not to publicly release the code or a demo of Imagen. Governments may also need to step in with regulation. In China, the Cyberspace Administration of China has a new law, in effect since January, that requires A.I.-generated images to be watermarked and individuals’ consent to be obtained before a deepfake is made of them.
Visual artists have also expressed concern about how this new technology infringes on their rights, or could even take away work they had previously been paid for. The San Francisco Ballet recently experimented with Midjourney’s tech when creating a digital, A.I.-generated image for its production of The Nutcracker. Users flooded the company’s Instagram post with complaints.
In January, a group of A.I. image generator companies—including Midjourney—were named in a lawsuit alleging that the datasets used for their products were trained on “billions of copyrighted images” downloaded and used without compensation or consent from the artists. The lawsuit alleges violations of California’s unfair competition laws and contends that the fight over protecting artists’ intellectual property is similar to what occurred when streaming music technology emerged. The lawsuit was filed after Fortune’s interview with Midjourney, and the publication has reached out to Midjourney for further comment.
Holz says most of the people using Midjourney aren’t artists, and very few people are selling images made from the model.
“It’s almost like the word A.I. is toxic, because we kind of implicitly assume that it's here to replace us and kill us,” says Holz. “One important thing is to figure out how we make people better, rather than how we replace people.”