Fortune
Sage Lazzaro

The promise, and perils, of visual generative AI technology

(Credit: Ed Jones/AFP via Getty Images)

Hello and welcome to Eye on AI.

This past week was a big one for the visual side of generative AI, thanks to launches from Microsoft/OpenAI, Canva, and Google. Altogether, they show how imperfect (and ripe for abuse) these generative AI tools still are, but also paint a picture of rapid technological advancement.

Microsoft and OpenAI made a splash by making DALL-E 3 generally available to the masses via Bing Chat. The release comes even before DALL-E 3’s anticipated launch within ChatGPT, which is scheduled for later this month for paying users. The integration within Bing Chat—as well as the planned launch for ChatGPT—also introduces the capability for users to refine their images by conversing with that chatbot, as opposed to using the tool as a standalone product. 

It didn’t take long, however, for DALL-E 3 to take center stage in a coordinated 4chan campaign to flood the internet with racist images. As reported by 404 Media, 4chan users decided to make “propaganda for fun” and created a visual guide for how to use AI tools to quickly make images with “redpill messaging” and other racist content. While the guide says people can use any program they want, such as Stable Diffusion or Photoshop, it says “Most people are using DALL-E 3” and links to the Bing tool, calling it the “QUICK METHOD.”

OpenAI has placed limitations on its AI tools to prevent the generation of racist and other offensive content, but users are of course finding ways around them. In turn, these images are now swirling around the internet, destined to inform current and future AI models as they scrape training data and increasingly browse the internet in real time. 

Moving on to Canva, the online design platform launched Magic Studio, an extensive suite of AI-powered tools that includes a text-to-image generator, a text-to-video generator, and features that generate entire projects from a line of text, write copy in your brand voice, translate copy into different languages, and automatically switch between formats (such as transforming the content of a slideshow into a document). That’s on top of a host of interesting new photo editing capabilities such as Magic Grab, which allows users to select and separate the subject of a photo and turn it into an element that can be individually edited, repositioned, or resized.

But it’s clear some of these features, which are powered by the company’s partnership with OpenAI, still have a ways to go. I played with the text-to-video tool and the results ranged from slightly disturbing to totally nonsensical. When I prompted it to generate a video of “A cat birdwatching in a sunny windowsill,” it did—except the cat’s butt and tail were sitting beside it on the windowsill rather than attached to its body. And when I prompted the tool to generate a video of “UFOs abducting chefs from Earth,” the two-second clip it produced was so utterly confusing that I had no idea what I was looking at. 

That brings us to the unveiling of Google’s new Pixel 8 and Pixel 8 Pro smartphones, which launched to much fanfare thanks to new AI-powered photo editing capabilities. Google Pixel can now erase unwanted audio from video, edit specific elements within a photo (essentially the same as Canva’s Magic Grab), and combine different frames of a photo to help you create the best shot. If one person is making an unflattering face in a group photo, for example, Google Pixel now lets you simply choose a better face from recent images or frames and swap it in. While these types of edits have always been possible for people with expertise in Photoshop, now they’re available to everyone with a click and in the palm of their hands.

“In all my years of reviewing personal technology gadgets, I can count the number of times my jaw has dropped when learning about a new product. It’s good to be a skeptical journalist! But I failed to maintain that detachment when Google demoed a few imaging tricks on its new Pixel 8 and Pixel 8 Pro smartphones,” wrote Julian Chokkattu, the reviews editor at Wired.

And indeed it is pretty impressive. It wasn’t that long ago, in 2016, that I covered the launch of Google’s first AI photo feature, the “enhance” tool, which basically just adjusted the lighting and sharpness of a photo. Even as recently as two months ago, I stopped in a Google store to check out the current Pixel offerings and play with the Magic Eraser tool, which neither I nor the store associate could get to work. Now, just months later, we have a suite of AI tools that goes above and beyond Magic Eraser. And while it’s likely the new face-swapping capability is still imperfect (not to mention a bit dystopian), it sure shows just how far this technology has come.

Sage Lazzaro
sage.lazzaro@consultant.fortune.com
sagelazzaro.com
