Google has launched a new AI image generator called ImageFX that will be available through Labs and is built on top of the Imagen 2 model powering AI image creation in Bard.
ImageFX is part of a growing family of tools from Google aimed at specific aspects of generative artificial intelligence, joining MusicFX for music generation and TextFX for stylized text.
Imagen 2 was built by Google’s advanced AI lab DeepMind and can generate a range of images and styles from a simple text prompt. The company claims it produces “the highest quality images yet” compared to previous models.
Beyond the new standalone ImageFX tool and Bard, Imagen 2 is also coming to Duet AI, the generative artificial intelligence service built into Workspace apps like Docs and Slides.
What is ImageFX and how does it work?
There is a multitude of AI image-generation tools on the market, with a range of special features, training datasets and price points. What makes ImageFX stand out for me is the novel approach Google has taken to refining the prompt.
Prompt creation is one of the most important skills in creating anything using generative AI. The better your prompt, the closer the output will be to what's in your mind.
With ImageFX, Google has borrowed some of the user interface techniques developed for MusicFX, namely “expressive chips” that let you quickly experiment with changes.
You type out the prompt you want, for example “a photorealistic depiction of a dog riding a surfboard,” and it highlights key words such as photorealistic, dog and surfboard, letting you select alternatives for each keyword from a dropdown menu.
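To make the chip idea concrete, here is a minimal Python sketch of how keyword highlighting and swapping could work in principle. It is an illustration only, not Google's implementation: the `CHIP_ALTERNATIVES` table, the function names and the alternative words are all hypothetical, and the real tool presumably picks keywords and suggestions with a model rather than a fixed lookup.

```python
import re

# Hypothetical chip table: keyword -> alternatives shown in its dropdown.
# These entries are invented for illustration.
CHIP_ALTERNATIVES = {
    "photorealistic": ["watercolor", "35mm film"],
    "dog": ["cat", "penguin"],
    "surfboard": ["skateboard", "paddleboard"],
}


def find_chips(prompt, alternatives=CHIP_ALTERNATIVES):
    """Return (keyword, start, end) for each known keyword in the prompt,
    ordered by where it appears."""
    chips = []
    for keyword in alternatives:
        match = re.search(r"\b" + re.escape(keyword) + r"\b", prompt, flags=re.IGNORECASE)
        if match:
            chips.append((keyword, match.start(), match.end()))
    return sorted(chips, key=lambda chip: chip[1])


def swap_chip(prompt, keyword, replacement):
    """Replace the first whole-word occurrence of keyword with replacement."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    # A function replacement avoids treating backslashes in the text specially.
    return re.sub(pattern, lambda _match: replacement, prompt, count=1, flags=re.IGNORECASE)


prompt = "a photorealistic depiction of a dog riding a surfboard"
chips = find_chips(prompt)
variant = swap_chip(prompt, "dog", "penguin")
print([keyword for keyword, _, _ in chips])
print(variant)
```

Each chip keeps its position in the prompt, so an interface could underline the word and attach the dropdown to it; choosing an alternative simply produces a new prompt to send to the model.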
Google says this, in combination with the underlying Imagen model, “delivers our highest-quality images yet, as well as improvements in areas that text-to-image systems often struggle with and keeps images free of distracting visual artefacts.”
Image safety and misinformation
To combat the risk of ImageFX, or any tool with Imagen 2 as its base model, being used to create misinformation or deepfake images, Google has added SynthID to generated pictures.
SynthID is a technique built by DeepMind that embeds a watermark, invisible to humans, in generated content. This works across music, video and images. In an image, the watermark is hidden in the pixels, where it can’t be easily cut out or removed, and it can be used to identify the artificial origins of even the most realistic image.
In addition to SynthID, Google has worked to ensure its training data has guardrails in place that limit the output of violent, offensive or sexually explicit content. Further filters stop the model from generating images of known or named individuals.
“We also conduct extensive adversarial testing to identify and mitigate potential harmful and problematic content,” a spokesperson explained. “In addition, all images generated using ImageFX include IPTC metadata, giving people more information whenever they encounter our AI-generated images.”
ImageFX is currently available only through Google’s AI Test Kitchen, alongside MusicFX, TextFX and Generative AI Search. It is also restricted to the U.S., Kenya, New Zealand and Australia, and is offered only in English.
However, Imagen 2, the model powering ImageFX, is more widely available to experiment with in Google Bard, Duet AI in Workspace and Google Cloud’s Vertex AI.