AI image generators have advanced so fast in the past year that you might think we couldn't be surprised any more. But whenever it seems text-to-image AI might have reached its zenith, along comes another model that takes a further leap forward.
Google has announced that it's releasing Imagen 2, the second generation of its AI image generator and editor. Based on the initial example images, it looks like a stunning upgrade, and it might deserve a place in our pick of the best AI image generators.
Meet Imagen 2: our most advanced text-to-image diffusion technology. ✨ It features high-quality, photorealistic outputs and stronger consistency with your prompts. 🖼 Now available to use via @GoogleCloud’s #VertexAI platform. → https://t.co/T1IIJMbIW9 pic.twitter.com/iWIzi2jgZH (December 13, 2023)
Despite first announcing the update in May, until now Google hadn't shared any samples from the model. Now it's opening access to approved Google Cloud customers who use Vertex AI, and it's finally offered a glimpse of the tool's capabilities.
Google says that Imagen 2 from its AI lab DeepMind offers “significantly” improved image quality and can interpret longer, more descriptive prompts thanks to its “novel training and modeling techniques”. It says the model also has new capabilities to generate text and abstract logos, and to overlay letters and logos on existing images for anything from advertising to the design of merchandise and business cards.
The ability to add text to generated images brings Imagen in line with OpenAI's DALL-E 3 and Amazon's new Titan AI image generator. But Imagen 2 goes further. Google says it can handle text in Chinese, Hindi, Japanese, Korean, Portuguese and Spanish as well as English, with more languages to be added next year.
I should note that we've seen no evidence of this yet, but the claim is that a text prompt can request the inclusion of text in a language other than the one in which the prompt is written. That makes it sound like we'll soon be seeing a whole lot of AI-generated images with random mistranslated text. However, dubious translation abilities aside, the image and text capabilities of Imagen 2 do look impressive.
Several observers have already shared comparisons in which they tested the same prompts used in Google's announcement about Imagen 2 in other AI image generators, including Midjourney, DALL-E 3 and Imagine from Meta AI. The images generated by Imagen 2 do seem to have more realistic lighting and shadows in cases where the results are in a photorealistic style.
Google Deepmind launches Imagen 2. I tried their prompts on Dalle3, Midjourney, and Imagine with Meta AI. 6 examples. 1/6 Prompt: A shot of a 32-year-old female, up and coming conservationist in a jungle; athletic with short, curly hair and a warm smile Imagen 2 is not… pic.twitter.com/P0AvdxZKXL (December 14, 2023)
Universal Prompt: A cup of strawberry yogurt with the word "Delicious" written on its side, sitting on a wooden tabletop. Next to the cup of yogurt is a plate with toast and a glass of orange juice pic.twitter.com/7JQGNrc4IY (December 14, 2023)
Google hasn't explained what data was used to train Imagen 2 (the first iteration was controversially trained on a version of the LAION public dataset). It's also been notably quiet about its lack of an opt-out system for creators despite offering indemnification to protect eligible Vertex AI users from copyright claims. The one gesture towards responsibility with the new release is the use of DeepMind's SynthID to add invisible watermarks to identify images generated by Imagen.
To learn how AI image generators work, see our roundup of the best AI art tutorials.