Tom’s Guide

Technology

Ryan Morrison

I put 7 leading AI image generators to the test with the same prompt — here’s the winner

Image generation is one of the most mature forms of artificial intelligence creation, able to turn a simple idea into a graphic, photograph or image of any kind.

Well, the underlying technology is fairly mature. There are still strong distinctions between one model and the next, and even in how one company deploys a given version of a model compared with another.

In some areas there’s a lot of convergence, particularly around hyperrealistic human faces, but in others there are distinct differences, especially in things like text rendering, skin texture and prompt following.

To get a better idea of how AI might handle fairly complex prompts, I’ve given the same three requests to 7 of the leading AI image generators: DALL-E, Flux, Ideogram, Mystic, Phoenix, Midjourney and Haiper.

Creating the prompts

Are we entering the era where instead of paying an influencer to promote its products, a brand will just generate one with AI that matches their aesthetic? These are some experiments this morning using Flux and @runwayml Gen-3 Alpha. pic.twitter.com/7VvscImorw (August 11, 2024)

There are likely more models that I’ve excluded than included, including the incredibly powerful Imagen 3 from Google and Meta's Imagine AI. The reason for their exclusion is that they are not as widely available globally as the ones I included.

The three prompts are fairly distinct: the first calls for a complex scene with elements in specific places, the second makes specific requirements for text rendering, and the third focuses on skin texture and realism.

If you disagree with any of my decisions or want to try out the prompts with specific settings (I ran them all using defaults), I've included the prompts in full.

Prompt one: The young woman

An ultra-realistic smartphone selfie of a young woman in her mid-20s. The photo has the characteristic sharpness and vivid color of a high-end smartphone camera, with slight motion blur on one edge. The image is taken in natural daylight, causing mild overexposure on one side of her face. She has shoulder-length curly hair with grown-out highlights, and wears minimal, everyday makeup with slightly smudged eyeliner. Her expression is a genuine, slightly lopsided smile with a hint of tiredness around her eyes. She's wearing a comfortable, well-worn graphic t-shirt with a faded band logo. A thin silver necklace is partially tangled in her hair near her collar. The background is a lived-in studio apartment, with a unmade bed and a small bookshelf visible. A houseplant with a few yellowing leaves sits on a windowsill behind her. There's a small coffee stain barely visible on the collar of her shirt.

Midjourney

I used default settings for all of these prompts, which unfortunately does a disservice to Midjourney, the most customizable of all AI image models. Here it missed some points of the prompt because its default behavior is to make things perfect. That said, I think it created a brilliant depiction of the woman.

DALL-E

DALL-E is barely in the race when testing prompts showing real people, as it makes everyone look a little like a Bratz doll.

Ideogram

Ideogram did a good job of following the 'imperfections' element of the prompt but slightly overdid the motion blur. However, I think this is the most natural of all the images of people.

Freepik Mystic

I like the lighting from Mystic and the woman looks the most realistic. The prompt was followed well but there is a degree of uncanny valley. It also has the 'too perfect' issue of Midjourney.

Flux (using Grok)

Flux might be my favorite overall image. I don't think it's the best in terms of prompt adherence or realistic depiction, but it is good and looks generally more believable.

Leonardo Phoenix

I really did believe this one was a real photo. It captured the imperfections perfectly but the lighting is still slightly off and the framing is weird.

Haiper

Haiper did a good job, but it didn't get the lighting right and the skin is too 'perfect'. Otherwise, this is my favorite character generated out of the set.

Winner: Ideogram

Prompt two: Penny Lane

A bustling 1960s London street scene on a rainy afternoon. The street is lined with iconic red double-decker buses, black cabs, and people holding colourful umbrellas. A Beatles-inspired band performs on a street corner, with their instruments reflecting in the wet pavement. In the background, Big Ben is visible through a light fog. A neon sign above a small café reads 'Penny Lane' in glowing letters. On the right, a woman in a stylish 1960s dress is waiting for the bus, holding a newspaper with the headline 'Man Walks on Moon.' Raindrops are visibly falling, creating ripples in puddles, and the whole scene has a blend of nostalgia and realism.