Google's AI chatbot Bard is finally getting the image generation capabilities it needs to stay competitive with rival Large Language Models (LLMs) like Copilot/Bing and ChatGPT 4. The feature is currently live in the US, with other regions to follow.
Bard was recently upgraded to run on Google's powerful Gemini Pro LLM, and it will now also use the Imagen 2 text-to-image model to generate images for users.
Google has been playing catch-up to the competition ever since ChatGPT exploded onto the scene last year, and the addition of image generation is a big stride in the right direction.
While Google Bard is unlikely to dethrone ChatGPT as the world's most popular chatbot, it does stand a compelling chance against search engine rival Microsoft, whose Copilot AI has successfully managed to sway users away from Chrome and to the Edge browser since its initial launch as Bing Chat.
With both AI chatbots offering similar search capabilities and image generation, we thought we would compare the two to see just how far Google Bard has come along. So without further ado, let's see how Bard's Imagen 2 software matches up against Copilot's DALL·E 3-powered alternative.
Imagen 2 vs. DALL·E 3: Generating hands
There's often one key giveaway that a picture or photo has been generated by AI: hands. Much like most human artists, AI seems to crumble to pieces the moment it's tasked with drawing the human hand — instead resorting to weird hot dog-like appendages that defy all conventional wisdom of human biology.
So how do the two compare when it comes to generating images from simple or complex prompts involving the human hand? Let's find out.
Imagen 2 vs. DALL·E 3: Generating text
Another telltale sign that an image has been AI-generated is that the primary language of almost all image generators is Simlish mixed with the melted English found in CAPTCHA checks.
AI image generators are getting better at this, especially when prompted to use actual words or phrases, but their ability to naturally embed context-fitting language into images is still very much hit-or-miss. Let's see how Bard's image generation holds up against Copilot on the text front.
Imagen 2 vs. DALL·E 3: Generating tools
LLMs know what tools are. They can even tell you how to use them. Ask one to make an image of said tool in operation, however, and you're likely to see scenes so ridiculous you would presume them to be stills from a late-night infomercial designed to make you think the hammer is too complex a tool for the average person to operate.
That's how it's been in the past, at least. While image generators have improved their grasp on the concept of tools and handiwork in action, they haven't quite perfected the art of digitally depicting DIY. Or have they? Let's find out.
Conclusion
It's plain to see from the above images that Google Bard's Imagen 2 generation abilities offer some impressively photorealistic results. While its results can still offer muddied hands and lack the logic of a genuine scenario, the images typically 'feel' more real.
In contrast, Copilot's DALL·E 3-powered image generation offers overly softened and smooth images that feel slightly dream-like or aspirational. The colors are more vivid, the lighting more dramatic, and everything has a very 'rendered' feel to it.
Google Bard also has a leg up on resolution, with its generated images sitting at 1532 x 1532 compared to DALL·E 3's 1024 x 1024 limit.
Both Imagen 2 and DALL·E 3 have some impressive qualities to the images they produce, with each being able to replicate various styles. Both can also be used to generate non-photo image results such as line drawings, infographics, comic strips, and more.
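To put that resolution gap in numbers, here is a quick back-of-the-envelope sketch in Python. It uses only the two resolutions quoted above; the function and variable names are purely illustrative and not part of either tool.

```python
def pixel_count(width: int, height: int) -> int:
    """Return the total number of pixels in an image of the given size."""
    return width * height


# Resolutions as quoted in this article
bard_pixels = pixel_count(1532, 1532)
dalle_pixels = pixel_count(1024, 1024)

# How many times more pixels a Bard image holds than a DALL·E 3 image
ratio = bard_pixels / dalle_pixels

print(f"Bard (Imagen 2): {bard_pixels:,} pixels")
print(f"Copilot (DALL·E 3): {dalle_pixels:,} pixels")
print(f"Ratio: {ratio:.2f}x")
```

By that math, a Bard image holds a little over twice as many pixels as a Copilot one, though pixel count alone says nothing about how much genuine detail an image contains.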
It's hard to definitively say which image generator is the best, as this will mostly come down to how you plan to use each piece of software. However, for photorealistic results, I'm very impressed with what Bard has to offer. Plus, Bard is noticeably faster at generating images than Copilot's DALL·E 3, churning out higher-resolution images much more promptly.
For that, I'd have to conclude that Google has the edge on this one, at least for now.