At Google I/O 2024, Google introduced two new generative media models in Veo and Imagen 3. Veo is built to generate high-def video while Imagen 3 is a text-to-image model.
Google has been steadily updating Imagen 2 through the early part of 2024. It recently gained the ability to create ‘live photos’ and we consider it one of the best AI image generators.
Veo is the newest generative media model from Google and is specifically geared toward generating 1080p videos. Google says Veo can make videos longer than a minute, but did not specify how much longer.
Veo is supposed to understand cinematic terms like timelapse and “aerial shots of a landscape”. The tech giant showed some of this off in a collaboration video with Donald Glover.
Veo is available to select users inside VideoFX, and there is a waitlist for anyone interested in checking it out.
According to Google, Imagen 3 has been upgraded to better understand “natural language, the intent behind your prompt and incorporates small details from longer prompts.”
The company is also claiming that it’s the best model for rendering text so far, an ongoing issue with most AI image generation models. If true, gone are the days of seeing weird misspellings or lorem ipsum-esque “words” in images.
Imagen 3 is available to some creators as a preview in ImageFX. For those curious, Google has a waitlist to join. Imagen 3 will apparently be available in Vertex AI soon, though Google did not specify when.
In the press release, Google briefly mentioned that there is a suite of AI tools coming out called Music AI Sandbox. These tools are supposed to allow users to create new instrumental sections and transform sound. Beyond noting partnerships with Wyclef Jean, Marc Rebillet and Justin Tranter, Google did not elaborate on specifics for the AI tools.