Photo editing could become the next area conquered by AI thanks to an exciting new tool unveiled by a group of researchers from Google.
Working with the Max Planck Institute for Informatics, they have created a point-based image manipulation tool called DragGAN. Essentially, it’s able to incrementally move multiple points of an image along a target trajectory defined by the user. The really clever part is that the AI keeps the output within the bounds of a realistic-looking image.
So in theory, without any prior image editing expertise, you could manipulate the dimensions of a vehicle or the expression of a face without it seeming distorted. And you could do it all with a cursor click.
At present, DragGAN is still only a research white paper. But such is the interest that incoming traffic repeatedly crashed the team’s homepage over the last couple of days.
“Existing approaches gain controllability of generative adversarial networks (GANs) via manually annotated training data or a prior 3D model, which often lack flexibility, precision, and generality,” the researchers wrote.
“In this work, we study a powerful yet much less explored way of controlling GANs, that is, to ‘drag’ any points of the image to precisely reach target points in a user-interactive manner.”
GANs are still the king at latent space exploration. DragGAN looks amazing. pic.twitter.com/KT3AEtdBJK (May 19, 2023)
Although existing photo editors let you quickly resize or rework images (with tools like “Warp” in Photoshop), the process is fundamentally different. Those tools quite literally pull the image one way or the other in response to input, whereas DragGAN regenerates the entire underlying object to accommodate the changes you want to make.
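To make that distinction concrete, here is a toy sketch in Python. It is not the researchers' code and does not use a real GAN: the "generator" is a made-up linear map (`A`, `B`) that turns a two-number latent code into the position of a blob, and every name in it is hypothetical. What it illustrates is the general idea of nudging the latent code in small steps until a tracked handle point lands on the target, then re-rendering the whole image rather than stretching its pixels.

```python
import numpy as np

SIZE = 32  # the toy image is SIZE x SIZE pixels

# Hypothetical stand-in for a generator: a fixed linear map from a 2-D latent
# code to the (x, y) centre of a blob. A real GAN is far more complex.
A = np.array([[6.0, 2.0], [-1.0, 5.0]])
B = np.array([16.0, 16.0])

def blob_centre(latent):
    """Where the toy generator places the blob for this latent code."""
    return A @ latent + B

def generate(latent):
    """Render the whole image from the latent code (no pixels are warped)."""
    cx, cy = blob_centre(latent)
    ys, xs = np.mgrid[0:SIZE, 0:SIZE]
    return np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * 2.0 ** 2))

def track_handle(image):
    """Locate the handle point as the brightest pixel, returned as (x, y)."""
    y, x = np.unravel_index(np.argmax(image), image.shape)
    return np.array([x, y])

def drag(latent, target, lr=0.01, steps=500, tol=0.05):
    """Nudge the latent code step by step until the blob centre reaches target."""
    latent = np.asarray(latent, dtype=float).copy()
    target = np.asarray(target, dtype=float)
    for _ in range(steps):
        error = blob_centre(latent) - target
        if np.linalg.norm(error) < tol:
            break
        # Gradient of ||A @ latent + B - target||^2 with respect to latent.
        latent -= lr * 2 * A.T @ error
    return latent

z0 = np.zeros(2)                          # starting latent code
start = track_handle(generate(z0))        # handle point before the edit
z1 = drag(z0, target=(24, 10))            # "drag" the handle to (24, 10)
end = track_handle(generate(z1))          # handle point after the edit
print(start.tolist(), "->", end.tolist())  # [16, 16] -> [24, 10]
```

Because the edit happens in the latent code, the final picture is always something the generator itself produced. In this toy that only means a clean blob in a new place, but it is the same reason a drag-style edit on a real generative model can stay realistic-looking instead of appearing stretched.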
What else can it do?
In a larger context, it could be used in conjunction with a text-to-image generative AI tool like Midjourney or Runway. If the output from your prompts isn’t quite what you want, you could use DragGAN to edit it faster and more efficiently than you could in a pro-level editing suite.
Other examples explained in the research paper include changing the height of a mountain, moving the position of a model and resizing her clothes, and opening or closing a lion’s mouth to indicate a roar. And where an element of the picture wouldn’t normally be visible, the AI can fill in the gaps.
“Our approach can hallucinate occluded content, like the teeth inside a lion’s mouth, and can deform following the object’s rigidity, like the bending of a horse leg,” the team added.
It’s not yet clear when the DragGAN tool will be released for mainstream use, but a note on the team’s GitHub page suggests the code will be made available in June 2023.