A new artificial intelligence (AI) tool can generate images in under two seconds — and it doesn't need expensive hardware to run.
South Korean scientists used a technique called knowledge distillation to compress an open source (or publicly available) image generation model known as Stable Diffusion XL, which has 2.56 billion parameters, or variables the AI uses to learn during training.
The smallest version of the new model, known as "KOALA", has just 700 million parameters, meaning it's lean enough to run quickly and without needing expensive and energy-intensive hardware.
Knowledge distillation works by transferring what a large model has learned to a smaller one, ideally without compromising performance. The benefit of a smaller model is that it needs less time and computation to generate an answer.
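In broad strokes, the idea looks like the sketch below: a small "student" network is trained to reproduce the outputs of a larger, frozen "teacher" network on the same inputs. This is a minimal, generic illustration in PyTorch, not ETRI's actual training setup; the network sizes, loss function and random data are placeholder assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical teacher (large, pretrained) and student (much smaller) networks.
teacher = nn.Sequential(nn.Linear(64, 512), nn.ReLU(), nn.Linear(512, 64))
student = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64))

# The teacher is frozen during distillation; only the student is updated.
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)

optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()  # penalize the student for deviating from the teacher

for step in range(1000):          # toy training loop on random inputs
    x = torch.randn(32, 64)       # a batch of inputs
    with torch.no_grad():
        target = teacher(x)       # the teacher's output acts as the label
    loss = loss_fn(student(x), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After enough training, the student approximates the teacher's behavior while carrying far fewer parameters, which is what makes it cheaper to run.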
The tool can run on low-cost graphics processing units (GPUs) and needs roughly 8GB of RAM to process requests — versus larger models, which need high-end industrial GPUs.
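Simple arithmetic gives a rough sense of why the smaller parameter count matters for memory. The sketch below assumes 16-bit weights and counts only weight storage, ignoring activations and other overhead, which is why the 8GB figure cited for running the model is higher.

```python
# Back-of-the-envelope estimate of weight storage (illustrative assumptions only).
params_koala = 700_000_000        # smallest KOALA model
params_sdxl = 2_560_000_000       # Stable Diffusion XL
bytes_per_param_fp16 = 2          # 16-bit floating point

print(f"KOALA weights: ~{params_koala * bytes_per_param_fp16 / 1e9:.1f} GB")
print(f"SDXL weights:  ~{params_sdxl * bytes_per_param_fp16 / 1e9:.1f} GB")
# KOALA weights: ~1.4 GB
# SDXL weights:  ~5.1 GB
```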
The team published their findings Dec. 7, 2023, in a paper uploaded to the preprint database arXiv. They have also made their work available via the open source AI repository Hugging Face.
The Electronics and Telecommunications Research Institute (ETRI), the institution behind the new models, has created five versions: three of the "KOALA" image generator, which produces images from text prompts, and two of "Ko-LLaVA", which can answer questions about images or video.
When they tested KOALA, it generated an image based on the prompt "a picture of an astronaut reading a book under the moon on Mars" in 1.6 seconds. OpenAI's DALL·E 2 generated an image based on the same prompt in 12.3 seconds, and DALL·E 3 generated it in 13.7 seconds, according to a statement.
The scientists now plan to integrate the technology they've developed into existing image generation services, education services, content production and other lines of business.