What you need to know
- Google introduced its newest AI, SIMA, a "generalist agent" designed to help users complete tasks in a video game.
- SIMA is language-driven: it needs only what the user sees on screen and the user's instruction to complete tasks like gathering resources.
- DeepMind states that it has used nine games to train SIMA thus far, but more work remains before it can handle complex tasks and instructions.
Google DeepMind has announced its newest project, a "generalist AI agent" that aims to assist users in carrying out tasks while playing a game.
According to DeepMind, the latest AI is called the Scalable Instructable Multiworld Agent, or "SIMA" for short. Google's AI-focused division states SIMA can "perceive and understand a variety of environments, then take actions to achieve an instructed goal."
DeepMind adds that, to work, SIMA needs only two things: the image on the user's screen and "natural-language" instructions provided by the user. The AI is said to use standard keyboard and mouse inputs to move the user's character within the game world. Because of this, the post adds, SIMA can "interact with any virtual environment."
SIMA has been tested on 600 basic skills, such as turning left, climbing ladders, opening a game's pause menu for settings, and more. Google states that SIMA can also perform "simple tasks" that take about 10 seconds to complete. A few of these tasks involve driving a car in Goat Simulator 3 and walking to your spaceship in No Man's Sky.
To get SIMA to where it is now, Google states it partnered with game developers like Hello Games, the makers of No Man's Sky, and Tuxedo Labs, who made Teardown. SIMA was also taught using an environment built in the Unity engine called "Construction Lab." Through this, SIMA learned how to manipulate objects and gained a better understanding of the physical world inside a video game.
Google also leaned on real-life gamers as an initial approach to understanding how SIMA could work. Researchers observed players in pairs, with one player playing the game while the other gave instructions on how to complete a task.
In total, Google DeepMind trained SIMA using nine different games, each within its own genre. During its study, the division found that an agent trained across multiple games delivered better results than one trained on a single game. Further research showed that SIMA's reliance on language is paramount to its performance.
Without user input, SIMA is said to behave "in an appropriate but aimless manner." DeepMind observed SIMA exploring and gathering resources, but it would not follow the game's objective unless explicitly told to do so.
SIMA is still in its infancy, but Google is seemingly confident in its "language-driven" capabilities to assist gamers. Still, more research is reportedly necessary, as Google wants SIMA to understand "higher-level language instructions to achieve more complex goals."