Google DeepMind’s new AI can follow commands inside 3D games it hasn’t seen before
Google DeepMind has unveiled new research highlighting an AI agent that's able to carry out a swath of tasks in 3D games it hasn't seen before. The team has long been experimenting with AI models that can win at games such as Go and chess, and even learn games without being told their rules. Now, for the first time, according to DeepMind, an AI agent has shown it's able to understand a wide range of gaming worlds and carry out tasks within them based on natural-language instructions.
The researchers teamed up with studios and publishers such as Hello Games (No Man's Sky), Tuxedo Labs (Teardown) and Coffee Stain (Valheim and Goat Simulator 3) to train the Scalable Instructable Multiworld Agent (SIMA) on nine games. The team also used four research environments, including one built in Unity in which agents are instructed to form sculptures using building blocks. This gave SIMA, described as "a generalist AI agent for 3D virtual settings," a range of environments and settings to learn from, with a variety of graphics styles and perspectives (first- and third-person).
"Each game in SIMA’s portfolio opens up a new interactive world, including a range of skills to learn, from simple navigation and menu use, to mining resources, flying a spaceship or crafting a helmet," the researchers wrote in a blog post. Learning to follow directions for such tasks in video game worlds could lead to more useful AI agents in any environment, they noted.
The researchers recorded humans playing the games and noted the keyboard and mouse inputs used to carry out actions. They used this information to train SIMA, which has "precise image-language mapping and a video model that predicts what will happen next on-screen." The AI is able to comprehend a range of environments and carry out tasks to accomplish a certain goal.
The researchers say SIMA doesn't need a game's source code or API access; it works on commercial versions of a game. It also needs just two inputs: what's shown on screen and directions from the user. Since it uses the same keyboard and mouse input method as a human, DeepMind claims SIMA can operate in nearly any virtual environment.
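To make that interface concrete, here is a minimal Python sketch of an agent that takes only a screen frame and a text instruction and returns keyboard and mouse output. It is not DeepMind's code: every name in it (Observation, Action, Agent, NoOpAgent, run_episode) is hypothetical, and the stand-in policy does nothing where a real agent would run a trained model.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class Observation:
    frame: bytes       # what's shown on screen (the raw-bytes format is an assumption)
    instruction: str   # natural-language direction from the user


@dataclass
class Action:
    keys: List[str] = field(default_factory=list)  # keys pressed on this step
    mouse_dx: int = 0                              # horizontal mouse movement
    mouse_dy: int = 0                              # vertical mouse movement


class Agent(Protocol):
    def act(self, obs: Observation) -> Action: ...


class NoOpAgent:
    """Hypothetical stand-in policy: ignores its inputs and presses nothing."""

    def act(self, obs: Observation) -> Action:
        return Action()


def run_episode(agent: Agent, frames: List[bytes], instruction: str) -> List[Action]:
    """Feed each frame plus the same instruction to the agent and collect its actions."""
    return [agent.act(Observation(frame=f, instruction=instruction)) for f in frames]


actions = run_episode(NoOpAgent(), [b"", b""], "turn right")
```

Nothing in the sketch touches a game's internals, which is the point of the design the researchers describe: anything that can show a screen and accept keyboard and mouse input could in principle sit on the other side of it.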
The agent is evaluated on hundreds of basic skills that can be carried out within 10 seconds or so across several categories, including navigation ("turn right"), object interaction ("pick up mushrooms") and menu-based tasks, such as opening a map or crafting an item. Eventually, DeepMind hopes to be able to order agents to carry out more complex and multi-stage tasks based on natural-language prompts, such as "find resources and build a camp."
In terms of performance, SIMA fared well on a number of criteria. As a baseline, the researchers trained an agent on a single game (let's say Goat Simulator 3, for the sake of clarity) and had it play that same title. A SIMA agent that was trained on all nine games performed far better than the agent trained on just Goat Simulator 3.
What's especially interesting is that a version of SIMA trained on eight of the games and then tested on the ninth performed nearly as well, on average, as an agent trained only on that ninth game. "This ability to function in brand new environments highlights SIMA’s ability to generalize beyond its training," DeepMind said. "This is a promising initial result, however more research is required for SIMA to perform at human levels in both seen and unseen games."
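The comparison described here, holding one game out and measuring against a single-game baseline, can be sketched in a few lines of Python. This is an illustration under assumptions rather than DeepMind's evaluation code: the function names are hypothetical, the game names are placeholders because the article doesn't list all nine titles, and no real scores are included.

```python
from typing import List, Tuple


def leave_one_out_splits(games: List[str]) -> List[Tuple[List[str], str]]:
    """Pair each game with the list of all the other games used for training."""
    return [([g for g in games if g != held_out], held_out) for held_out in games]


def relative_score(agent_score: float, specialist_score: float) -> float:
    """Express an agent's score as a fraction of the single-game baseline."""
    if specialist_score <= 0:
        raise ValueError("specialist baseline must be positive")
    return agent_score / specialist_score


# Placeholder names standing in for the nine commercial games.
games = [f"game_{i}" for i in range(1, 10)]
splits = leave_one_out_splits(games)
```

Each entry in splits gives eight training games and one unseen test game. Under this framing, a relative score close to 1.0 on the unseen game would correspond to performing "nearly as well" as the agent trained only on that game.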
For SIMA to be truly successful, though, language input is required. In tests where an agent wasn't provided with language training or instructions, it (for instance) carried out the common action of gathering resources instead of walking where it was told to. In such cases, SIMA "behaves in an appropriate but aimless manner," the researchers said. So, it's not just us mere mortals. Artificial intelligence models sometimes need a little nudge to get a job done properly too.
DeepMind notes that this is early-stage research and that the results "show the potential to develop a new wave of generalist, language-driven AI agents." The team expects the AI to become more versatile and generalizable as it's exposed to more training environments. The researchers hope future versions of the agent will improve on SIMA's understanding and its ability to carry out more complex tasks. "Ultimately, our research is building towards more general AI systems and agents that can understand and safely carry out a wide range of tasks in a way that is helpful to people online and in the real world," DeepMind said.
This article originally appeared on Engadget at https://www.engadget.com/google-deepminds-new-ai-can-follow-commands-inside-3d-games-it-hasnt-seen-before-140341369.html?src=rss