DeepMind’s Agent57 AI agent can best human players across a suite of 57 Atari games


Progress in artificial intelligence agents is frequently measured by their performance in games, and there’s a good reason for that: games tend to offer a wide proficiency curve, being relatively simple to grasp at a basic level but difficult to master, and they almost always have a built-in scoring system to evaluate performance. DeepMind’s agents have tackled the board game Go, as well as the real-time strategy video game StarCraft, but the Alphabet company’s most recent feat is Agent57, a learning agent that can beat the average human on each of 57 Atari games with a wide range of difficulty, characteristics and gameplay styles.
Being better than humans at 57 Atari games may seem like an odd benchmark against which to measure the performance of a deep learning agent, but it’s actually a standard that goes all the way back to 2012, built on a selection of Atari classics including Pitfall, Solaris, Montezuma’s Revenge and many others. Taken together, these games span a broad range of difficulty levels and require a range of different strategies to succeed.
That’s a great type of challenge for a deep learning agent, because the goal is not to build something that can find one effective strategy that maximizes its chances of success every time it plays a single game. Instead, researchers build these agents and set them to these tasks in order to develop something that can learn across multiple, shifting scenarios and conditions. The long-term aim is a learning agent that approaches general AI, or AI that is more human in being able to apply its intelligence to any problem put before it, including challenges it has never encountered before.
DeepMind’s Agent57 is remarkable because it performs better than human players on each of the 57 games in the Atari57 set. Previous agents have been able to beat human players on average, but that’s because they were extremely good at some of the simpler games that essentially run on a simple action-reward loop, while being terrible at games that require more advanced play, including long-term exploration and memory, like Montezuma’s Revenge.
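To see why an average can hide failures on individual games, here is a minimal sketch in Python. It assumes the commonly used human-normalized score, where 0% is a random player and 100% is the human reference; the game names and all raw scores are made-up values for illustration, not results from Agent57 or any other agent.

```python
def human_normalized(agent, random_play, human):
    """Score as a percentage: 0 = random play, 100 = human reference."""
    return 100.0 * (agent - random_play) / (human - random_play)


# Hypothetical raw scores: (agent, random play, human). Not real results.
raw_scores = {
    "simple_game_1": (9000.0, 0.0, 1000.0),
    "simple_game_2": (500.0, 100.0, 300.0),
    "exploration_game": (0.0, 0.0, 4000.0),
}

normalized = {
    game: human_normalized(*scores) for game, scores in raw_scores.items()
}

mean_score = sum(normalized.values()) / len(normalized)
worst_score = min(normalized.values())

# An agent can clear the human bar on average while failing a game outright.
beats_humans_on_average = mean_score > 100.0
beats_humans_on_every_game = worst_score > 100.0
```

In this toy case the mean is well above 100% because of one very large score, yet the hard-exploration game sits at 0%. Checking the worst game rather than the mean is the stricter test the article describes.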
The DeepMind team addressed this by building a distributed agent, with different computers tackling different aspects of the problem. Some were tuned to focus on novelty rewards (encountering things they haven’t encountered before), with both short- and long-term time horizons for when the novelty value resets. Others sought out simpler exploits, figuring out which repeated pattern provided the biggest reward. All the results are then combined and managed by an agent equipped with a meta-controller that allows it to weigh the costs and benefits of different approaches based on which game it encounters.
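One way to picture a meta-controller of this kind is as a bandit that learns which exploration setting pays off in the current game. The sketch below is a toy under stated assumptions, not DeepMind’s implementation: the upper-confidence-bound rule, the class name `UCBMetaController`, the `NOVELTY_WEIGHTS` values and the simulated game are all hypothetical choices made for illustration.

```python
import math
import random


class UCBMetaController:
    """Toy bandit that picks one of several exploration settings per episode."""

    def __init__(self, n_arms, c=1.0):
        self.counts = [0] * n_arms   # episodes played with each setting
        self.means = [0.0] * n_arms  # running mean return of each setting
        self.c = c                   # strength of the exploration bonus

    def select(self):
        # Try every setting once before relying on confidence bounds.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        total = sum(self.counts)
        bounds = [
            mean + self.c * math.sqrt(math.log(total) / count)
            for mean, count in zip(self.means, self.counts)
        ]
        return max(range(len(bounds)), key=bounds.__getitem__)

    def update(self, arm, episode_return):
        # Incremental update of the running mean for the chosen setting.
        self.counts[arm] += 1
        self.means[arm] += (episode_return - self.means[arm]) / self.counts[arm]


# Hypothetical settings: how strongly a policy values novelty (0 = pure exploit).
NOVELTY_WEIGHTS = [0.0, 0.1, 0.3]


def simulated_return(novelty_weight, rng):
    # Stand-in for a game that rewards exploration: more novelty, more return.
    return novelty_weight + rng.gauss(0.0, 0.05)


def run(n_episodes=500, seed=0):
    rng = random.Random(seed)
    controller = UCBMetaController(len(NOVELTY_WEIGHTS))
    for _ in range(n_episodes):
        arm = controller.select()
        controller.update(arm, simulated_return(NOVELTY_WEIGHTS[arm], rng))
    return controller
```

Calling `run()` and inspecting `controller.counts` shows most episodes going to the highest novelty weight, because that is what this simulated game rewards. In a game that pays out through a simple repeated pattern, the same controller would drift toward the exploit-only setting instead.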
In the end, Agent57 is an accomplishment, but the team says it can still be improved in a few ways. First, it’s incredibly computationally expensive to run, so they will seek to streamline that. Second, it’s actually not as good at some of the simpler games as some simpler agents, even though it excels at the top five games in terms of challenge to previous intelligent agents. The team says it has ideas for how to close that gap on the simpler games where less sophisticated agents still do better.