The machine is better than 99.8 percent of players on Battle.net, “under professionally approved conditions.”
The rise of the machines drew one step closer to reality today, as researchers announced that the Google DeepMind-powered AlphaStar AI has now achieved Grandmaster ranking with all three races in StarCraft 2. AlphaStar dominated pro StarCraft 2 players in January, but only as a Protoss player, and under more favorable conditions. Now, researchers say the AI can play at the Grandmaster level “under professionally approved conditions” and with the same constraints as human players, including viewing the game through a camera and with even tighter restrictions on the frequency of its actions.
“Many real-world applications require artificial agents to compete and coordinate with other agents in complex environments. As a stepping stone to this goal, the domain of StarCraft has emerged as an important challenge for artificial intelligence research, owing to its iconic and enduring status among the most difficult professional esports and its relevance to the real world in terms of its raw complexity and multi-agent challenges,” the Nature research abstract explains.
“Over the course of a decade and numerous competitions, the strongest agents have simplified important aspects of the game, utilized superhuman capabilities, or employed hand-crafted sub-systems. Despite these advantages, no previous agent has come close to matching the overall skill of top StarCraft players. We chose to address the challenge of StarCraft using general-purpose learning methods that are in principle applicable to other complex domains: a multi-agent reinforcement learning algorithm that uses data from both human and agent games within a diverse league of continually adapting strategies and counter-strategies, each represented by deep neural networks. We evaluated our agent, AlphaStar, in the full game of StarCraft 2, through a series of online games against human players. AlphaStar was rated at Grandmaster level for all three StarCraft races and above 99.8 percent of officially ranked human players.”
The researchers explained in a DeepMind blog post that they used a mix of “general-purpose machine learning techniques” to train AlphaStar, including neural networks, self-play via reinforcement learning, multi-agent learning, and imitation learning, each with its own inherent strengths and weaknesses. One particularly interesting step in that process was the development of a group of self-play “agents” called the League. Self-play agents typically do their utmost to win at all times, but that’s not necessarily the best way to teach or learn.
“In the real world, a player trying to improve at StarCraft may choose to do so by partnering with friends so that they can train particular strategies. As such, their training partners are not playing to win against every possible opponent, but are instead exposing the flaws of their friend, to help them become a better and more robust player,” the research team explained.
“The key insight of the League is that playing to win is insufficient: instead, we need both main agents whose goal is to win versus everyone, and also exploiter agents that ‘take one for the team’, focusing on helping the main agent grow stronger by exposing its flaws, rather than maximizing their own win rate. Using this training method, the current League learns all its complex StarCraft II strategy in an end-to-end fashion—as opposed to the earlier incarnation of our work, which stitched together agents produced by a variety of methods and algorithms.”
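The main-agent/exploiter split described above can be sketched in a few lines of code. This is purely an illustrative toy (the names, roles, and matchmaking logic are assumptions for the example, not AlphaStar's actual implementation): main agents draw training opponents from the whole league, while exploiters only target main agents to surface their flaws.

```python
import random

class Agent:
    def __init__(self, name, role):
        self.name = name
        self.role = role    # "main" or "exploiter" (illustrative roles)
        self.history = []   # opponents this agent has trained against

def pick_opponent(agent, league):
    """Choose a training opponent according to the agent's league role."""
    if agent.role == "main":
        # Main agents play to win versus everyone else in the league.
        candidates = [a for a in league if a is not agent]
    else:
        # Exploiters "take one for the team": they train only against
        # main agents, exposing weaknesses rather than chasing their
        # own league-wide win rate.
        candidates = [a for a in league if a.role == "main"]
    return random.choice(candidates)

league = [
    Agent("main_0", "main"),
    Agent("exploiter_0", "exploiter"),
    Agent("exploiter_1", "exploiter"),
]

# One round of matchmaking: every agent records who it would train against.
for agent in league:
    opponent = pick_opponent(agent, league)
    agent.history.append(opponent.name)
```

In the real system the league also grows over time, with frozen past versions of agents kept around as opponents so that strategies are never "forgotten"; the sketch omits that for brevity.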
Reinforcement learning also played a major part in developing the AI. The “enormous action space” of StarCraft 2 rendered many existing reinforcement learning techniques ineffective, but “AlphaStar uses a new algorithm for off-policy reinforcement learning, which allows it to efficiently update its policy from games played by an older policy.”
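The core idea of off-policy learning, updating the current policy from data generated by an older one, can be shown with a toy two-armed bandit. This sketch uses a plain importance-weighted policy gradient with clipped ratios; AlphaStar's actual algorithm is considerably more involved, and every function and constant here is an assumption made for illustration.

```python
import math

def softmax(logits):
    """Convert logits to a probability distribution over actions."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def off_policy_update(logits, behaviour_probs, data, lr=0.1, clip=1.0):
    """One gradient step from stale (action, reward) data.

    `behaviour_probs` is the older policy that actually collected the
    data; the importance ratio pi(a)/mu(a) (clipped for stability)
    corrects for the mismatch with the current policy.
    """
    probs = softmax(logits)
    grads = [0.0] * len(logits)
    for action, reward in data:
        rho = min(clip, probs[action] / behaviour_probs[action])
        # Gradient of log pi(action) for a softmax policy.
        for i in range(len(logits)):
            indicator = 1.0 if i == action else 0.0
            grads[i] += rho * reward * (indicator - probs[i])
    n = len(data)
    return [l + lr * g / n for l, g in zip(logits, grads)]

# Stale experience: an old 50/50 policy found that only arm 1 pays off.
behaviour = [0.5, 0.5]
data = [(0, 0.0), (1, 1.0), (1, 1.0), (0, 0.0)]

logits = [0.0, 0.0]
for _ in range(50):
    logits = off_policy_update(logits, behaviour, data)
# The current policy now prefers arm 1, learned entirely from old data.
```

The clipping of the importance ratio is the key stability trick: without it, a large gap between the old and new policies can blow up the gradient estimates, which is exactly the regime AlphaStar operates in when replaying games from much older policy versions.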
I will admit that most of the technical verbiage goes way over my head—the first thing I think about when I see the word “agent” is a guy talking into his shoe—but the net result of all this expended brainpower is a kickass StarCraft 2 player. StarCraft 2 pro Dario “TLO” Wünsch, who’s been working on the project, said AlphaStar’s gameplay is “incredibly impressive,” but not “superhuman—certainly not on a level that a human couldn’t theoretically achieve. Overall, it feels very fair – like it is playing a ‘real’ game of StarCraft.”
But of course it’s not just about StarCraft 2: AlphaStar’s ascendance represents a significant step forward in AI research that could have repercussions that go far beyond videogames.
“Ultimately, these results provide strong evidence that general-purpose learning techniques can scale AI systems to work in complex, dynamic environments involving multiple actors,” the researchers wrote. “The techniques we used to develop AlphaStar will help further the safety and robustness of AI systems in general, and, we hope, may serve to advance our research in real-world domains.”