Almost a decade ago, the viral sensation "Twitch Plays Pokémon" brought together over a million players who simultaneously guided Pokémon Red's single pixelated character using their individual keystrokes. Today, as technology evolves like a Magikarp transforming into a Gyarados, a new question arises: can AI conquer the world of Pokémon?
Seattle-based software engineer Peter Whidden has spent the past few years training a reinforcement learning algorithm to play the original Pokémon game. Over that time, his AI has logged more than 50,000 hours of gameplay. Whidden recently shared a 33-minute YouTube video chronicling the AI's journey, which drew 2.2 million views within nine days.
“What’s been incredibly exciting is the level of engagement it has sparked,” Whidden said. He has made the code public on GitHub, with detailed instructions so that others can train an AI of their own. “Many people seem genuinely interested in the process of creating and designing their own AI.” One enthusiastic developer has already adapted his code to work with Pokémon Crystal, another classic Game Boy title.
The AI learns through reinforcement learning, a Pavlovian setup in which it receives point-based rewards for leveling up Pokémon, exploring new territory, winning battles, and defeating gym leaders. Maximizing those points doesn't always line up with actually progressing through the game, however. The quirky failures that result have given the project much of its charm, and likely contributed to the video's viral success.
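To make the idea of point-based rewards concrete, here is a minimal Python sketch of how such a score might be computed. It is not Whidden's actual code: the `GameState` fields, the `WEIGHTS` values, and the function names are all assumptions chosen for illustration. The only thing taken from the description above is the list of rewarded behaviors.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class GameState:
    """Hypothetical snapshot of the quantities the reward depends on."""
    party_levels: Tuple[int, ...]  # level of each Pokémon in the party
    places_seen: int               # count of distinct locations visited
    battles_won: int
    gym_leaders_beaten: int


# Illustrative weights only; the real project's values are not given in the article.
WEIGHTS = {"level": 1.0, "explore": 0.1, "battle": 2.0, "gym": 10.0}


def score(state: GameState) -> float:
    """Total points for a state: a weighted sum of the rewarded quantities."""
    return (
        WEIGHTS["level"] * sum(state.party_levels)
        + WEIGHTS["explore"] * state.places_seen
        + WEIGHTS["battle"] * state.battles_won
        + WEIGHTS["gym"] * state.gym_leaders_beaten
    )


def step_reward(previous: GameState, current: GameState) -> float:
    """Reward for one step: how much the total score changed."""
    return score(current) - score(previous)
```

Because the agent only sees the score, anything that raises it counts as success, whether or not a human player would call it progress.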
In one comical episode, the AI becomes transfixed by the water in Pallet Town, the game's starting area, and fails to progress. Surrounded by animated water, grass, and wandering NPCs, the AI treats each changing frame as a new discovery, even while it stands still without having caught a single Pokémon. This AI isn't in a hurry to "catch 'em all"; perhaps it’s simply savoring the ecological beauty of the Kanto region, or maybe it's pondering the moral implications of battling charming creatures.
“As it turns out, for our AI, just admiring the scenery is more rewarding than exploring the rest of the world,” Whidden says in his video. “This reflects a paradox we often face in real life: while curiosity drives significant discoveries, it also exposes us to distractions and challenges.”
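The scenery problem is easy to reproduce in a toy setting. The Python sketch below assumes an exploration bonus that pays out whenever the screen shows a frame the agent has not seen before; the exact-match comparison, the three-number `toy_frame`, and the bonus size are all simplifications invented for this example, not details of Whidden's implementation.

```python
def novelty_reward(frame, seen_frames, bonus=1.0):
    """Pay `bonus` the first time an exact frame is observed, else nothing."""
    key = tuple(frame)
    if key in seen_frames:
        return 0.0
    seen_frames.add(key)
    return bonus


def toy_frame(player_x, tick):
    """Stand-in for a screen: (player position, water phase, NPC position).

    The water cycles through 4 animation phases and a wandering NPC through
    5 positions, so the screen keeps changing even if the player never moves.
    """
    return (player_x, tick % 4, tick % 5)


# The player stands still at x = 0 for 20 ticks and is rewarded every time.
seen = set()
idle_reward = sum(novelty_reward(toy_frame(0, t), seen) for t in range(20))
```

In this toy version the motionless player collects a bonus on all 20 ticks, because the water and the NPC never line up the same way twice in that window. A bonus tied to where the player is, rather than to what the screen looks like, would not have this loophole.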
The AI's misadventures keep tugging at viewers' emotions: later in its journey, it has a traumatic experience at the Pokémon Center. Part of the AI's reward depends on the combined levels of the Pokémon in its party. During a bout of random button pressing, however, it unwittingly deposits a Pokémon into PC storage, and its total level drops sharply. The team, a Pidgey and a mysterious creature dubbed “AAAAAAAAAA,” had totaled 25 levels; once Pidgey was placed in the PC, the total plummeted to just 12.
“Although the AI lacks human emotions, a single event with significant rewards can profoundly influence its behavior,” Whidden explains. “In this instance, losing a Pokémon just once creates a lasting negative connection with the Pokémon Center, leading the AI to avoid it in future sessions.”
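The arithmetic behind that one bad experience can be worked through in a few lines of Python. The individual levels used here (13 for Pidgey, 12 for its teammate) are inferred from the totals of 25 and 12 reported above, and the function names are hypothetical.

```python
def party_level_total(levels):
    """Sum of the levels of every Pokémon currently in the party."""
    return sum(levels)


def level_reward_delta(before, after):
    """Change in the level-based score between two party snapshots."""
    return party_level_total(after) - party_level_total(before)


# Levels inferred from the reported totals: 25 before the deposit, 12 after.
party_before = {"AAAAAAAAAA": 12, "Pidgey": 13}
party_after = {"AAAAAAAAAA": 12}

delta = level_reward_delta(party_before.values(), party_after.values())
```

Here `delta` comes out to -13, a loss far larger than the gain from any single level-up, which is why one accidental deposit can leave such a lasting mark on the AI's behavior.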
For all its setbacks and its appreciation of Pallet Town's scenery, the AI remains fundamentally a computer. Early on, it struggled to interpret in-game text and got stuck at critical junctures. For example, in the second town of Pokémon Red the player receives an item that must be taken back to Professor Oak in Pallet Town, and the AI had difficulty making the return trip. To get around this, Whidden modified the program so that each game starts after the delivery, with Squirtle as the starter Pokémon for an easier early-game experience.
“In the video, the farthest [the AI] reaches is Mt. Moon, nestled between the first and second gyms,” Whidden remarked. Caves are notoriously challenging to navigate, even for human players. Recently, Whidden adjusted the reward system and experimented with a new learning algorithm, enabling the AI to escape the cave and successfully reach Cerulean City.
Other researchers have applied reinforcement learning to games as well, most famously DeepMind, whose AlphaGo defeated a professional Go player. Whidden's tutorial, however, has captured widespread interest thanks to his ability to demystify complex concepts through the beloved lens of Pokémon.