Google Gemini "panicked" while playing Pokémon
Google DeepMind’s Gemini 2.5 Pro model makes noticeably worse decisions while playing Pokémon when its Pokémon are on the verge of defeat, TechCrunch reports.
According to the study, in critical situations, Gemini 2.5 Pro stops using available tools, leading to "a noticeable deterioration in reasoning ability." This behavior resembles a human response to stress.
The model's gameplay is streamed live on the Twitch channel Gemini Plays Pokémon. A similar channel features Anthropic's Claude. Both streams show explanations of the decisions the AI makes during the game.
The model takes a long time to complete the game: what a child would finish in dozens of hours takes Gemini 2.5 Pro hundreds. Still, the AI sometimes makes notable progress. For example, it solved one of the game's spatial puzzles on the first try, based only on a text description of the physics of objects.
The AI also sometimes chooses strange strategies. For example, Claude, stuck in the Mt. Moon cave, intentionally "killed" all of its Pokémon, expecting this to teleport it to the other end of the cave.
Despite their many shortcomings, such experiments have research value. Games provide a safe and controlled environment to test models’ ability to adapt, plan, and think strategically in complex situations. Google also notes that Gemini can independently create agent-like tools to perform specific tasks, potentially without human intervention.