Google AI Misses IMO Gold by a Hair, Solving One Problem in Just 19 Seconds

Google DeepMind recently achieved a remarkable milestone: its latest AI systems earned a silver medal at the International Mathematical Olympiad (IMO). The AI solved four of the six problems, achieving a perfect score on each, and finished just one point short of a gold medal. Notably, it completed the fourth problem in an astounding 19 seconds, impressing the human graders with both the quality and the speed of its solutions.

In detail, DeepMind's AI systems, AlphaProof and AlphaGeometry 2, tackled this year's actual competition problems, securing top marks and showcasing extraordinary mathematical reasoning. This year's IMO featured six problems spanning algebra, combinatorics, geometry, and number theory. Among 609 participants, only 58 earned gold medals; human competitors submitted their answers in two timed sessions of 4.5 hours each. Interestingly, while the AI solved one problem within minutes, it needed up to three days to crack others, far beyond the human time limit and a sign of the problems' difficulty.

AlphaProof, a novel system leveraging reinforcement learning for formal mathematical reasoning, and AlphaGeometry 2, an advanced geometry problem-solving system, are the duo behind this achievement. These systems have shattered previous limitations associated with AI in mathematics, where reasoning abilities and training datasets were often insufficient.

The contributions of renowned mathematicians Professor Timothy Gowers and Dr. Joseph Myers were instrumental in evaluating the AI's performance. AlphaProof correctly solved two algebra problems and one number theory problem, while AlphaGeometry 2 successfully addressed the geometry question. The only unsolved problems were the two in combinatorics.

AlphaProof employs the formal language Lean for proving mathematical propositions, coupling a pre-trained language model with the AlphaZero reinforcement learning algorithm, which previously mastered games such as chess and Go. Because human-written formal proofs are scarce, the system bridges this data gap by automatically translating natural-language problems into formal statements, generating a large library of problems on which it can train and verify its reasoning.
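To give a flavor of what "formal mathematical reasoning" means here, below is a toy statement and proof written in Lean 4 (assuming Mathlib for the `obtain` tactic). This is only an illustrative sketch of the proof style Lean checks mechanically, not an example from DeepMind's actual system:

```lean
-- Toy theorem: the sum of two even natural numbers is even.
-- Evenness is stated explicitly as "there exists k with n = 2 * k".
theorem even_add_even (a b : Nat)
    (ha : ∃ k, a = 2 * k) (hb : ∃ k, b = 2 * k) :
    ∃ k, a + b = 2 * k := by
  obtain ⟨m, hm⟩ := ha        -- unpack the witness for a
  obtain ⟨n, hn⟩ := hb        -- unpack the witness for b
  -- the witness for a + b is m + n; distributivity closes the goal
  exact ⟨m + n, by rw [hm, hn, Nat.mul_add]⟩
```

Every step of such a proof is verified by Lean's kernel, which is what allows an AlphaZero-style search to receive a reliable reward signal: a candidate proof either type-checks or it does not.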

Meanwhile, AlphaGeometry 2 is a hybrid neural-symbolic system built on the Gemini language model and trained from scratch on a significantly larger dataset than its predecessor. This enables it to tackle complex geometric problems involving the movement of objects and equations of angles, ratios, and distances. Its speed was showcased at the IMO, where it solved the geometry problem in just 19 seconds after receiving the formalized question.

With these advancements, AI continues to push the boundaries of what is possible in mathematics, captivating the academic community and inspiring future developments in the field.
