The True Identity of the AI That Silenced 30 Geniuses in 10 Minutes: The Full Story of the Shocking Conference in California

July 13, 2025, 12:50

1. Introduction

On May 17-18, 2025, a secretive mathematics summit, the "FrontierMath Symposium," was held in a corner of the University of California, Berkeley. Thirty leading researchers in fields such as number theory, geometry, and topology were invited. The challenge they faced came not from humans but from OpenAI's latest "reasoning" model, o4-mini. Just 48 hours later, many participants agreed that "AI is stepping into the realm of mathematical genius." Live Science

2. What is o4-mini?

o4-mini is a lightweight LLM released in April 2025 that aims for the "deep reasoning" the earlier GPT-4 series struggled with. OpenAI, in collaboration with the nonprofit Epoch AI, constructed an unpublished benchmark of 300 questions called FrontierMath to measure its performance. On this benchmark, o4-mini solved **20%** of the difficult questions, where previous models had answered fewer than 2% correctly. Scientific American

3. Behind the Scenes of the Secret Meeting

NDA and Signal
All participating mathematicians signed an NDA. To prevent the problems from being included in training data, communication was conducted solely through the encrypted chat app Signal. Live Science
Prize of $7,500
As an incentive, a reward was offered to any participant who could devise a problem that the AI could not solve. Scientific American
Astonishing 10-Minute Solution
Professor Ken Ono, an authority in number theory, presented an "unsolved problem at the doctoral level," but o4-mini solved it in 10 minutes. Ono reportedly chuckled, saying it had mastered "proof by intimidation." Live Science, Scientific American

4. Humanity vs. AI: The Outcome

Over the two-day battle, the humans managed to completely stump the AI on only 10 problems. One mathematician, Yang-Hui He, described it as "no longer just a brilliant graduate student, but at the level of a collaborative researcher." Scientific American

5. Explosive Reactions on Social Media

Platform	Representative Voices	Overview
Reddit /r/AI	"Epoch AI is in cahoots with OpenAI. It smells like promotion." – sandwichtank	Skepticism about corporate interests and cautious opinions. Reddit
LinkedIn	"AI has become a 'thinking colleague,' not just a calculator." – Former government engineer Keith King	The impact is greater among industry professionals. LinkedIn
X (formerly Twitter)	"#o4mini has crossed the Rubicon in the world of mathematics." – Techmeme	Tech influencers spreading the news. X (formerly Twitter)

Proponents welcome an "explosive increase in AI research efficiency," while skeptics worry about an "explosive increase in verification costs" if the reasoning turns out to be flawed even when the final results are correct.

6. How Will Mathematical Research Change?

Role Differentiation ― Mathematicians will focus on problem setting and aesthetic evaluation of ideas, while AI will handle computation and proof generation.
Redesigning Educational Curricula ― Tasks that enhance creativity and intuition will be emphasized, leaving routine calculations to AI.
Proof Reliability Issues ― The importance of systems that allow humans to check "AI-written proofs" (machine-readable formats and formalized proofs) is rapidly increasing.
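
To make "formalized proofs" concrete, here is a minimal sketch in Lean 4, assuming only the core library. The theorem name `add_comm_example` is hypothetical and chosen for illustration; it is not taken from the symposium or from any o4-mini output. The point is that the proof checker accepts the statement only if the supplied proof actually type-checks, so a human does not have to trust the author's prose.

```lean
-- Illustrative only: a tiny machine-checkable statement and proof.
-- `add_comm_example` is a hypothetical name; `Nat.add_comm` is a core library lemma.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete instance the checker verifies by computation.
example : 2 + 3 = 3 + 2 := rfl
```

An AI-written proof submitted in such a format could in principle be checked mechanically rather than line by line by a referee, although translating research-level arguments into this form is far more laborious than this toy case suggests.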

7. Remaining Challenges

From Induction to Creation: While o4-mini excels at "reconstructing" existing literature, whether it can truly create new theorems remains unverified.
Black Box Nature: There are concerns that the reasoning chain might be a post-hoc performance (the phenomenon of "guessing the answer first and then writing the explanation"). Reddit
Fairness of Evaluation: Questions persist about the financial relationship between Epoch AI and OpenAI.

8. Future Scenarios

Period	AI's Achievements	Main Tasks of Human Mathematicians
~2027	Automatically solving many Tier 4 problems	Result verification and problem design
~2030	Challenging Tier 5 (unknown territory)	Aesthetic judgment of ideas and research ethics
2030s	The rise of "self-verifying AI"	Overseeing the evolutionary direction of the entire academic community

9. Conclusion

For mathematicians at the forefront of their fields, it was a weekend that overturned the conventional wisdom that "general-purpose AI will not come." Will humanity's "power to pose questions" and AI's "power to weave solutions" begin to harmonize? Or will the day come when even the last bastion of creativity is surrendered? The cheers and screams that echoed in the closed rooms of Berkeley might have been an alarm foreshadowing the intellectual ecosystem of the near future.

References

AI Outsmarts 30 of the World's Top Mathematicians at Secret Meeting in California
Source: https://www.livescience.com/technology/artificial-intelligence/ai-outsmarted-30-of-the-worlds-top-mathematicians-at-secret-meeting-in-california
