ukiyo journal: a new news outlet connecting Japan and the world
The True Identity of the AI That Silenced 30 Geniuses in 10 Minutes: The Full Story of the Shocking Conference in California

July 13, 2025, 12:50

1. Introduction

On May 17-18, 2025, a secretive mathematics summit, the "FrontierMath Symposium," was held in a corner of the University of California, Berkeley. Thirty leading researchers in fields such as number theory, geometry, and topology were invited. Their opponent was not human but OpenAI's latest "reasoning" model, o4-mini. By the end of the 48 hours, many participants agreed that "AI is stepping into the realm of mathematical genius." (Live Science)


2. What is o4-mini?

o4-mini is an LLM released in April 2025 that aims to deliver, in a lightweight model, the "deep reasoning" that the traditional GPT-4 series struggled with. To measure its performance, OpenAI, in collaboration with the nonprofit Epoch AI, constructed FrontierMath, an unpublished benchmark of 300 questions. o4-mini solved 20% of these difficult questions, on which previous models had a correct-answer rate of less than 2%. (Scientific American)


3. Behind the Scenes of the Secret Meeting

  • NDA and Signal
    All participating mathematicians signed an NDA. To keep the problems from leaking into training data, all communication took place over the encrypted messaging app Signal. (Live Science)

  • Prize of $7,500
    To motivate participants, a $7,500 reward was offered to anyone who devised a problem the AI could not solve. (Scientific American)

  • Astonishing 10-Minute Solution
    Professor Ken Ono, an authority in number theory, presented an "unsolved problem at the doctoral level," and o4-mini solved it in 10 minutes. Ono reportedly chuckled, saying it had mastered "proof by intimidation." (Live Science, Scientific American)


4. Humanity vs. AI: The Outcome

Over the two-day contest, the mathematicians managed to stump the AI on only 10 problems. One mathematician, Yang-Hui He, described it as "no longer just a brilliant graduate student, but at the level of a collaborative researcher." (Scientific American)


5. Explosive Reactions on Social Media

| Platform | Representative Voices | Overview |
|---|---|---|
| Reddit /r/AI | "Epoch AI is in cahoots with OpenAI. It smells like promotion." – sandwichtank | Skepticism about corporate interests and cautious opinions. (Reddit) |
| LinkedIn | "AI has become a 'thinking colleague,' not just a calculator." – Former government engineer Keith King | The impact is greater among industry professionals. (LinkedIn) |
| X (formerly Twitter) | "#o4mini has crossed the Rubicon in the world of mathematics." – Techmeme | Tech influencers spreading the news. (X) |

While proponents welcome an "explosive increase in AI research efficiency," skeptics warn of an "explosive increase in verification costs" whenever the final answer is correct but the logic behind it is flawed.


6. How Will Mathematical Research Change?

  1. Role Differentiation: Mathematicians will focus on problem setting and aesthetic evaluation of ideas, while AI will handle computation and proof generation.

  2. Redesigning Educational Curricula: Tasks that enhance creativity and intuition will be emphasized, leaving routine calculations to AI.

  3. Proof Reliability Issues: Systems that let humans check "AI-written proofs" (machine-readable formats and formalized proofs) are rapidly becoming more important.
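
To make the idea of a formalized, machine-checkable proof concrete, here is a minimal sketch in Lean 4. The article does not name any particular proof assistant, so the choice of Lean is an assumption, and the theorem name `add_swap` is hypothetical. The point is that the checker accepts a statement only if every step is justified, which is what would allow a human to trust an AI-written proof without re-deriving it by hand.

```lean
-- Illustrative sketch only; the theorem name is hypothetical.
-- The checker accepts this because `Nat.add_comm` really proves the claim.
theorem add_swap (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b

-- A concrete arithmetic fact, verified by direct computation.
example : 2 + 2 = 4 := rfl

-- A flawed argument is rejected rather than silently accepted.
-- Uncommenting the next line produces a type error:
-- example (a b : Nat) : a + b = a := rfl
```

In this style of workflow, the cost of verification shifts from reading a long argument line by line to checking that the formal statement matches what was meant to be proved.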


7. Remaining Challenges

  • From Induction to Creation: While o4-mini excels at "reconstructing" existing literature, whether it can create genuinely new theorems remains unverified.

  • Black Box Nature: There are concerns that the reasoning chain may be a post-hoc rationalization (the phenomenon of "guessing the answer first and then writing the explanation"). (Reddit)

  • Fairness of Evaluation: Questions persist about the financial relationship between Epoch AI and OpenAI.


8. Future Scenarios

| Period | AI's Achievements | Main Tasks of Human Mathematicians |
|---|---|---|
| ~2027 | Automatically solving many Tier 4 problems | Result verification and problem design |
| ~2030 | Challenging Tier 5 (unknown territory) | Aesthetic judgment of ideas and research ethics |
| 2030s | The rise of "self-verifying AI" | Overseeing the evolutionary direction of the entire academic community |


9. Conclusion

For mathematicians at the forefront of the field, it was a weekend that overturned the conventional wisdom that "general-purpose AI will never arrive." Will humanity's "power to pose questions" and AI's "power to weave solutions" begin to work in harmony? Or will the day come when even the last bastion of creativity is surrendered? The cheers and screams that echoed in the closed rooms of Berkeley may have been an alarm foretelling the intellectual ecosystem of the near future.



References

AI Outsmarts 30 of the World's Top Mathematicians at Secret Meeting in California
Source: https://www.livescience.com/technology/artificial-intelligence/ai-outsmarted-30-of-the-worlds-top-mathematicians-at-secret-meeting-in-california
