Is ChatGPT-5 Really Disappointing? ── A Comprehensive Review of What the Previous Model Couldn't Do and Where Other Companies Still Excel

2025年08月12日 23:11

1. First, Organizing the "Disappointment" Argument

From the moment of its announcement, there was a mix of opinions on social media and in the media. Opinions such as "users had overly high expectations" and "practicality has improved but it's not revolutionary" were prevalent. Axios pointed out reports of errors in math and geography, dissatisfaction with delays, and the gap with the "Ph.D. level" statement. Axios
This atmosphere is also due to the fact that OpenAI's shift towards model integration and a focus on safety and practicality was out of sync with the audience expecting a spectacular "leap." OpenAI

2. What GPT-5 Can Do That Previous Models Couldn't

2-1. Integrated System: Automatically Optimizing "Amount of Thinking" and Pathways

GPT-5 is an integration of a lightweight response model + deep reasoning model (GPT-5 Thinking/Pro) + real-time router. Depending on user instructions and the difficulty of the task, it variably switches from fast responses to deep thinking. This makes the experience of "quick tasks are fast, difficult problems are deeply thought out" the default. OpenAI

2-2. Significant Enhancements in Coding and Agent-like Tasks

In the developer announcement, SWE-bench Verified 74.9% and strength in long tool chains (including parallel) were demonstrated, and new parameters such as verbosity and reasoning_effort, which control the length and amount of thought in responses, and **"custom tools callable in plain text"** were also added. The ability to "finish the job" in practical work has improved. OpenAI

2-3. Chat Experience: Personality Presets and Voice Evolution
Fortune introduced customizable "personality" presets such as Cynic, Robot, Listener, Nerd and enhanced voice experience. Tone adjustments have become easier, making it easier to switch conversation styles according to the purpose. Fortune

2-4. Expansion in Practical Areas (Enterprise Use)
OpenAI emphasizes improvements in accuracy, speed, and reasoning in major tasks such as writing, creating, and researching. With automation and collaboration in mind for enterprise workflows, they are advocating for a **"new era of work."** OpenAI

3. Why the Discontent? ── The Gap in Initial Reactions
The point that it seemed like a **"large minor update"** compared to the expectations of dramatic leaps
.

Initial confusion and reports of issues regarding router behavior and some inaccuracies..

A user base sensitive to differences in **"temperature" and "empathy"** compared to older models (like 4o)..
..Axios

4. Points Where Other Companies Are Still Superior (By Application)
4-1. Deep Thinking Handled by Users: Anthropic Claude
Extended Thinking can be turned ON/OFF, and developers can also set a **"thinking budget."..Anthropic+1

Furthermore, memory updates allowing cross-search and reference of past dialogues are progressing (prioritized for Max/Team/Enterprise). Convenient for resuming long-term projects. The Verge

How to Differentiate Use:

In situations like math, science, and design reviews, where you want to intentionally increase "thinking time" to pursue accuracy.

Teams prioritizing safety and policy compliance..Anthropic

4-2. Research, Integration, Long Context: Google Gemini
2.0 Pro/Flash/Flash-Lite clearly differentiates between speed, cost, and capability.......blog.google

Deep Research and Canvas (workspace with code generation and preview), and the "thinking" enhancement of **2.5 Pro (experimental)** are also being developed...Geminiblog.google

How to Differentiate Use:

Research, planning, and documentation utilizing Google app integration
(YouTube/Maps/Drive, etc.).

Mass document analysis and long-term project management.

4-3. Self-Hosting/Custom Freedom: Meta Llama (Open Systems)
Llama 3.1 (up to 405B) is reported as "the most promising in open systems," and subsequently, Llama 3.2 expands vision support and edge optimization...The VergeAI Meta+1

How to Differentiate Use:

Operation in on-premises/specific regulatory environments, focusing on fine adjustments and optimization of inference costs.

Real-time processing on mobile/edge.

5. Conclusion ── "Disappointment" or "Steady Evolution"
GPT-5 enhances "smoothness in practical work" with integrated intelligence operation (amount of thinking, routing). The foundational strength in coding, agents, and instruction following has certainly grown. OpenAI+1

However, expecting a "dramatic leap" might lead to disappointment.......AnthropicGeminiThe Verge

Conclusion: GPT-5 has matured into a "tool to reliably advance daily work" as a large minor update.....

6. Quick Reference Guide for Differentiation (Key Points)
Enhancing Accuracy with Deep Thinking: Claude (Extended Thinking, Thinking Budget) Anthropic

OpenAI's New Revolution: How ChatGPT Agents Can Transform Your Business

The Day AI Seized the Gold Medal ─ Gemini Deep Think and the Future of Mathematics

Is AI Ultimately Driven by Advertising? What the Introduction of ChatGPT Ads Reveals About the "Reality of Consumer AI"

Does AI Dependence Diminish Intelligence or Liberate It? ─ The True Nature of "Cognitive Debt" Revealed by MIT

Revolution in Grading by AI? Changes in University Transcripts, ChatGPT Alters the "Reliability of Evaluations"

Is ChatGPT-5 Really Disappointing? ── A Comprehensive Review of What the Previous Model Couldn't Do and Where Other Companies Still Excel

1. First, Organizing the "Disappointment" Argument

2. What GPT-5 Can Do That Previous Models Couldn't

2-1. Integrated System: Automatically Optimizing "Amount of Thinking" and Pathways

2-2. Significant Enhancements in Coding and Agent-like Tasks

2-3. Chat Experience: Personality Presets and Voice Evolution

2-4. Expansion in Practical Areas (Enterprise Use)

3. Why the Discontent? ── The Gap in Initial Reactions

4. Points Where Other Companies Are Still Superior (By Application)

4-1. Deep Thinking Handled by Users: Anthropic Claude

4-2. Research, Integration, Long Context: Google Gemini

4-3. Self-Hosting/Custom Freedom: Meta Llama (Open Systems)

5. Conclusion ── "Disappointment" or "Steady Evolution"

6. Quick Reference Guide for Differentiation (Key Points)

Cookie Usage

1. First, Organizing the "Disappointment" Argument

2. What GPT-5 Can Do That Previous Models Couldn't

2-1. Integrated System: Automatically Optimizing "Amount of Thinking" and Pathways

2-2. Significant Enhancements in Coding and Agent-like Tasks

2-3. Chat Experience: Personality Presets and Voice Evolution

2-4. Expansion in Practical Areas (Enterprise Use)

3. Why the Discontent? ── The Gap in Initial Reactions

4. Points Where Other Companies Are Still Superior (By Application)

4-1. Deep Thinking Handled by Users: Anthropic Claude

4-2. Research, Integration, Long Context: Google Gemini

4-3. Self-Hosting/Custom Freedom: Meta Llama (Open Systems)

5. Conclusion ── "Disappointment" or "Steady Evolution"

6. Quick Reference Guide for Differentiation (Key Points)

OpenAI's New Revolution: How ChatGPT Agents Can Transform Your Business

The Day AI Seized the Gold Medal ─ Gemini Deep Think and the Future of Mathematics

Is AI Ultimately Driven by Advertising? What the Introduction of ChatGPT Ads Reveals About the "Reality of Consumer AI"

Does AI Dependence Diminish Intelligence or Liberate It? ─ The True Nature of "Cognitive Debt" Revealed by MIT

Revolution in Grading by AI? Changes in University Transcripts, ChatGPT Alters the "Reliability of Evaluations"