In 2025, AI shifts from "speaking" to "acting" — The truth of the first year of AI agents and the homework for 2026

January 6, 2026, 00:35

In 2025, "AI Agents" Transition from "Concept" to "Infrastructure"—And in 2026, the Challenges We Face

The year 2025 marked a turning point: generative AI evolved from a "handy tool for crafting text" into something that can use external tools, work through multiple steps, and complete tasks with a degree of "autonomy." The "agents" long discussed in labs and demos began to be built into everyday products and business designs and were finally treated as real infrastructure. That is what happened in 2025. (The Dispatch)


However, this change is not a simple story of "the future has arrived." As AI agents become more capable, the friction that inevitably accompanies deployment in society comes to the forefront: security, evaluation methods, standardization, governance, employment and surveillance, power consumption, and data centers. And 2026 is likely to be a year of on-the-ground work in which we come to terms with this friction.



The Changing Definition of "AI Agents": From Academic Term to Product Specification

The term "agent" has long existed in the field of AI. Traditionally, it was discussed within an academic framework as a system that observes, infers, and acts within an environment. In 2025, however, "agents" were redefined in more practical terms: Large Language Models (LLMs) that call external tools, use APIs, and carry tasks forward autonomously. This capacity to act is what came to the forefront. (The Dispatch)


One factor accelerating this trend is a standard mechanism for connecting LLMs with external tools. The article points out that Anthropic's Model Context Protocol (MCP), released in late 2024, became an important foundation for LLMs to "step outside the text." (The Dispatch)
In essence, the core of agents has shifted from "smart text" to "the ability to complete work across systems."



Milestones Shaping 2025: Competition, Standards, and the "Reinvention of the Browser"

1) Acceleration of Open Model Competition

The article recalls that at the beginning of 2025, China's DeepSeek-R1 was released as an "open-weight" model, shaking assumptions about who could build high-performance models. (The Dispatch)

In addition, major U.S. labs (OpenAI, Anthropic, Google, xAI, etc.) and Chinese tech companies (Alibaba, Tencent, DeepSeek, etc.) both pushed ahead with model releases and ecosystem expansion, and the competition took on the character of a long-term contest with a geopolitical dimension. (The Dispatch)


2) A World Where Agents Talk to Each Other: Agent2Agent and Standardization

Another turning point was Google's proposed Agent2Agent (A2A) protocol. While MCP focuses on "how to use tools," A2A focuses on "how agents collaborate." The two were designed to be used together, and after their donation to the Linux Foundation, the push for standardization strengthened significantly. (The Dispatch)


Standardization is subtle but powerful. This is because as the cost of interconnection decreases, agents transition from being "toys for some advanced companies" to "components that many companies can adopt."


3) The "Agent-Based Browser" as the Next Entry Point

The article notes that by mid-2025, "agent-based browsers" began to emerge. The idea is that the browser not only searches for and reads information but also handles "execution," such as reservations and purchases. (The Dispatch)

This is a change in permission design as well as in UX. If the browser "operates on your behalf," the data it handles, such as login information, payments, personal information, and browsing history, carries far more weight.


4) Democratization Advances with Workflow Builders

The spread of workflow construction tools like n8n has also broadened the base of people who can "create custom agents." (The Dispatch)
The shift is from "automation for those who can code" to "automation that those who know the business can assemble." As it progresses, agent adoption will speed up considerably.



The Stronger They Get, the More Dangerous: "New Power" and "New Risks"

What characterized the evolution of agents in 2025 was that risks surfaced at the same pace as capabilities grew. The article cites a case in which the Claude Code agent was misused to partially automate cyberattacks, demonstrating that "the power to automate repetitive and technical tasks also lowers the barriers to malicious activities." (The Dispatch)


More troubling still, vulnerability increases as agents "connect." If a single LLM merely gives an incorrect answer, the damage may be limited, but as tool calls, browser operations, and collaboration with other agents pile up, the probability of "errors manifesting as actions" rises. (The Dispatch)


In a world where errors manifest as actions, security cannot be just an "afterthought checklist."
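
To make the compounding concrete, here is a back-of-the-envelope sketch in Python. It assumes, purely for illustration, that every step in an agent's chain succeeds independently and with the same probability. The function name and the 0.99 figure are hypothetical and do not come from the article.

```python
def chain_success_probability(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming independent steps.

    Illustrative assumption: each tool call, browser action, or hand-off
    succeeds with the same probability and failures are independent.
    """
    if not 0.0 <= per_step_success <= 1.0:
        raise ValueError("per_step_success must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_success ** steps


if __name__ == "__main__":
    # 0.99 per step is a made-up figure chosen only to show the trend.
    for n in (1, 10, 50):
        p = chain_success_probability(0.99, n)
        print(f"{n:>3} steps: {p:.3f} chance that no step goes wrong")
```

Under these assumptions, a step that is right 99% of the time gives roughly a 90% chance of a clean ten-step run and roughly a 61% chance of a clean fifty-step run. Real agent failures are rarely independent, so this shows only the direction of the effect and is not a prediction.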



Key Points for 2026: Evaluation, Governance, and "Is Bigger Always Better for Models?"

1) Redesigning Benchmarks: Measuring the "Process" as Well as the "Outcome"

Traditional benchmarks were suited to comparing the performance of individual models. An agent, however, is a composite of "model + tools + memory + decision-making logic." For that reason, the article points out, "how it did it" will matter more than "the score" in 2026. (The Dispatch)

This is akin to saying "show your work" rather than "is the answer correct" in human terms. To create trustworthy agents, process visualization and standardization of evaluation methods are unavoidable.
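
One way to picture "measuring the process as well as the outcome" is the minimal Python sketch below. It is not a method described in the article: the `Step` and `Trace` types, the `evaluate` function, and the tool names are all hypothetical, and the process check (every step used an approved tool and succeeded) is just one simple criterion chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    tool: str  # name of the tool the agent called (hypothetical)
    ok: bool   # whether that call completed without error


@dataclass
class Trace:
    steps: list[Step] = field(default_factory=list)
    final_answer: str = ""


def evaluate(trace: Trace, expected_answer: str, allowed_tools: set[str]) -> dict[str, bool]:
    """Score the outcome and the process separately."""
    outcome_ok = trace.final_answer == expected_answer
    # Process check: every step used an approved tool and succeeded.
    process_ok = all(s.tool in allowed_tools and s.ok for s in trace.steps)
    return {"outcome_ok": outcome_ok, "process_ok": process_ok}


if __name__ == "__main__":
    # The right answer reached through a tool that was not approved.
    trace = Trace(
        steps=[Step("search", True), Step("shell", True)],
        final_answer="42",
    )
    print(evaluate(trace, expected_answer="42", allowed_tools={"search", "calculator"}))
```

In the usage example the answer is correct but the process check fails, which is exactly the kind of case a score-only benchmark would miss.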


2) Governance and Standard Bodies: Agentic AI Foundation

The establishment of the Agentic AI Foundation (AAIF) by the Linux Foundation in late 2025 is a sign that "the winning strategy is not proprietary standards but interoperability." The article raises the possibility that AAIF could play a role similar to that of the W3C. (The Dispatch)


Interoperability increases convenience, but unless the lines of responsibility (who guaranteed what) are clear when an accident occurs, organizations will find it hard to adopt agents in practice. In 2026, the question will be how to design responsibility so that agents can spread.


3) "Giant Models" vs. "Small, Specialized Models"

Larger models are more versatile, but they are not necessarily the best choice for agents. The article notes that there are many areas where small, task-specific models have an advantage. (The Dispatch)


In the field, "doing this reliably" is more important than "being able to do anything." In 2026, the era where users "choose models according to their purpose" will begin, and the responsibility for selection will also shift to the user side.



Remaining Social Issues: Power, Employment, Surveillance, and Regulation

The article's emphasis is not limited to technical issues. The expansion of data centers burdens the power grid and affects local communities. In workplaces, automation is advancing, raising concerns about job displacement and surveillance. (The Dispatch)


In terms of security, tool connections and multi-stage agent pipelines multiply risks. The article calls for particular caution regarding indirect prompt injection, in which instructions embedded in web content are read by an agent and lead to unintended actions. (The Dispatch)
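
The Python sketch below illustrates one commonly discussed mitigation pattern for indirect prompt injection: requiring explicit user confirmation before a sensitive action that was prompted by untrusted content. It is an assumption-laden illustration, not something the article prescribes. The action names, the `ProposedAction` type, and the `decide` function are hypothetical, and the sketch simply assumes the agent runtime can flag whether an action originated from untrusted content, which is a hard problem in its own right.

```python
from dataclasses import dataclass

# Hypothetical set of actions treated as high-risk in this sketch.
SENSITIVE_ACTIONS = {"purchase", "send_email", "delete_file"}


@dataclass(frozen=True)
class ProposedAction:
    name: str
    # Assumed to be set by the agent runtime when the action was prompted
    # by content fetched from the web rather than by the user directly.
    triggered_by_untrusted_content: bool


def decide(action: ProposedAction, user_confirmed: bool = False) -> str:
    """Return "allow" or "ask_user" for a proposed action."""
    risky = action.name in SENSITIVE_ACTIONS and action.triggered_by_untrusted_content
    if risky and not user_confirmed:
        return "ask_user"
    return "allow"


if __name__ == "__main__":
    # A web page "told" the agent to buy something: pause and ask the user.
    print(decide(ProposedAction("purchase", triggered_by_untrusted_content=True)))
    # Reading a page is not in the sensitive set, so it proceeds.
    print(decide(ProposedAction("read_page", triggered_by_untrusted_content=True)))
```

A gate like this narrows what injected instructions can do but is not a complete defense on its own.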


On regulation, the concern is that the U.S. has limited oversight compared with Europe and China, which leaves questions of "access," "accountability," and "limit setting" unresolved even as agents permeate everyday infrastructure. (The Dispatch)

Therefore, the article concludes that agents should be treated not as "mere software components" but as "socio-technical systems," requiring rigorous engineering, design, and documentation. (The Dispatch)



Reactions on Social Media (A Broad Overview)

The article's claims (agents becoming a reality in 2025, with evaluation, standards, safety, and social implementation as the challenges for 2026) are echoed on social media, though the mood mixes enthusiasm with restraint.


1) "Finally, I Want a 'Secretary Agent'"—Expectations Shift to "Household Chores"

On Hacker News, commenters observe that agents grew mainly among developers in 2025 but suggest that the real potential lies in non-development work (administration, contracts, invoices, customer support, etc.). Some express the sentiment "I don't need AGI, but a secretary-type agent to handle chores." (Hacker News)

This indicates a shift in value from "high capability" to "reducing real-world hassle."


2) "The Bottleneck Isn't Model Performance. It's Trust and Integration"—2026 as the "Year of Implementation"

Discussions on LinkedIn suggest that 2026 will be a year of "deployment" rather than "breakthroughs," with the challenges lying in "reliability, integration, and workflow embedding" rather than in model intelligence. (LinkedIn)

The article's points on "process evaluation," "governance," and "standardization" align well with the sentiment on social media.


3) "Every Year, We Just Move the Promises to Next Year"—Skepticism and Fatigue

On LinkedIn, a notable quote is "Every year, we just move the promises to next year." (LinkedIn)

Given the high expectations
