Act 2 of the AI Boom: A World Shifting from GPU Shortage to "Token Shortage"

AI Demand Isn't Over Yet: The Day the Token Economy Swallows Companies, Semiconductors, and Jobs

"Is the AI boom reaching its peak?"
From 2024 through 2025, investors, media, and corporate executives asked this question repeatedly. Is generative AI just a temporary craze? Is the investment in GPUs excessive? Will companies really continue to pay for AI?

However, the world depicted in the article "AI Demand is Still Booming," published by NextBigFuture on April 25, 2026, is quite different from these doubts. It describes not a slowdown in AI demand but demand that continues to far exceed supply. Moreover, this demand is not merely chatbot usage or hype. It reflects the reality that companies are using AI in actual operations: having it write code, conduct research, and automate analysis, and letting a handful of people handle tasks that previously required large teams working over extended periods.

At the heart of this article is a statement by Dylan Patel of SemiAnalysis, known for semiconductor and AI infrastructure analysis. NextBigFuture highlights that SemiAnalysis's own AI expenditure has surged from tens of thousands of dollars the previous year to an annualized $7 million. Importantly, this spending is not limited to researchers and engineers. Even non-technical staff are routinely using Claude and code-generating AI, changing the very way they conduct their work.

This is a crucial perspective when considering AI demand. The value of AI cannot be measured by "how many people tried it for free." It is determined by how much companies integrate it into their business processes, how many tokens they consume, and how much revenue, cost savings, and decision-making speed that consumption leads to. In other words, the central indicators of the AI economy are shifting from "number of users" to "token consumption" and "economic value per token."
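The shift from "number of users" to "economic value per token" can be made concrete with a back-of-envelope calculation. The sketch below is illustrative only: the function name and every figure in it are assumptions, not numbers from the article.

```python
# Illustrative metric shift: value per token rather than user counts.
# All figures below are hypothetical, for illustration only.

def value_per_token(revenue_attributed: float, tokens_consumed: int) -> float:
    """Economic value (in dollars) generated per token consumed."""
    return revenue_attributed / tokens_consumed

# Suppose a team attributes $50,000/month of revenue and cost savings
# to AI-assisted work, and consumes 2 billion tokens producing it.
monthly_value = 50_000.0
monthly_tokens = 2_000_000_000

per_million = value_per_token(monthly_value, monthly_tokens) * 1_000_000
print(f"${per_million:.2f} of value per million tokens")  # → $25.00 of value per million tokens
```

Tracked over time, a number like this tells a company whether heavier token consumption is actually converting into revenue, savings, or faster decisions.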

An Era Where Execution Becomes Cheaper and the Value of Ideas is Questioned

The most provocative claim in the NextBigFuture article is that the old adage "ideas are cheap, execution is hard" is starting to crumble.

In traditional business, anyone could come up with ideas, but the ability to implement, validate, market, and continuously improve them was the differentiating factor. Creating a superior product required engineers, designers, data analysts, project managers, and salespeople, as well as time and funding.

However, as AI begins to write code, conduct research, test hypotheses, create materials, and analyze data, the cost of execution is falling rapidly. Human judgment and quality control are still necessary, of course, but the distance to "just build it," "just research it," and "just test it" shrinks dramatically.

This change presents a significant opportunity for entrepreneurs and companies. Even small teams can engage in trial and error on par with large corporations. Individuals can do work close to the analysis and development that used to be done at the departmental level. The NextBigFuture article introduces examples such as GPU-powered chip analysis dashboards, U.S. power grid analysis, and AI impact benchmarks, all realized by small teams in a short period.

However, in this world, the importance of "what to create" increases even more. As execution becomes easier, mediocre ideas are quickly imitated and dragged into price competition. Precisely because AI can build almost anything, the advantage shifts to those who can pose truly valuable questions, to companies with unique data or customer touchpoints, and to organizations that decide quickly.

In the AI era, what holds value is not mere workload but good problem setting, good data, good market understanding, and the ability to turn AI outputs into real-world profits.


The Gap Between Companies That Use Tokens and Those That Don't

Another strong argument presented in the article is the possibility that "moderate use of AI" will not be sufficient.

When it comes to AI adoption, many companies first consider cost reduction. Completing tasks that used to take eight hours in one hour. Reducing staff. Cutting outsourcing costs. This is indeed effective in the short term.

However, the article emphasizes the idea that this alone will not lead to victory. Instead of working one hour with AI and stopping, companies that produce eight or ten times the output using the same eight hours will win. In other words, the gap between companies that use AI as a "tool for ease" and those that use it as a "tool to explode production" will widen.

This view is quite harsh. Those who do not use AI, use it only in limited ways, or skimp on token consumption risk being locked into a disadvantaged position over the long term. The expression "permanent underclass" used around Dylan Patel is extreme, but the message is clear. A new gap will emerge between those who fully utilize AI to generate value and those who are replaced by AI.

Of course, caution is needed in this discussion. It's not a simple matter of everyone using AI without limits. Issues of confidential information, misinformation, copyright, security, and quality assurance remain. However, if companies excessively restrict AI use and keep their operations tied to traditional methods, they are likely to fall behind in speed compared to competitors who actively utilize AI.

The management challenge in the AI era has shifted from "whether to use AI" to "which tasks to use which models for, with what authority, and to what extent."


The Semiconductor Bottleneck Behind the Demand Explosion

As AI demand increases, real-world constraints come to the forefront. Although models appear to run in the cloud, behind the scenes are GPUs, CPUs, memory, networks, power, cooling equipment, data centers, semiconductor manufacturing equipment, and advanced packaging.

The NextBigFuture article points out that supply constraints are occurring at various layers of AI infrastructure, particularly in memory, TSMC, CPU, GPU lifespan, optical communication, copper foil, PCBs, and manufacturing equipment. AI demand is not just pushing up NVIDIA's GPUs. As AI agents and inference processing spread, the load extends to CPUs, DRAM, HBM, storage, and network equipment.

This is a significant change. The initial AI investment boom was mainly discussed as GPU demand needed for training large-scale models. However, as AI becomes embedded in practical use, the center of demand shifts to inference, or daily use. The more companies use AI for code generation, search, analysis, customer support, sales support, design, and robot control, the more continuous computing resources are needed.

And inference demand is different from one-off massive training. It occurs every day, every hour, every second. As users increase and AI agents begin to autonomously handle tasks, token consumption grows beyond the speed of human input. AI calls another AI, generates code, tests, modifies, searches, and summarizes again. Such agent-like workflows consume far more computing resources than traditional chat usage.

As a result, the semiconductor supply chain is widely pressured. Shortages of DRAM and HBM, production slots for advanced processes, advanced packaging like CoWoS, server CPUs, data center power, transmission networks, and cooling equipment. AI demand is not only a software industry issue but also a manufacturing, energy, and geopolitical issue.


Anthropic, Claude Code, and "Model Hoarding"

The article also touches on Anthropic's revenue growth and the expansion of Claude Code usage. NextBigFuture notes that Anthropic's ARR has significantly increased, and demand is strong enough that it sells even when prices or rate limits are adjusted. However, since these figures include estimates and statements from related parties about non-public companies, they should not be treated like official financial statements.

Still, the direction is understandable. AI coding tools are cost-effective for companies. Code generation, bug fixing, test creation, migration work, and internal tool development are areas where time savings from AI can be directly measured. Considering the hourly wages and hiring costs of engineers, spending on high-performance models is easily justified.
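The "easily justified" claim can be tested with a simple break-even calculation. The hourly rate and spending figures below are assumptions for illustration, not numbers from the article.

```python
# Hypothetical break-even sketch: when does model spend pay for itself?
# The hourly cost and monthly spend are illustrative assumptions.

def hours_to_break_even(monthly_ai_spend: float, engineer_hourly_cost: float) -> float:
    """Engineer-hours the AI must save per month to cover its own cost."""
    return monthly_ai_spend / engineer_hourly_cost

# $2,000/month in model usage against a fully loaded engineer cost of $100/hour:
breakeven = hours_to_break_even(2_000, 100)
print(f"Break-even at {breakeven:.0f} saved engineer-hours per month")  # 20 hours
```

If a coding assistant saves each engineer even a few hours per week, spend at this scale clears the bar easily, which is why time savings in code generation and bug fixing are so directly measurable.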

The issue here is access to the best models. If demand exceeds supply, AI companies do not need to provide the highest-performance models to all users under the same conditions. Companies that pay high prices, enter into long-term contracts, or are strategically important customers can be prioritized.

The "model hoarding" mentioned in the NextBigFuture article describes this situation. The best-performing models and large inference slots may be preferentially allocated to financially powerful companies. If this happens, it runs counter to the ideal of AI democratization, concentrating AI capabilities in a few large or high-revenue companies.

This has happened in the cloud era, but it may become even more serious in the AI era. This is because the difference in access to AI models directly affects product development speed, sales efficiency, R&D capability, customer responsiveness, and even employment structures.

Robotics Could Create the Next Wave of Demand

The article also touches on humanoid robots and robotics. Current robot AI still faces challenges in data efficiency when linking vision, language, and action models. However, if a breakthrough occurs that allows tasks to be learned from a few demonstrations, AI demand in the physical world could rapidly expand.

This point is very significant when considering the future of AI demand. Much of the current AI demand occurs in the digital space, such as text, code, images, videos, search, and analysis. However, if robots begin to be used in warehouses, factories, homes, construction, healthcare, agriculture, and logistics, AI will also enter physical tasks.

Physical world tasks have a larger market scale than digital tasks. There are vast amounts of work done by hand, work involving movement, and work that responds to environmental changes. If robots can learn these with few-shot learning, a large amount of computing resources will be needed for learning, inference, simulation, and control per robot.

In other words, the advancement of robotics could become the second wave of GPU and token demand. After text AI, physical AI will come. If that happens, AI infrastructure demand could exceed current expectations.


Reactions in Social Media and Comments: Optimism, Caution, and China Threat Theory

Reactions online to this article and related statements by Dylan Patel cannot be simply divided into approval or disapproval. Broadly, there are three reactions.

First is the optimistic reaction from investors and tech-related individuals. On X, there are posts indicating that AI spending is not just theoretical expectations but is reflected in orders and outlooks for companies related to TSMC, ASML, memory, and data centers. The view is that AI demand continues, the semiconductor cycle is not over, and it is spreading beyond GPUs to CPUs, memory, and power infrastructure. This stance is close to that of the NextBigFuture article.

Second is the anxiety from workers and general users. There are strong concerns that AI might take jobs, that companies might proceed with staff reductions for AI investment, and that AI data centers might strain local power and water resources. Reports related to Pew Research and NBC News also show that caution towards AI is rising in the U.S. On social media, posts viewing AI as a "tool for productivity improvement" clash head-on with those seeing it as a "mechanism that destroys employment and creativity."

Third is the reaction concerning competition with China. In the comments section of the NextBigFuture article, there are comments suggesting that the reason Chinese models lag behind the U.S. is more about chip supply constraints than the technology itself, and if China overcomes these constraints within a few years, U.S. AI companies might struggle in low-price competition. This perspective views AI competition not just in terms of model performance but also as an issue of semiconductor supply, national industrial policy, and price competitiveness.

This comment is merely the opinion of one reader but contains important points. The current AI hegemony involves models, data, semiconductors, cloud, power, and capital markets as a whole. Even if U.S. companies lead with high-performance models and high-price contracts, if Chinese companies catch up with efficiency and low pricing, the market structure could change. The impact of low-cost, high-efficiency models has already been widely recognized since DeepSeek.


Will Backlash Against AI Really Spread?

The NextBigFuture article also touches on the possibility of large-scale protests against AI companies, especially Anthropic and OpenAI, by the fall of 2026. This is a rather strong prediction, but it is backed by a shift in public opinion.

The anxiety towards AI is not just a vague fear of new technology. Specific issues such as employment, copyright, education, misinformation, surveillance, military use, data center construction, and power consumption are increasing. The more AI company leaders talk about "the world changing significantly" and "many jobs changing," the more the general public tends to feel anxiety rather than expectation.

Furthermore, the difficulty in seeing the benefits of AI is also a problem. Companies reduce costs with AI, investors enjoy rising stock prices, and engineers and executives experience productivity improvements. Meanwhile, what general consumers might feel is the automation of customer support, mass production of content, job insecurity, and concerns about electricity bills and local infrastructure.

For AI companies to gain social support, it is not enough to just talk about "amazing things happening in the future." They need to show examples that specifically improve current life, such as in healthcare, education, small business support, administrative procedures, disability support, and R&D. They must also face employment transition and retraining, transparency in data use, and consensus building with local communities.

The more powerful the technology becomes, the greater the social responsibility for explanation.


"Phantom GDP" and Invisible Productivity

One of the intriguing concepts in the article is "Phantom GDP": the idea that AI increases actual production and value while sharply reducing its cost, so the gain is not fully reflected in traditional GDP statistics.

For example, consider an analysis that previously took 200 people a year to complete now being executed by a few people in a few weeks. Significant social value is created. However, if the labor costs and outsourcing fees paid for that work fall sharply, nominal economic activity may appear to shrink.
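The gap in this example can be made explicit with rough arithmetic. The per-analyst cost and team sizes below are illustrative assumptions layered on the article's "200 people for a year" scenario.

```python
# Rough sketch of the "Phantom GDP" gap, using the 200-person-year example.
# The salary figure and the "few people, few weeks" team size are assumptions.

AVG_ANNUAL_COST = 150_000  # fully loaded cost per analyst (hypothetical)

before_spend = 200 * AVG_ANNUAL_COST          # 200 person-years of labor
after_spend = 3 * AVG_ANNUAL_COST * (4 / 52)  # 3 people for roughly 4 weeks

# GDP-style statistics see the spending, not the output:
measured_drop = before_spend - after_spend
print(f"Nominal spend falls by roughly ${measured_drop:,.0f}")
print("...while the same analytical output is still produced.")
```

The statistics register a near-total collapse in measured spending even though the output, and hence the real value, is unchanged, which is exactly the measurement gap "Phantom GDP" names.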

This issue also occurred in the internet era. Free search, free maps, free translation, and free or low-cost software greatly improved quality of life but were difficult to fully reflect in GDP. AI might further amplify this problem.

Within companies, productivity improvements from AI are clearly visible. However, in macro statistics, this value is hard to capture. As a result, a strange situation could arise where economic indicators seem stagnant, yet actual intellectual production and decision-making speed are skyrocketing.

This "invisible productivity" complicates policy decisions. It necessitates changing how employment statistics, wages, GDP, corporate profits, prices, and capital investment are interpreted.


What Should Companies and Individuals Do?

If AI demand truly continues to expand, the actions required of companies and individuals are clear.

Companies must not stop AI use at the experimental stage. It's not enough to introduce an internal chatbot and call it a day; they need to review what can be amplified with AI in sales, development, accounting, legal, research, marketing, customer support, and business planning.

At the same time, AI spending should be managed not merely as a cost but as an investment that converts into revenue and productivity. Which departments are using how many tokens, and what results are they achieving? What output has increased due to AI? Can the time saved be reinvested in higher value-added work? Such indicators will be necessary.
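Such indicators presuppose that token usage can be attributed to departments at all. The sketch below assumes a company can export usage logs in some form; the log shape, department names, and all numbers are hypothetical.

```python
# Minimal sketch of per-department token accounting.
# The log format and all figures are hypothetical assumptions.
from collections import defaultdict

usage_log = [
    {"dept": "engineering", "tokens": 120_000_000, "outcome_usd": 40_000},
    {"dept": "legal",       "tokens": 5_000_000,   "outcome_usd": 8_000},
    {"dept": "marketing",   "tokens": 30_000_000,  "outcome_usd": 6_000},
]

# Aggregate tokens and attributed outcomes per department.
totals = defaultdict(lambda: {"tokens": 0, "outcome_usd": 0})
for entry in usage_log:
    totals[entry["dept"]]["tokens"] += entry["tokens"]
    totals[entry["dept"]]["outcome_usd"] += entry["outcome_usd"]

# Report value generated per million tokens, per department.
for dept, t in totals.items():
    per_mtok = t["outcome_usd"] / (t["tokens"] / 1_000_000)
    print(f"{dept:12s} {per_mtok:8.2f} $/Mtok")
```

Even a crude report like this shifts the conversation from "how much does AI cost" to "which uses of AI convert tokens into value," which is the investment framing the paragraph above calls for.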

For individuals, avoiding AI is not an option. The important thing is not just to finish work quickly with AI