Company X to Prohibit Use of Its Content for AI Training! What Impact Will This Have on Japan?

June 6, 2025, 20:15

Company X Completely Bans Use of Its Content for AI Model Training

――Shock and Reorganization Scenarios in Japan's Generative AI Ecosystem――

1. Introduction──The Sudden "Closing of Doors"

On June 5, 2025, social network X (formerly Twitter) revised its developer terms, completely banning third parties from using posts on X or data obtained via its API for "foundation/frontier model training or fine-tuning." TechCrunch was the first to report the change, and The Verge followed up, causing a stir in the global AI developer community.

2. Understanding the Changes──"Reverse Engineering and other Restrictions"

The new clause was added as a single line under "Reverse Engineering and other Restrictions," but its impact is significant. Data obtained by crawling, scraping, or the X API is no exception, and there is no carve-out for research or non-profit purposes. The previously open "API culture," which ensured data portability, turned into a blockade overnight.

3. Background──Acquisition by xAI and "Grok's" Own Learning Needs

In March 2025, xAI, led by Elon Musk, acquired X for about $33 billion and brought its own LLM, "Grok," to the forefront. X continues to use platform data to train its own models while pivoting to a "walled garden" strategy that closes the door to other companies. This fits the broader trend, also seen with Reddit and The New York Times, of treating data as a monopolized "resource" and seeking revenue by licensing it at high prices.

4. Global Trend──Reddit Lawsuit and the Rise of the "License Business"

In June 2025, Reddit sued Anthropic, alleging more than 100,000 unauthorized crawl attempts. While monetizing its data through a license agreement with Google reportedly on the order of $200 million, Reddit has taken a tough stance against unauthorized use. X's move accelerates this global trend of "content enclosure."

5. Severe "Data Famine" Facing Japanese AI Development Companies

The performance of large language models (LLMs) depends on the volume and diversity of their training data. Japanese social media data, which includes slang, dialects, and domestic topics, is essential for training Japanese language models. However, major domestic social networks are also revising their terms of use to prohibit AI training, and the cost and legal risk of obtaining data will soar. As a result, a triple burden looms:

training costs will be higher than those of large overseas companies
model performance may fall behind
opportunities for innovation by ventures will shrink

6. Alternative Sources of Data—Public Corpora and In-house Data

Practical ways to work within these restrictions include: ① public corpora from the National Institute for Japanese Language and Linguistics, ② paid contracts with newspapers and broadcasters, ③ refining proprietary data such as in-house chat logs and FAQs, and ④ generating synthetic data. However, public corpora carry a variety of licenses, and even when relying on **Article 30-4 of the Copyright Act (provisions for information analysis)**, the secondary-use clauses of each corpus must be checked individually.

7. Current Legal Landscape—The Boundary Between Copyright Law and robots.txt

In Japan, the 2018 amendment to the Copyright Act made "reproduction for information analysis purposes" subject to rights restrictions, meaning it is in principle permitted without the rights holder's authorization, but whether commercial LLM training falls under this provision remains a gray area. In addition, the Japan Newspaper Publishers & Editors Association issued a statement on June 4, 2025, asserting that "the intention to refuse AI training indicated by robots.txt should be respected," and stating clearly that ignoring such indications is improper.
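As a rough illustration of what "respecting robots.txt" means in practice, the sketch below uses Python's standard `urllib.robotparser` to check whether a given crawler may fetch a URL. The robots.txt content, the crawler name `ExampleAIBot`, and the URL are all hypothetical, and the helper `is_fetch_allowed` is an illustrative name rather than part of any publisher's or platform's tooling. The check only shows how a publisher's stated refusal can be read by a machine; it says nothing about whether a particular use is lawful.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the publisher refuses one AI crawler
# and allows every other user agent.
EXAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_fetch_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical article URL on a hypothetical news site.
    url = "https://news.example.com/articles/1"
    for agent in ("ExampleAIBot", "OrdinaryBrowser"):
        print(agent, is_fetch_allowed(EXAMPLE_ROBOTS_TXT, agent, url))
```

A crawler that runs such a check before every request, and skips disallowed URLs, is honoring the publisher's declared intention in the sense the statement describes.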

8. Are Individual Posts Protected by "Opt-out"?

X provides an opt-out setting that lets users refuse training by Grok, but the new terms are a blanket ban aimed at third parties, and it should be noted that posts can still be used for X's own training.

9. Strategic Responses of Companies and Research Institutions

Early Initiation of Data License Negotiations
Inventory of Legal Risks for Contracted Datasets
Implementation of Transparency in Generative AI (Source Traceability)
Synthetic Data and High-quality Small-scale Training (a "Small Data Strategy")

These are short-term measures; in the long term, a cross-industry foundation for collaboratively maintaining Japanese open data is required.
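The dataset inventory and source-traceability measures in this section could start as something as simple as a structured record per dataset. The following is a minimal sketch under stated assumptions: the `DatasetRecord` fields, the `audit_datasets` helper, and the sample entries are all hypothetical and are not taken from any guideline or existing tool.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class DatasetRecord:
    """One inventory row per dataset (all field names are hypothetical)."""
    name: str
    source: str                          # where the data came from (traceability)
    license_name: str
    training_permitted: Optional[bool]   # None means "not yet confirmed"
    license_expires: Optional[date] = None


def audit_datasets(records: Iterable[DatasetRecord], today: date) -> Dict[str, List[str]]:
    """Map each dataset name to its open legal-risk issues; clean datasets are omitted."""
    findings: Dict[str, List[str]] = {}
    for record in records:
        issues: List[str] = []
        if record.training_permitted is None:
            issues.append("training permission not confirmed")
        elif not record.training_permitted:
            issues.append("training not permitted by terms")
        if record.license_expires is not None and record.license_expires < today:
            issues.append("license expired")
        if issues:
            findings[record.name] = issues
    return findings


if __name__ == "__main__":
    # Sample entries are invented for illustration only.
    inventory = [
        DatasetRecord("inhouse_faq", "internal helpdesk", "proprietary", True),
        DatasetRecord("news_archive", "licensed publisher", "paid contract", True, date(2024, 12, 31)),
        DatasetRecord("sns_posts", "social platform API", "platform terms", False),
        DatasetRecord("public_corpus", "public research institute", "unknown", None),
    ]
    for name, issues in audit_datasets(inventory, today=date(2025, 6, 6)).items():
        print(name, issues)
```

Keeping the `source` field alongside the license terms means the same record serves both the legal-risk inventory and later questions about where a model's training data came from.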

10. Impact on Startups—Changes in Funding and Evaluation

VCs have traditionally emphasized "technological superiority = model performance," but going forward, **"how much legitimately licensed data a company holds"** will be key to corporate value. Japanese startups need to build their data strategy into their pitches early and revise their business plans to account for rising capital costs.

11. Dilemma in Academic Research—Open Science and Intellectual Property Protection

Universities and public research institutions are fundamentally expected to publish their findings, but when models are trained on corporate data, there is a risk that disclosing model parameters could violate licenses. It is essential to sign a memorandum of understanding (MOU) with data-providing companies and to define clear rules distinguishing "publicly available parts" from "non-public parts."

12. The Temperature Gap with Overseas Platforms──"Open vs. Closed"

Meta extensively uses CC-licensed web data for Llama 3, while YouTube has yet to state clear restrictions on AI training. In the U.S., the concept of **"fair use"** serves as a partial shield, whereas in the EU, the AI Act is set to be enforced in 2026, imposing transparency obligations. X's closure symbolizes the arrival of an era in which "even in the U.S., data is not free," and the cross-border war over data governance is intensifying.

13. The Japanese Government's Position and Policy Recommendations

The Ministry of Economy, Trade and Industry (METI) includes "respect for data providers' intentions" in its draft "Guidelines for the Utilization of Generative AI," while also aiming to secure the competitiveness of the AI industry. Going forward, three points are key:

machine-readability and free secondary use of public data
development of shared clouds/data lakes by universities and public research institutions
subsidies for data acquisition for SMEs and startups

14. "Unique Data" as a Competitive Advantage──A New Value Chain

As platform operators enclose their data, the value of "undiscovered data" hidden within companies, such as operational logs, supply chain data, and customer chats, skyrockets. Japanese companies have the opportunity to refine data that overseas entities find difficult to access because of language and business-practice barriers, and to differentiate themselves globally by leveraging "niche but deep expertise."

15. Conclusion──"Data Quality and Access" Determines AI Competitiveness

The revision of X's terms of use may look like a mere policy change at first glance, but it marks the beginning of a new chapter in the "data acquisition war," one that fundamentally shakes the balance of power in the generative AI industry. Japanese AI developers, companies, and policymakers must urgently establish three pillars:

diversify data procurement and manage legal risks
jointly build open data infrastructure
differentiate through the creation of unique data

Otherwise, competitiveness in the global market may be lost. Conversely, companies that overcome this crisis and achieve **"high-quality unique data × highly efficient models"** will be the winners in the next era of generative AI.

TechCrunch

The Verge

Reuters

Japan Newspaper Publishers & Editors Association

Digital Agency

Reference Article
X changes its terms to bar training of AI models using its content
Source: https://techcrunch.com/2025/06/05/x-changes-its-terms-to-bar-training-of-ai-models-using-its-content/
