Conspiracy Theorists Revealed by "How They Speak" Rather Than "What They Say"? Traces of Language in 500 Million Reddit Posts

Is Conspiracy Thinking Found in "Topics" or in "Narratives"? AI Analyzes 500 Million Reddit Posts

Do people who talk about conspiracy theories use distinctive language even when they are not discussing conspiracy theories?

A research team led by the Polytechnic University of Milan tackled this question through large-scale data analysis. They analyzed over 500 million comments posted on Reddit, the large U.S.-based forum-style social network. The researchers examined how users who had participated in "r/conspiracy," a representative conspiracy theory community, spoke in general-interest communities covering news, science, movies, music, cooking, DIY, and animal photos.

The conclusion of the study is provocative. Users participating in conspiracy theory communities demonstrated certain linguistic characteristics even in places where they were not discussing conspiracy theories. Moreover, these characteristics could be identified with high accuracy by AI models.

However, what the study revealed is not a simple "list of conspiracy theorist words." More importantly, a leaning toward conspiracy theories does not show up uniformly across all topics. It changes with the culture and conversational rules of the community a user is participating in.

In other words, the language surrounding conspiracy theories appears as a "narrative" that changes shape according to the setting, rather than a fixed label.


The Unease in "Neutral Conversations," Seen Across 500 Million Posts

The research team analyzed comments posted over a decade of activity on Reddit, focusing on more than 20 mainstream communities. They compared users who had participated in r/conspiracy with general users who had not.

The key point is that the researchers did not look at posts about conspiracy theories themselves. The analysis focused on statements made in mainstream communities: movie reviews, cooking discussions, music preferences, reactions to scientific news, and casual everyday conversations. In other words, they examined words used in contexts that were not overtly political or conspiratorial.

As a result, the machine learning model was able to distinguish between conspiracy community participants and non-participants with an average accuracy of 87% within each community. This figure is difficult to explain as mere coincidence or bias towards specific topics.

Particularly noticeable were expressions indicating anger or anxiety, words with confrontational or aggressive tones, and vocabulary related to illness or death. The research team extracted psycholinguistic features and aggregated them per user as input to the model. This means the judgment did not rest on any single word. Instead, the emotional and cognitive tendencies running through a user's comments as a whole were handled statistically.

For example, even in the same "cooking" discussion, one person might calmly talk about recipes and flavor innovations. In contrast, another might speak with a mix of anger, distrust, and aggressive expressions. The study captured these differences in conversational tone and the way worldviews seep through.
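The per-user aggregation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny word lists, the category names, and the function names are invented here and are not the study's actual psycholinguistic features or pipeline.

```python
import re
from collections import Counter, defaultdict

# Hypothetical mini-lexicons, invented for illustration only. The study's
# real psycholinguistic categories and word lists are not reproduced here.
LEXICON = {
    "anger": {"hate", "furious", "angry", "rage"},
    "anxiety": {"worried", "afraid", "nervous", "scared"},
    "death": {"die", "dead", "death", "kill"},
}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def user_features(comments):
    """Aggregate category rates per user.

    comments: iterable of (user, text) pairs.
    Returns {user: {category: share of that user's tokens in the category}}.
    """
    counts = defaultdict(Counter)  # user -> category -> matching tokens
    totals = Counter()             # user -> total tokens across all comments
    for user, text in comments:
        tokens = tokenize(text)
        totals[user] += len(tokens)
        for token in tokens:
            for category, words in LEXICON.items():
                if token in words:
                    counts[user][category] += 1
    return {
        user: {
            category: (counts[user][category] / totals[user]) if totals[user] else 0.0
            for category in LEXICON
        }
        for user in totals
    }

if __name__ == "__main__":
    sample = [
        ("user_a", "I hate this recipe"),
        ("user_a", "So angry about the oven"),
        ("user_b", "Lovely soup, thanks for sharing"),
    ]
    for user, features in user_features(sample).items():
        print(user, features)
```

Each user ends up with one vector of rates rather than a verdict per comment, which is the sense in which the tendencies are treated statistically across everything a user wrote.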


"Conspiracy Theorist Words" Do Not Exist

What makes this study intriguing is that while it found common characteristics among conspiracy community participants, it also demonstrated that a "universal detection model" does not work effectively.

According to the research team, trying to cover all communities with a single large model resulted in poorer performance than building an individual model for each community. The difference reached up to 17 points.

This matters because language on social media is not determined solely by the inner thoughts of the poster. In news communities, explanatory words increase; in humor communities, sarcasm and jokes increase; and in hobby communities, calm vocabulary is more common. Each place has its own "atmosphere."

Users participating in conspiracy communities also change their expressions to match that atmosphere. The research team emphasizes this point, suggesting that analysis should be context-specific rather than searching for a single "conspiracy-like word."
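Why a single shared rule can lose to per-community rules is easy to see in a toy example. The sketch below uses entirely synthetic scores (not data from the study) and a deliberately simple one-feature threshold classifier, which stands in for the far richer models the researchers trained. The community names, scores, and function names are all assumptions made for illustration.

```python
def accuracy(samples, threshold):
    """Share of samples where 'score above threshold' matches the label."""
    hits = sum((score > threshold) == label for score, label in samples)
    return hits / len(samples)

def best_threshold(samples):
    """Pick the midpoint between adjacent scores that maximizes accuracy."""
    scores = sorted({score for score, _ in samples})
    candidates = [(a + b) / 2 for a, b in zip(scores, scores[1:])]
    return max(candidates, key=lambda t: accuracy(samples, t))

# Synthetic (score, is_participant) pairs. The "news" community has a higher
# baseline of the feature for everyone than the "cooking" community does.
communities = {
    "news": [(0.40, False), (0.45, False), (0.60, True), (0.65, True)],
    "cooking": [(0.10, False), (0.15, False), (0.30, True), (0.35, True)],
}

# One threshold fitted separately for each community.
per_community = {
    name: accuracy(samples, best_threshold(samples))
    for name, samples in communities.items()
}

# One threshold fitted on all communities pooled together.
pooled = [pair for samples in communities.values() for pair in samples]
pooled_accuracy = accuracy(pooled, best_threshold(pooled))

if __name__ == "__main__":
    print("per-community:", per_community)
    print("pooled:", pooled_accuracy)
```

In this toy setup a score that is ordinary in the news community already looks like a participant's score in the cooking community, so no single cutoff separates both groups cleanly, while a cutoff fitted to each community does.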

This has significant implications for social media moderation and risk detection. Simple banned-word lists, or uniform AI detection that applies the same criteria everywhere, may misread reality. What is a natural expression in one community may be a strong identifying signal in another.


Were There Signs Before Participation?

Another noteworthy point is the claim that these linguistic characteristics were observed even before users participated in conspiracy communities.

The study also analyzed statements made in mainstream communities before users explicitly participated in r/conspiracy. The patterns used for identification were found to be relatively stable rather than suddenly emerging just before participation.

This is difficult to explain with a simple causal relationship like "language changed because they were exposed to conspiracy communities." Rather, it suggests a self-selection aspect where users who originally have certain tendencies towards distrust or emotional expression are later drawn to conspiracy communities.

Of course, this alone cannot definitively determine an individual's psychology or future behavior. Posts on social media are influenced by many factors, including personality, age, culture, political views, living environment, mood of the day, and the atmosphere of the posting destination. The research team also takes a cautious stance against generalization without context.

Nevertheless, the study suggests that the accumulation of casual words may be linked to participation in online communities and to changes in the information environment.


Social Media Reactions: Limited Spread, but Weighty Issues

The Phys.org article introducing this study did not immediately cause a major uproar or large-scale debate. The number of shares shown on the article page was limited, and there were few comments at the time of checking.

On X, a Japanese science information account introduced the article under the theme of "conspiracy theorist nature seeping through words," but the number of reactions was still small as far as could be found by searching. The paper's title was also picked up by Mastodon-based arXiv auto-post bots and news aggregation sites. At present, it appears to be in an initial stage of diffusion through researchers, science news readers, and automated feeds rather than a "major debate among general users."

On LinkedIn, authors Francesco Pierri and Francesco Corso mentioned the paper as accepted at ACL 2026 and referred to related research. Here too, the impression is that it is being shared within the research communities of computational social science, natural language processing, and online safety rather than causing a public uproar.

However, if this research is widely read in the future, several reactions can be expected on social media.

One is the stance supporting enhanced moderation. Conspiracy theories, radicalization, medical misinformation, and false election-related information can impact real society. There is value in early identification of high-risk community formation and platform-side responses tailored to the context.

On the other hand, concerns about freedom of expression and privacy may also arise. The idea of inferring future community participation from "posts not discussing conspiracy theories" may feel like surveillance to some. Even if the technology is developed for research purposes, once it is deployed, problems of misjudgment and labeling cannot be avoided.

In particular, words related to "anger," "anxiety," and "death" are used naturally by people who have nothing to do with conspiracy theories. People talking about experiences of illness, sharing sadness, expressing dissatisfaction with politics, or getting angry about social issues should not be collectively treated as suspicious.

What this study should be taken to offer is not a tool for labeling individuals as "conspiracy theorists," but a clue for carefully tracking the linguistic environment and changes in risk across entire communities.


AI Detection Is Not All-Powerful

The average figure of 87% is eye-catching, but it also means that misjudgments remain. Moreover, misjudgments on social media are not merely statistical errors. They can affect users' opportunities for expression in the form of account restrictions, post deletions, reduced visibility, and exclusion from communities.
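A rough back-of-the-envelope calculation shows how many wrong flags can hide behind a headline accuracy figure. The sketch below rests on two assumptions that do not come from the study: that the 87% applies equally to participants and non-participants, and that participants make up a hypothetical 5% of a community's users. The function name and parameters are illustrative only.

```python
def flag_outcomes(population, prevalence, sensitivity, specificity):
    """Count correct and incorrect flags for a hypothetical detector.

    prevalence:  assumed share of users who are participants.
    sensitivity: assumed share of participants the detector flags.
    specificity: assumed share of non-participants it leaves alone.
    Returns (true_flags, false_flags, precision).
    """
    positives = round(population * prevalence)
    negatives = population - positives
    true_flags = round(positives * sensitivity)
    false_flags = round(negatives * (1 - specificity))
    precision = true_flags / (true_flags + false_flags)
    return true_flags, false_flags, precision

if __name__ == "__main__":
    # Hypothetical scenario: 10,000 users, 5% participants, 87% on both sides.
    true_flags, false_flags, precision = flag_outcomes(10_000, 0.05, 0.87, 0.87)
    print(f"correctly flagged: {true_flags}")
    print(f"wrongly flagged:   {false_flags}")
    print(f"share of flags that are correct: {precision:.1%}")
```

Under these assumed numbers, wrongly flagged users outnumber correctly flagged ones, which is why the consequences listed above cannot be dismissed as a rounding error. A different assumed share of participants would change the balance considerably.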

Also, the high accuracy in the study was achieved within a comparative experiment designed under specific conditions. In actual social media operations, new slang, sarcasm, memes, cultural differences, language differences, bots, trolls, and political campaigns are all in the mix. Features that are effective at one time may not hold at another.

Furthermore, users may change their language if they are aware of detection. If AI surveillance becomes widespread, overt expressions may decrease while coded phrases and in-group jargon increase. This is a recurring problem in countermeasures against extremist communities and spam.

This is why the "context-sensitive intervention" emphasized by the research team becomes important. What is required is not simple detection but an understanding of each community's norms, paired with transparency, explainability, and a mechanism for raising objections.


Are Neutral Places Truly Neutral?

Another question posed by this study is the issue of what constitutes a "neutral community."

We tend to perceive political communities and conspiracy forums as special places. There, extreme discourses fly around, emotions run high, and conflicts arise. In contrast, communities like cooking, music, movies, DIY, and animal photos are considered more peaceful and neutral places.

However, the study shows that it is not that simple. Users do not exist in just one community. In one place they may comment on dog photos; in another, express anger at the news; and in yet another, sympathize with conspiratorial interpretations. Online personas are shaped while moving across multiple spaces.

In other words, the conspiratorial worldview is not confined to dedicated conspiracy spaces. Emotional habits and structures of distrust can seep into everyday conversations, hobby discussions, and brief reactions to news.

This is not about doubting individuals. Rather, it is about understanding how interconnected online spaces are. On social media, hobbies, politics, health, news, entertainment, anger, anxiety, and jokes mix within the same account. How to handle this mixture will be a challenge for future platform design.


The Study Does Not Aim to "Hunt for Dangerous Individuals"

This type of research can take a dangerous turn if handled incorrectly. Expressions like detecting "conspiracy theorist nature" from posts can easily sound like surveillance technology or profiling.

However, what this study truly indicates is not a method for condemning individuals but a method for understanding the linguistic structure of online communities. The important thing is to know how conspiracy theories spread while being linked to emotions, vocabulary, and community culture.

Researchers emphasize the need for context-sensitive analysis and intervention rather than a universal detector. This is an important perspective for platform operators, media, and users alike.

That is because conspiracy theories are not simply "wrong information." They involve anger, anxiety, alienation, distrust, and a sense of belonging to a community. Simply deleting misinformation does not reveal why people are drawn to it.


Words Signal Change Before Affiliations Do

Words on social media are not just about conveying information. They reflect whom you trust, what you fear, and which community you feel safe in.

This study suggests that participation in conspiracy communities may be related to pre-existing linguistic and psychological tendencies rather than to a sudden fall. Of course, this is not determinism. Just because someone uses certain words does not mean they will turn to conspiracy theories.

Even so, when viewed through vast social media data, patterns emerge that are not visible in individual posts: words tinged with anger or anxiety, references to illness or death, and confrontational narratives. Each of these is an everyday expression, but in accumulation they are statistically linked to participation in certain communities.

The challenge now is how to use this knowledge. If used for surveillance or exclusion, social media will deepen distrust. However, if used to understand the context of each community, identify radicalization or isolation early, and design healthier dialogues, it may provide clues to improve online spaces.

Conspiracy theories are not made up of specific words alone. They are born from the overlap of people's anxieties, anger, desire to belong, and the atmosphere of the community. What this study revealed is that this complex overlap also leaves traces in our casual words.



Source URLs

Phys.org: Referenced for the research overview from the Polytechnic University of Milan, the analysis of over 500 million Reddit comments, the average identification accuracy of 87%, and the positioning of the study within research on conspiracy communities on social media.
https://phys.org/news/2026-06-distinctive-language-reveals-conspiracy-community.html

arXiv paper "Among Us: Language of Conspiracy Theorists on Mainstream Reddit": The original research paper. Referenced for the 500 million comments, the 10 years of Reddit activity, the more than 20 mainstream communities, the comparison of community-specific and overall models, and the details of the psycholinguistic features.
https://arxiv.org/abs/2506.05086
https://arxiv.org/html/2506.05086v2

EurekAlert! release: Research announcement by the Polytechnic University of Milan. Referenced for researcher comments, the DOI, the ACL 2026 acceptance, and the mention of related studies on the Jeffrey Epstein case.
https://www.eurekalert.org/news-releases/1131478

Francesco Pierri's LinkedIn post: Referenced for the author's report of the ACL 2026 acceptance and his explanation that the research points to the need for "context-dependent language patterns" and "community-specific moderation."
https://www.linkedin.com/posts/francesco-pierri_nice-coming-back-after-the-easter-break-with-activity-7447191919157280768-zuGL

Francesco Corso's LinkedIn post: Referenced for the author's introduction of related research and his explanation that conspiracy community participants exhibit recognizable language patterns even in mainstream Reddit spaces.
https://www.linkedin.com/posts/francesco-corso-130299132_cs2italy-activity-7463504662990630913-S7W5

X post (Sci佇 Bookends): Referenced as an example of early social media coverage in the Japanese-speaking world. At the time of checking, reactions were limited to introductions of the article rather than large-scale debate.
https://x.com/endBooks/status/2064517821913292819

Gianmarco De Francisci Morales' research list: Referenced for the ACL 2026 publication information of the paper and for the research context of related studies on online conspiracy theories, QAnon, and TikTok.
https://gdfm.me/research/