The Signpost


Recent research

Two very different encyclopedias

By Tilman Bayer and Mitchsavl


A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Grokipedia, the AI-based online encyclopedia launched by xAI in October 2025 to counter perceived bias on Wikipedia, continues to attract researchers' attention (see also our previous coverage: "Comparing comparisons of Grokipedia vs. Wikipedia by three different research teams").

Comparing Political Framing in Wikipedia and Grokipedia

Reviewed by Mitchsavl

A recent paper titled "Is Grokipedia Right-Leaning? Comparing Political Framing in Wikipedia and Grokipedia on Controversial Topics"[1] provides a comparative analysis of "semantic framing, political orientation, and content prioritization". The study concluded that both encyclopedias were generally left-wing, with Grokipedia showing a small right-wing bias relative to Wikipedia on contentious topics. The authors also found that later sections within articles differed more between the two encyclopedias than the lead sections did.

...these findings challenge the widespread perception of Grokipedia as an extreme right-leaning encyclopedia, instead suggesting broadly comparable tendencies between the two platforms in their treatment of politically controversial topics, while still indicating a modest but consistent right-leaning bias in Grokipedia relative to Wikipedia.
— Is Grokipedia Right-Leaning? Comparing Political Framing in Wikipedia and Grokipedia on Controversial Topics

The study selected six controversial topics, which were the most divisive in polling data from Gallup: abortion, cannabis legalization, climate change, gender identity, gun control, and immigration. Across all these topics, Grokipedia was determined to be shifted towards the right compared to Wikipedia, with cannabis legalization and gun control averaging a right-wing bias.

"Wikipedia and Grokipedia: A Comparison of Human and Generative Encyclopedias"

Reviewed by Tilman Bayer

Another recent preprint[2] by six researchers from Sapienza University of Rome presents "a comparative analysis of Wikipedia and Grokipedia" based on a much larger sample, finding that

"Inclusion is non-uniform: pages with higher visibility and greater editorial conflict in Wikipedia are more likely to appear in Grokipedia. For included pages, we distinguish between verbatim reproduction and generative rewriting. Rewriting is more frequent for pages with higher reference density and recent controversy, while highly popular pages are more often reproduced without modification. [...] Across multiple topical domains, including U.S. politics, geopolitics, and conspiracy-related narratives, narrative structure remains largely consistent between the two sources. Analysis of lead sections shows broadly correlated framing, with localized shifts in laudatory and conflict-oriented language for some topics in Grokipedia."

Like the Cornell researchers whose paper was covered in our previous issue, the authors detected Wikipedia-sourced articles when scraping Grokipedia:

We consider a Grokipedia page to be not rewritten if it contains the standard Creative Commons footer. Specifically, we determine whether a Grokipedia article is kept unchanged by checking for the presence of the following text at the bottom of the page: “The content is adapted from Wikipedia, licensed under Creative Commons Attribution-ShareAlike 4.0 License”.

However, their assumption that those articles are "not rewritten" is somewhat in contrast to the findings of the Cornell team, who calculated a "mean chunk similarity" score between corresponding Grokipedia and Wikipedia articles. At 0.90, that score was higher for articles with the footer than for those without it, but still below a perfect similarity score of 1.0.
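The footer check quoted above is simple enough to sketch in a few lines of Python. This is only an illustration of the rule as described in the quote, not the authors' code: the function names and sample page texts are invented here, and the whitespace normalization is an added assumption to cope with line wrapping in scraped text.

```python
# Footer text as quoted in the paper; all names below are illustrative.
CC_FOOTER = (
    "The content is adapted from Wikipedia, licensed under "
    "Creative Commons Attribution-ShareAlike 4.0 License"
)


def normalize(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break the match."""
    return " ".join(text.split())


def is_marked_as_adapted(page_text: str) -> bool:
    """Return True if the page text contains the Creative Commons footer."""
    return normalize(CC_FOOTER) in normalize(page_text)


# Hypothetical scraped page texts, made up for this example.
pages = {
    "Example A": (
        "Some article body.\n"
        "The content is adapted from Wikipedia, licensed under\n"
        "Creative Commons Attribution-ShareAlike 4.0 License."
    ),
    "Example B": "A rewritten article body without any footer.",
}

labels = {title: is_marked_as_adapted(text) for title, text in pages.items()}
print(labels)
```

As the Cornell similarity scores suggest, a page flagged this way is labeled as adapted from Wikipedia, which is not the same as being textually identical to it.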

"Content framing scores in Grokipedia and Wikipedia articles across U.S. Politics, Geopolitics, and Conspiracy-related pages. Top: fraction of sentences in the lead section that show praise, admiration, or glorification. Bottom: fraction of sentences in the lead section that focus on disputes, disagreements, or controversies. Color intensity is proportional to the difference between the two fractions, while point shape for pages in U.S. Politics refers to their political leaning. The dashed line represents the quadrant bisector, corresponding to an equal fraction on both platforms. Only a subset of pages is labeled for visual clarity, and among these, some are shortened to improve readability. While scores tend to be weakly or moderately correlated, noteworthy outliers emerge, especially among U.S. Politics pages." (Figure 4 from the paper)


"Grokipedia increases human editing activity" on Wikipedia

Reviewed by Tilman Bayer

A preprint titled "How AI Reshapes Human Content Creation: The Case of Wikipedia"[3] by two economists from Wake Forest University offers a surprising conclusion:

"We [...] examin[e] the short-run impact of the introduction of Grokipedia, an AI-generated online encyclopedia operated by xAI, which provides automated summaries that could either substitute for human editing or draw in new contributors. We develop a simple theoretical framework in which AI entries can redirect user attention and stimulate human editing through novel framing, yielding ambiguous effects on UGC [user generated content] ex ante. Using a new panel dataset covering 1.4 million Wikipedia pages of notable individuals, we exploit the fact that only a subset have comparable Grokipedia entries to estimate the causal effect of AI on subsequent human contributions, constructing matched samples of treated and untreated pages within occupational fields. We find a consistent and surprising result: the availability of Grokipedia increases human editing activity. Page views also rise, suggesting that AI entries act as an attention amplifier rather than a pure substitute for Wikipedia content. Exploiting variation in the semantic similarity between Grokipedia entries and their corresponding Wikipedia articles, we further show that pages with lower similarity experience larger increases in editing after Grokipedia’s launch, consistent with the model’s predictions. [...]"

The paper's "Introduction" section points out that

Theoretically, the effect of Grokipedia on Wikipedia’s UGC is ambiguous. AI may act as a substitute: if users rely on Grokipedia entries instead of Wikipedia, the reduced traffic and diminished perceived value of contributing may depress human editing activity. But AI may also act as a complement: users may draw information from Grokipedia to improve or update Wikipedia pages, or the publicity surrounding a new AI platform may direct attention toward existing Wikipedia entries, leading to more edits. The competition for viewers may also elicit greater effort from Wikipedia contributors. Which force dominates is ultimately an empirical question.

(The authors cite this edit as a concrete example of how information on Grokipedia may inspire activity on the corresponding Wikipedia article.)

The paper's statistical analysis focuses on

a new panel dataset of approximately 1.4 million Wikipedia pages of notable individuals across five occupational domains—Academia, Culture, Leaders, Politics, and Sports. Only about 170,000 of these pages (roughly 12%) are covered by Grokipedia at launch, while the remaining 88% do not receive an AI-generated entry.

To assess the impact of Grokipedia's October 2025 launch on these Wikipedia articles, the authors compare views and edits as follows:

Because Grokipedia coverage is concentrated among highly visible pages and baseline visibility varies systematically across occupations, we construct matched samples using Mahalanobis-distance nearest-neighbor matching within occupational fields. Treated pages—those with a Grokipedia entry—are paired with the closest untreated pages based on pre-treatment views, editing histories, long-run readership, and page characteristics, approximating the counterfactual trajectory each treated page would have followed absent Grokipedia. We then estimate treatment effects using a Difference-in-Differences framework, which compares changes in views and edits before and after Grokipedia’s launch between treated pages and their matched controls.
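For readers unfamiliar with the two techniques named in this passage, the following Python sketch shows how Mahalanobis-distance nearest-neighbor matching and a difference-in-differences comparison fit together. It is not the authors' pipeline: the data are synthetic, only two covariates and a single occupational field are used, matching is done with replacement, and the "assumed effect" is a number invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic pre-treatment covariates for one occupational field (illustrative
# only): columns are log page views and log edit counts before the launch.
n_treated, n_control = 50, 500
X_treated = rng.normal(loc=[6.0, 2.0], scale=[1.0, 0.5], size=(n_treated, 2))
X_control = rng.normal(loc=[5.0, 1.5], scale=[1.2, 0.6], size=(n_control, 2))

# The inverse covariance of the pooled covariates defines the Mahalanobis metric.
VI = np.linalg.inv(np.cov(np.vstack([X_treated, X_control]), rowvar=False))


def nearest_control(x, candidates, VI):
    """Index of the candidate row closest to x in squared Mahalanobis distance."""
    diff = candidates - x
    d2 = np.einsum("ij,jk,ik->i", diff, VI, diff)
    return int(np.argmin(d2))


# Nearest-neighbor matching with replacement: one control per treated page.
matches = np.array([nearest_control(x, X_control, VI) for x in X_treated])

# Synthetic change in log edits from the pre- to the post-launch window:
# a common time trend, plus an assumed effect for treated pages, plus noise.
TREND, ASSUMED_EFFECT = 0.1, 0.3
change_treated = TREND + ASSUMED_EFFECT + rng.normal(0, 0.05, n_treated)
change_control = TREND + rng.normal(0, 0.05, n_control)

# Difference-in-differences: treated change minus matched-control change.
did_estimate = change_treated.mean() - change_control[matches].mean()
print(f"DiD estimate: {did_estimate:.3f} (assumed effect: {ASSUMED_EFFECT})")
```

Because the common trend affects treated and control pages alike, it cancels in the subtraction, leaving an estimate close to the assumed effect. The estimate is only credible if matched controls would have followed the same trajectory as treated pages absent the treatment, which is the assumption the matching step is meant to support.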

The analysis of post-launch views and edits is confined to a rather short timespan of just three weeks (October 27–November 16). The authors justify this "focus on short-run outcomes" by observing that

Beginning on October 27, 2025—the day the platform went live—traffic spiked abruptly, reaching over 500,000 daily visits worldwide during its first week, with more than 100,000 daily visits per day originating from the United States alone. Peak attention occurred immediately after launch, exceeding two million global visits on October 28, before declining in the following days.

Briefly


Other recent publications


Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

AI summaries "led to more liberal opinions compared with Wikipedia"


From the abstract:[4]

"Participants read Wikipedia or GPT-4o summaries of two historical events [the Seattle General Strike and the Third World Liberation Front strikes of 1968], with AI summaries maintaining factual accuracy while exhibiting different types of framing biases. Default AI summaries led to more liberal opinions compared with Wikipedia, demonstrating the persuasive capability of LLM's latent biases. Summaries purposefully induced with a liberal framing also led to more liberal opinions, regardless of readers’ ideologies. Summaries constructed with a conservative framing produced conservative shifts primarily among conservative readers."

See also a thread by one of the paper's authors


"WikIPedia: Unearthing a 20-Year History of IPv6 Client Addressing"


From the abstract:[5]

"When Wikimedia users make edits without signing into an account, their IP addresses are used in lieu of a username. Wikimedia site dumps therefore provide researchers with over two decades worth of timestamped client IPv6 addresses to understand address assignments and how they have changed over time and space.
In this work, we extract 19M unique IPv6 addresses from Wikimedia sites like Wikipedia that were used by editors from 2003 to 2024. We use these addresses to understand the prevalence of IPv6 in countries corresponding to Wikimedia site languages, how IPv6 adoption has grown over time, and the prevalence of EUI-64 addressing on client devices like desktops, laptops, and mobile phones."
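As background on the EUI-64 addressing mentioned in the abstract: an interface identifier derived from a 48-bit MAC address (modified EUI-64) carries the bytes `ff:fe` in its middle and has the universal/local bit flipped. The Python sketch below applies that common heuristic using the standard `ipaddress` module. It is not taken from the paper, whose detection method is not described here, and the function names are hypothetical. The example addresses use the 2001:db8::/32 documentation prefix.

```python
import ipaddress
from typing import Optional


def looks_like_eui64(addr: str) -> bool:
    """Heuristic: bytes 11 and 12 of the address are ff:fe (middle of the IID)."""
    packed = ipaddress.IPv6Address(addr).packed  # 16 bytes, network order
    return packed[11] == 0xFF and packed[12] == 0xFE


def embedded_mac(addr: str) -> Optional[str]:
    """Recover the MAC address implied by a modified EUI-64 identifier, if any."""
    packed = ipaddress.IPv6Address(addr).packed
    if not (packed[11] == 0xFF and packed[12] == 0xFE):
        return None
    # Undo the universal/local bit flip on the first byte, then drop ff:fe.
    mac = bytes([packed[8] ^ 0x02]) + packed[9:11] + packed[13:16]
    return ":".join(f"{b:02x}" for b in mac)


print(looks_like_eui64("2001:db8::211:22ff:fe33:4455"))
print(embedded_mac("2001:db8::211:22ff:fe33:4455"))
print(looks_like_eui64("2001:db8::1"))
```

A randomly generated identifier can contain `ff:fe` in that position by chance, so a check like this can produce occasional false positives.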

From the paper:

"The majority (∼64%) of the IPv6 addresses that are logged in Wikimedia edits appear only once"


"Knowledge, neo-liberalism and mediatization: The crystal of Wikipedia"


From the abstract:[6]

"This article presents neo-liberal notions of knowledge and market and explains why this is important for the functioning of digital platforms. Neo-liberals are concerned with everyday knowledge of the common people, their mental states and feelings, not intellectual knowledge. [...] Hayek defines market as a communication system that is digesting dispersed information. Millions of minds are doing data generation and processing. That way, neo-liberals see all digital platforms, including Wikipedia, as markets. Classical encyclopaedias are centrally controlled and expert driven, while neo-liberal markets create knowledge through crowds’ ‘voluntary exchange’ and ‘spontaneous cooperation’. The fundamental difference is that encyclopaedias were an Enlightenment project, while Wikipedia is producing recycled intellectual and layman’s knowledge without any political or revolutionary engagement."


"The negotiation of pronominal address on talk pages of the German, French, and Italian Wikipedia"


This paper found that German and Italian "wikiquette" stipulates the informal "du"/"tu" among editors (instead of the more formal "Sie"/"Lei"), whereas French Wikipedia lacks consensus on "vous" vs. "tu". From the abstract:[7]

"This paper asks [...] how the appropriate use of address pronouns is negotiated on talk pages of the German, French, and Italian Wikipedia. The talk pages of Wikipedia share features of CMC [computer-mediated communication] genres such as a dialogic structure and an informal writing style with non-standard language. There are two types of Wikipedia talk pages, whose data are considered in this study based on the multilingual corpora by the Leibniz Institute for the German Language: article talk pages, where authors negotiate online encyclopedic content, and user talk pages, where the contributions of individual authors are discussed. These two types of talk pages will be analysed for the study."


"Mass collaboration or curatorship? The functioning of Wikipedia needs both"


From the abstract:[8]

"Using the complete dataset of the English, Spanish, and Italian versions of Wikipedia (2001–2020), we analyzed metrics such as the number of articles, creations, or edits performed by users. We calculated their distributions, adapted the Gini index to measure participation inequalities and employed network science methods to understand user-edit interactions. [...]


Our analysis confirms significant disparities in content generation and engagement, emphasizing content editing. However, we demonstrate that these differences coexist with extensive collaboration. Specifically, our findings reveal that disparities in participation levels and collaborative editing complement each other. Curatorial leadership by a central group of contributors is extremely collaborative, while occasional contributors intervene flexibly in specific contexts."
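The abstract mentions an adapted Gini index without describing the adaptation. For orientation, here is a minimal Python sketch of the standard Gini coefficient applied to per-user edit counts. The function name and the sample counts are invented for illustration.

```python
def gini(values):
    """Standard Gini coefficient of non-negative values (0 = perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending values, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical edit counts per user.
equal_editing = [5, 5, 5, 5]
one_heavy_editor = [0, 0, 0, 10]

print(gini(equal_editing))
print(gini(one_heavy_editor))
```

With a finite number of users, the maximum value is (n - 1) / n rather than 1, which is why four users with one doing all the editing yields 0.75.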

From the "Conclusion" section:

"Our first result has been to confirm the existence of strong inequalities in content production and participation that were already highlighted in the literature, focusing in particular on content editing. However, we have also shown how, despite such forms of gatekeeping of the core group of contributors, Wikipedia’s users who manage the vast majority of the edits carry out this task in a largely collaborative manner. That is to say, our analysis suggests that inequalities in the level of participation and high levels of collaboration are not antithetical, but rather mutually reinforcing building blocks of Wikipedia."


"WETBench: A Benchmark for Detecting Task-Specific Machine-Generated Text on Wikipedia"


From the abstract:[9]

We introduce WETBench, a multilingual, multi-generator, and task-specific benchmark for MGT detection. We define three editing tasks empirically grounded in Wikipedia editors’ perceived use cases for LLM-assisted editing: Paragraph Writing, Summarisation, and Text Style Transfer, which we implement using two new datasets across three languages. For each writing task, we evaluate three prompts, produce MGT across multiple generators using the best-performing prompt, and benchmark diverse detectors. We find that, across settings, training-based detectors achieve an average accuracy of 78%, while zero-shot detectors average 58%. These results demonstrate that detectors struggle with MGT in realistic generation scenarios [...]

From the "Experimental setup" section:

We generate MGT using four multilingual models from two families: proprietary and open-weight. [...] For proprietary models, we use GPT4o mini [...] and Gemini 2.0 Flash. For openweight models, we select Qwen2.5-7B-Instruct and Mistral-7B-Instruct. [...]

We evaluate six detectors from three different families: [...] Specifically, we use XLM-RoBERTa [...] and mDeBERTa [...] as training-based detectors, which we fine-tune with hyperparameter search; Binoculars [...], LLR [...], and FastDetectGPT (White-Box) [...] as zero-shot white-box detectors; and Revise-Detect [...], GECScore [...], and FastDetectGPT (Black-Box) [...] as zero-shot black-box detectors.

The proprietary detectors Pangram and GPTZero are not mentioned in the paper. (A recent investigation by Wiki Edu "found [Pangram] to be highly accurate for Wikipedia text".[supp 1])

See also our review of an earlier paper by different authors that had relied on Binoculars (and GPTZero) for its conclusions: "'As many as 5%' of new English Wikipedia articles 'contain significant AI-generated content'"

References

  1. ^ Eibl, Philipp; Coppolillo, Erica; Mungari, Simone; Luceri, Luca (2026-01-21), Is Grokipedia Right-Leaning? Comparing Political Framing in Wikipedia and Grokipedia on Controversial Topics, arXiv, doi:10.48550/arXiv.2601.15484
  2. ^ Hadad, Ortal; Loru, Edoardo; Nudo, Jacopo; Bonetti, Anita; Cinelli, Matteo; Quattrociocchi, Walter (2026-02-05), Wikipedia and Grokipedia: A Comparison of Human and Generative Encyclopedias, arXiv, doi:10.48550/arXiv.2602.05519
  3. ^ Leung, Tin Cheuk; Strumpf, Koleman S. (2025-12-03), How AI Reshapes Human Content Creation: The Case of Wikipedia, Rochester, NY: Social Science Research Network, doi:10.2139/ssrn.5853062 ("Last revised: 30 Jan 2026")
  4. ^ Shu, Matthew; Karell, Daniel; Okura, Keitaro; Davidson, Thomas R. (2026-02-27). "How latent and prompting biases in AI-generated historical narratives influence opinions". PNAS Nexus. 5 (3). doi:10.1093/pnasnexus/pgag022. PMID 41783460.
  5. ^ Rye, Erik; Levin, Dave (2025-12-09), WikIPedia: Unearthing a 20-Year History of IPv6 Client Addressing, arXiv, doi:10.48550/arXiv.2512.08808
  6. ^ Mlađenović, Nikola (2025-10-31). "Knowledge, neo-liberalism and mediatization: The crystal of Wikipedia". Empedocles: European Journal for the Philosophy of Communication. doi:10.1386/ejpc_00066_1. (closed access)
  7. ^ Flinz, Carolina; Gredel, Eva; Herzberg, Laura (2025-06-30). "The negotiation of pronominal address on talk pages of the German, French, and Italian Wikipedia". Exploring digitally-mediated communication with corpora. De Gruyter.
  8. ^ Pilati, Federico; Sacco, Pier Luigi; Artime, Oriol (2025-08-05). "Mass collaboration or curatorship? The functioning of Wikipedia needs both". Online Information Review. 49 (8): 122–133. doi:10.1108/OIR-10-2023-0515. ISSN 1468-4527.
  9. ^ Quaremba, Gerrit; Black, Elizabeth; Vrandecic, Denny; Simperl, Elena (August 2025). "WETBench: A Benchmark for Detecting Task-Specific Machine-Generated Text on Wikipedia". Proceedings of the 2nd Workshop on Advancing Natural Language Processing for Wikipedia (WikiNLP 2025). Vienna, Austria: Association for Computational Linguistics. pp. 10–30. doi:10.18653/v1/2025.wikinlp-1.6. ISBN 9798891762848.
Supplementary references and notes:
  1. ^ Davis, LiAnna (2026-01-29). "Generative AI and Wikipedia editing: What we learned in 2025". Wiki Education.



The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0