The Signpost

Recent research

Detecting spam, and pages to protect; non-anonymous editors signal their intelligence with high-quality articles

By Matthew Sumpter and Tilman Bayer

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"Protecting the Web from Misinformation" by detecting Wikipedia spammers and identifying pages to protect

Reviewed by Matthew Sumpter

This book chapter [1] discusses general trends in misinformation on the web. Misinformation can take many forms, including vandalism, spam, rumors, hoaxes, counterfeit websites, fake product reviews, clickbait, and fake news. The chapter briefly describes each subtopic and presents examples of it in practice. The following section details a comprehensive set of NLP and network analysis studies that have been conducted both to gain further insight into each subtopic and to combat it.

The chapter concludes with a case study based on the authors' research to protect Wikipedia content quality. The open editing mechanism of Wikipedia is ripe for exploitation by bad actors. This occurs mainly through vandalism, but also through page spamming and the dissemination of false information. To combat vandalism, the authors developed the "DePP" system, a tool for identifying which Wikipedia article pages to protect. DePP achieves 92.1% accuracy across multiple languages in this task. The system is based on the following base features: 1) Total average time between revisions, 2) Total number of users making five or more revisions, 3) Total average number of revisions per user, 4) Total number of revisions by non-registered users, 5) Total number of revisions made from mobile devices, and 6) Total average size of revisions. By statistically characterizing the standard behavior of these metrics, malicious revisions can be identified as deviations from those baselines.
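
The "deviation from standard behavior" idea can be sketched as a simple outlier test over such features. The following is an illustrative stand-in, not the authors' DePP implementation; the feature values and threshold are invented for the example.

```python
from statistics import mean, stdev

def z_score_outliers(values, threshold=3.0):
    """Return indices of values more than `threshold` standard deviations
    from the mean -- a crude stand-in for flagging pages whose revision
    metrics deviate from the norm."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Example with DePP feature 1 (average time between revisions, in seconds)
# for six hypothetical pages; a sudden burst of rapid-fire edits on the
# last page shows up as an outlier.
avg_seconds_between_revisions = [86400, 79200, 91000, 84000, 88000, 120]
suspicious = z_score_outliers(avg_seconds_between_revisions, threshold=2.0)
```

In practice each of the six base features would be tested this way (or fed jointly into a classifier), rather than a single feature in isolation.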

To combat spam, the authors developed the "Wikipedia Spammer Detector" (WiSDe). WiSDe uses a framework built upon features that research has revealed to be typical of spammers, most notably the size of the edits, the time required to make them, and the ratio of links to text within them. WiSDe achieved 80.8% accuracy on a dataset of 4.2K users and 75.6K edits, an improvement of 11.1% over ORES. The case study concludes with some findings on the retention of new Wikipedia contributors. The authors propose a predictive model that achieved high precision (0.99) in predicting which users would become inactive. This model relies on the observation that active users are more involved in edit wars, edit a wider variety of categories, and respond positively to critiques.
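
The edit-level features named in the review can be illustrated with a small extraction function. This is a hedged sketch: the function name, the exact feature definitions, and the link-matching regex are illustrative assumptions, not WiSDe's actual feature set, which feeds a trained classifier.

```python
import re

def edit_features(text: str, seconds_spent: float) -> dict:
    """Extract toy versions of the spam-indicative features the review
    lists: edit size, time spent on the edit, and link-to-text ratio."""
    links = re.findall(r"https?://\S+", text)
    words = text.split()
    return {
        "size": len(text),                          # edit size in characters
        "seconds_spent": seconds_spent,             # time taken to make the edit
        "link_ratio": len(links) / max(len(words), 1),  # links per word
    }
```

A spammy edit would typically show a short editing time combined with a high link ratio; a classifier over many such edits per user is what separates spammers from good-faith editors.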

See also our earlier coverage of related papers involving the first author: "Detecting Pages to Protect", "Spam Users Identification in Wikipedia Via Editing Behavior"

Editors successfully signal their intelligence by writing high-quality articles - but only when contributing non-anonymously

Reviewed by Tilman Bayer
Peacocks are a well-known example of signalling

An article[2] in the psychology journal Personality and Individual Differences reports on an experiment in a Wikipedia-like wiki, where editors with higher general intelligence scores wrote higher-quality articles (as rated by readers) - but only when contributing non-anonymously. This is interpreted as evidence that contributors successfully "signal" their intelligence to readers (in the sense of signalling theory, which seeks to explain various behaviours in humans and animals that appear to have no direct benefit to the actor by positing that they serve to communicate certain traits or states to observers in an "honest", i.e. difficult-to-fake, fashion).

The authors start out by wondering (like many have before) why "some people share knowledge online, often without tangible compensation", on sites such as Wikipedia, Reddit or YouTube. "Many contributions appear to be unconditionally altruistic and the system vulnerable to free riding. If the selfish gene hypothesis is correct, however, altruism must be apparent and compensated with fitness benefits. As such, our findings add to previous work that tests the costly signaling theory explanations for altruism." (Notably, not all researchers share this assumption about altruistic motivations, see e.g. the preprint by Pinto et al. listed below.)

An IQ test item in the style of a Raven's Progressive Matrices test. Given eight patterns, the subject must identify the missing ninth pattern

For the experiment, 98 undergraduate students, who had previously completed the Raven's Advanced Progressive Matrices (RPM) intelligence test, were asked to spend 30 minutes "to contribute to an ostensibly real wiki-style encyclopedia being created by the Department of Communication. Participants were told that the wiki would serve as a repository of information for incoming first-year students and that it would contain entries related to campus life, culture, and academics [...] The wiki resembled Wikipedia and contained a collection of preliminary articles." 38 of the participants were told their contributions would remain anonymous, whereas another 40 "were photographed and told that their photo would be placed next to their contribution", and their names were included with their contribution. (Curiously, the paper doesn't specify the treatment of the remaining 20 participants.) "The quality of all participants' contributions was rated by four undergraduate research assistants who were blind to hypotheses and experimental conditions. [...] The research assistants also judged the contributors' intelligence relative to other participants using a 7-point Likert-type scale (1 Much dumber than average, 7 Much smarter than average)".

The researchers "found that as individuals' scores on Ravens Progressive Matrices (RPM) increased, participants were judged to have written better quality articles, but only when identifiable and not when anonymous. Further, the effect of RPM scores on inferred intelligence was mediated by article quality, but only when signalers were identifiable." They note that their results leave several "important questions" still open, e.g. that "it remains unclear what benefits are gained by signalers who contribute to information pools." Citing previous research, they "doubt a direct relationship to reproductive success for altruism in signaling g in information pools. Technical abilities are not particularly sexually attractive (Kaufman et al., 2014), so it is likely that g mediates indirect fitness benefits in such contexts." It might be worth noting that the study's convenience sample likely differs in its demographics from those of Wikipedia editors, e.g. only 28 of the 98 participating students were male, whereas males are well known to form the vast majority of Wikipedia contributors.

The article is an important contribution to the existing body of literature on Wikipedia editors' motivations to contribute, even if it appears to be curiously unaware of it (none of the cited references contain "Wikipedia" or "wiki" in their title).


Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

Compiled by Tilman Bayer

6.7% of Wikipedia articles cite at least one academic journal article with DOI

From the abstract:[3]

"we release Wikipedia Citations, a comprehensive dataset of citations extracted from Wikipedia. A total of 29.3M citations were extracted from 6.1M English Wikipedia articles as of May 2020, and classified as being to books, journal articles or Web contents. We were thus able to extract 4.0M citations to scholarly publications with known identifiers -- including DOI, PMC, PMID, and ISBN -- and further labeled an extra 261K citations with DOIs from Crossref. As a result, we find that 6.7% of Wikipedia articles cite at least one journal article with an associated DOI. Scientific articles cited from Wikipedia correspond to 3.5% of all articles with a DOI currently indexed in the Web of Science."

"Science through Wikipedia: A novel representation of open knowledge through co-citation networks"

From the abstract:[4]

"... the sample was reduced to 847 512 references made by 193 802 Wikipedia articles to 598 746 scientific articles belonging to 14 149 journals indexed in Scopus. As highlighted results we found a significative presence of 'Medicine' and 'Biochemistry, Genetics and Molecular Biology' papers and that the most important journals are multidisciplinary in nature, suggesting also that high-impact factor journals were more likely to be cited. Furthermore, only 13.44% of Wikipedia citations are to Open Access journals."

See also earlier by some of the same authors: "Mapping the backbone of the Humanities through the eyes of Wikipedia"

"Quantifying Engagement with Citations on Wikipedia"

From the abstract:[5]

"... we built client-side instrumentation for logging all interactions with links leading from English Wikipedia articles to cited references during one month, and conducted the first analysis of readers’ interactions with citations. We find that overall engagement with citations is low: about one in 300 page views results in a reference click (0.29% overall; 0.56% on desktop; 0.13% on mobile). [...] clicks occur more frequently on shorter pages and on pages of lower quality, suggesting that references are consulted more commonly when Wikipedia itself does not contain the information sought by the user. Moreover, we observe that recent content, open access sources, and references about life events (births, deaths, marriages, etc.) are particularly popular."

See also the research project page on Meta-wiki, and a video recording and slides of a presentation in the June 2020 Wikimedia Research Showcase

Presentation slide illustrating the instrumentation of reader interactions with citations

"Individual Factors that Influence Effort and Contributions on Wikipedia"

From the abstract and paper:[6]

"... [We] surveyed [Portuguese Wikipedia] community members and collected secondary data. After excluding outliers, we obtained a final sample with 212 participants. We applied exploratory factor analysis and structural equation modeling, which resulted in a model with satisfactory fit indices. The results indicate that effort influences active contributions, and attitude, altruism by reputation, and altruism by identification influence effort. None of the proposed factors are directly related to active contributions. Experience directly influences self-efficacy while it positively moderates the relation between effort and active contributions. [...] To reach [editors registered on Portuguese Wikipedia], we sent questionnaires to Wikimedia Brasil’s e-mail lists, made an announcement in Wikipedia’s notice section, and sent private messages to members through the platform itself."

"Approaches to Understanding Indigenous Content Production on Wikipedia"

From the abstract:[7]

"We examine pages with geotagged content in English Wikipedia in four categories, places with Indigenous majorities (of any size), Rural places, Urban Clusters, and Urban areas. We find significant differences in quality and editor attention for articles about places with Native American majorities, as compared to other places."

"Tabouid: a Wikipedia-based word guessing game"

This article describes the automatic generation of a Taboo-like game (where players have to describe a word while avoiding a given set of other words), also released as a free mobile app for Android and iOS. From the abstract:[8]

"We present Tabouid, a word-guessing game automatically generated from Wikipedia. Tabouid contains 10,000 (virtual) cards in English, and as many in French, covering not only words and linguistic expressions but also a variety of topics including artists, historical events or scientific concepts. Each card corresponds to a Wikipedia article, and conversely, any article could be turned into a card. A range of relatively simple NLP and machine-learning techniques are effectively integrated into a two-stage process. "

"Vandalism Detection in Crowdsourced Knowledge Bases"

From the abstract:[9]

"In this thesis, we [...] develop novel machine learning-based vandalism detectors to reduce the manual reviewing effort [on Wikidata]. To this end, we carefully develop large-scale vandalism corpora, vandalism detectors with high predictive performance, and vandalism detectors with low bias against certain groups of editors. We extensively evaluate our vandalism detectors in a number of settings, and we compare them to the state of the art represented by the Wikidata Abuse Filter and the Objective Revision Evaluation Service by the Wikimedia Foundation. Our best vandalism detector achieves an area under the curve of the receiver operating characteristics of 0.991, significantly outperforming the state of the art; our fairest vandalism detector achieves a bias ratio of only 5.6 compared to values of up to 310.7 of previous vandalism detectors. Overall, our vandalism detectors enable a conscious trade-off between predictive performance and bias and they might play an important role towards a more accurate and welcoming web in times of fake news and biased AI systems."

"SchemaTree: Maximum-Likelihood Property Recommendation for Wikidata"

From the abstract:[10]

"We introduce a trie-based method that can efficiently learn and represent property set probabilities in RDF graphs. [...] We investigate how the captured structure can be employed for property recommendation, analogously to the Wikidata PropertySuggester. We evaluate our approach on the full Wikidata dataset and compare its performance to the state-of-the-art Wikidata PropertySuggester, outperforming it in all evaluated metrics. Notably we could reduce the average rank of the first relevant recommendation by 71%."

NPOV prevails in Hindi, Urdu, and English Wikipedia articles about the Jammu and Kashmir conflict

From the abstract:[11]

"This article asks to what degree Wikipedia articles in three languages --- Hindi, Urdu, and English --- achieve Wikipedia's mission of making neutrally-presented, reliable information on a polarizing, controversial topic available to people around the globe. We chose the topic of the recent revocation of Article 370 of the Constitution of India, which, along with other recent events in and concerning the region of Jammu and Kashmir, has drawn attention to related articles on Wikipedia. This work focuses on the English Wikipedia, being the preeminent language edition of the project, as well as the Hindi and Urdu editions. [...] We analyzed page view and revision data for three Wikipedia articles [on the English Wikipedia, these were Kashmir conflict, Article 370 of the Constitution of India, and Insurgency in Jammu and Kashmir ]. Additionally, we interviewed editors from all three Wikipedias to learn differences in editing processes and motivations. [...] In Hindi and Urdu, as well as English, editors predominantly adhere to the principle of neutral point of view (NPOV), and these editors quash attempts by other editors to push political agendas."

See also the authors' conference poster


  1. ^ Spezzano, Francesca; Gurunathan, Indhumathi (2020). "Protecting the Web from Misinformation". In Mohammad A. Tayebi; Uwe Glässer; David B. Skillicorn (eds.). Open Source Intelligence and Cyber Crime: Social Media Analytics. Lecture Notes in Social Networks. Cham: Springer International Publishing. pp. 1–27. ISBN 9783030412517.
  2. ^ Yoder, Christian N.; Reid, Scott A. (2019-10-01). "The quality of online knowledge sharing signals general intelligence". Personality and Individual Differences. 148: 90–94. doi:10.1016/j.paid.2019.05.013. ISSN 0191-8869.
  3. ^ Singh, Harshdeep; West, Robert; Colavizza, Giovanni (2020-07-14). "Wikipedia Citations: A comprehensive dataset of citations with identifiers extracted from English Wikipedia". arXiv:2007.07022 [cs]. Dataset
  4. ^ Arroyo-Machado, Wenceslao; Torres-Salinas, Daniel; Herrera-Viedma, Enrique; Romero-Frías, Esteban (2020-02-10). "Science through Wikipedia: A novel representation of open knowledge through co-citation networks". PLOS ONE. 15 (2): e0228713. doi:10.1371/journal.pone.0228713. ISSN 1932-6203.
  5. ^ Piccardi, Tiziano; Redi, Miriam; Colavizza, Giovanni; West, Robert (2020-04-20). "Quantifying Engagement with Citations on Wikipedia". Proceedings of The Web Conference 2020. WWW '20. New York, NY, USA: Association for Computing Machinery. pp. 2365–2376. doi:10.1145/3366423.3380300. ISBN 9781450370233. Author's copy
  6. ^ Pinto, Luiz F.; Santos, Carlos Denner dos; Onoyama, Silvia (2020-07-14). "Individual Factors that Influence Effort and Contributions on Wikipedia". arXiv:2007.07333 [cs].
  7. ^ Sethuraman, Manasvini; Grinter, Rebecca E.; Zegura, Ellen (2020-06-15). "Approaches to Understanding Indigenous Content Production on Wikipedia". Proceedings of the 3rd ACM SIGCAS Conference on Computing and Sustainable Societies. COMPASS '20. Ecuador: Association for Computing Machinery. pp. 327–328. doi:10.1145/3378393.3402249. ISBN 9781450371292.
  8. ^ Bernard, Timothée (July 2020). "Tabouid: a Wikipedia-based word guessing game". Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. Online: Association for Computational Linguistics. pp. 24–29. doi:10.18653/v1/2020.acl-demos.4.
  9. ^ Heindorf, Stefan (2019). Vandalism Detection in Crowdsourced Knowledge Bases (Thesis). Paderborn, Germany: Paderborn University. S2CID 209517598. (dissertation)
  10. ^ Gleim, Lars C.; Schimassek, Rafael; Hüser, Dominik; Peters, Maximilian; Krämer, Christoph; Cochez, Michael; Decker, Stefan (2020). "SchemaTree: Maximum-Likelihood Property Recommendation for Wikidata". In Andreas Harth; Sabrina Kirrane; Axel-Cyrille Ngonga Ngomo; Heiko Paulheim; Anisa Rula; Anna Lisa Gentile; Peter Haase; Michael Cochez (eds.). The Semantic Web. Lecture Notes in Computer Science. Cham: Springer International Publishing. pp. 179–195. doi:10.1007/978-3-030-49461-2_11. ISBN 9783030494612.
  11. ^ Hickman, Molly G.; Pasad, Viral; Sanghavi, Harsh; Thebault-Spieker, Jacob; Lee, Sang Won (2020-06-17). "Wiki HUEs: Understanding Wikipedia practices through Hindi, Urdu, and English takes on evolving regional conflict". Proceedings of the 2020 International Conference on Information and Communication Technologies and Development. ICTD2020. Guayaquil, Ecuador: Association for Computing Machinery. pp. 1–5. doi:10.1145/3392561.3397586. ISBN 9781450387620.


Discuss this story

There exists an entire scholarly industry - going back more than a century - debating the concept of general intelligence, how or whether it can be quantified, what the biases of such tests might be etc. In the review, I linked to the articles about the journal and about g, as pathways for readers to learn more about the field's background.
That said, the experiment found actual evidence for a relation between intelligence as measured by these tests, and article quality as rated by readers unaware of the editors' scores. In other words, the researchers' result contradicts your conclusion that these scores are as unrelated to the article-writing task as "skull shapes". (And I guess they would not dispute that there could be other relevant dimensions, such as emotional intelligence - a concept and measure that of course has attracted its own share of criticism and validity concerns, or that their experiment setup did not simulate every possible aspect of editing Wikipedia, like AfD discussions.)
"I suggest that a more useful study would be to simply ask editors what their motivation is" - several such studies have been done, see the link in the review.
Regards, HaeB (talk) 17:10, 4 September 2020 (UTC)[reply]
I wasn't meaning to criticise your summary of the study, per se. I hope my comment is clear that I am no professional in this area but as a layperson with an interest, I am aware of the g factor and some of the other topics you mention. I think you have focused too specifically on the exact wording I used (including the mention of phrenology, which was hyperbole for comedic effect rather than a literal "conclusion") and less broadly on the idea that researchers' implicit preconceptions regarding intelligence and people acting out of self-interest influence results that are published (given such things as false positives and publication bias). I am aware of studies similar to that which I suggested but I believe, given the current replication crisis, another study in the same direction would be more useful than one that is quite far removed from the way Wikipedia works in practice. — Bilorv (talk) 19:13, 4 September 2020 (UTC)[reply]


The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0