A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
This book chapter discusses general trends in misinformation on the web. Misinformation can take many forms, including vandalism, spam, rumors, hoaxes, counterfeit websites, fake product reviews, clickbait, and fake news. The chapter briefly describes each subtopic and presents examples of them in practice. The following section details a comprehensive set of NLP and network analysis studies that have been conducted both to gain further insight into each subtopic and to combat them.
The chapter concludes with a case study based on the authors' research to protect Wikipedia content quality. The open editing mechanism of Wikipedia is ripe for exploitation by bad actors. This occurs mainly through vandalism, but also through page spamming and the dissemination of false information. To combat vandalism, the authors developed the "DePP" system, a tool for detecting which Wikipedia article pages to protect. DePP achieves 92.1% accuracy across multiple languages in this task. The system is based on the following base features: 1) Total average time between revisions, 2) Total number of users making five or more revisions, 3) Total average number of revisions per user, 4) Total number of revisions by non-registered users, 5) Total number of revisions made from mobile devices, and 6) Total average size of revisions. Careful statistical analysis establishes the standard behavior of these metrics, so that malicious revisions can be identified by their deviation from these standards.
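To make the six base features concrete, here is a minimal Python sketch that computes them from a page's revision history. It is an illustration only, not the authors' implementation: the `Revision` record, its field names, and the reading of each "total" as an aggregate over a single page's history are assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Revision:
    """Hypothetical revision record; field names are assumptions."""
    timestamp: datetime
    user: str
    registered: bool   # False for non-registered (IP) editors
    mobile: bool       # True if the edit came from a mobile device
    size_bytes: int    # size of the revision


def base_features(revisions):
    """Compute the six base features listed above for one page."""
    revs = sorted(revisions, key=lambda r: r.timestamp)
    # Seconds elapsed between each pair of consecutive revisions.
    gaps = [
        (later.timestamp - earlier.timestamp).total_seconds()
        for earlier, later in zip(revs, revs[1:])
    ]
    per_user = Counter(r.user for r in revs)
    return {
        "avg_seconds_between_revisions": mean(gaps) if gaps else 0.0,
        "users_with_5_plus_revisions": sum(1 for n in per_user.values() if n >= 5),
        "avg_revisions_per_user": len(revs) / len(per_user) if per_user else 0.0,
        "revisions_by_unregistered": sum(1 for r in revs if not r.registered),
        "revisions_from_mobile": sum(1 for r in revs if r.mobile),
        "avg_revision_size": mean(r.size_bytes for r in revs) if revs else 0.0,
    }
```

A classifier would then compare these values against what is typical for unprotected pages; how DePP models that baseline is not described in this summary.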
To combat spam, the authors developed the "Wikipedia Spammer Detector" (WiSDe). WiSDe uses a framework built upon features that research has revealed to be typical of spammers. These features most notably include the size of the edits, the time required to make edits, and the ratio of links to text within the edits. WiSDe achieved 80.8% accuracy on a dataset of 4.2K users and 75.6K edits, an improvement of 11.1% over ORES. The case study concludes with some findings regarding the retention of new contributors to Wikipedia. The authors proposed a predictive model that achieved high precision (0.99) in predicting which users would become inactive. This model relies on the observation that active users are more involved in edit wars, edit a wider variety of categories, and positively accept critiques.
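As a rough illustration of one of the spam features mentioned above, the ratio of links to text in an edit could be approximated as follows. This is a sketch under stated assumptions, not WiSDe's actual definition: the URL pattern, the character-based ratio, and the function name `link_to_text_ratio` are all choices made for this example.

```python
import re

# Assumed pattern for external links in wikitext: an http(s) URL that runs
# until whitespace or a character that commonly closes a link or template.
URL_RE = re.compile(r"https?://[^\s\]|}<]+")


def link_to_text_ratio(added_text: str) -> float:
    """Share of characters in the added text that belong to URLs."""
    if not added_text:
        return 0.0
    link_chars = sum(len(m.group(0)) for m in URL_RE.finditer(added_text))
    return link_chars / len(added_text)
```

An edit consisting almost entirely of URLs scores close to 1, whereas ordinary prose with an occasional reference scores close to 0.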
See also our earlier coverage of related papers involving the first author: "Detecting Pages to Protect" and "Spam Users Identification in Wikipedia Via Editing Behavior".
An article in the psychology journal Personality and Individual Differences reports on an experiment in a Wikipedia-like wiki, where editors with higher general intelligence scores write higher quality articles (as rated by readers) - but only when contributing non-anonymously. This is interpreted as evidence that contributors successfully "signal" their intelligence to readers (in the sense of signalling theory, which seeks to explain various behaviours in humans and animals that appear to have no direct benefit to the actor by positing that they serve to communicate certain traits or states to observers in an "honest", i.e. difficult to fake fashion).
The authors start out by wondering (like many have before) why "some people share knowledge online, often without tangible compensation", on sites such as Wikipedia, Reddit or YouTube. "Many contributions appear to be unconditionally altruistic and the system vulnerable to free riding. If the selfish gene hypothesis is correct, however, altruism must be apparent and compensated with fitness benefits. As such, our findings add to previous work that tests the costly signaling theory explanations for altruism." (Notably, not all researchers share this assumption about altruistic motivations, see e.g. the preprint by Pinto et al. listed below.)
For the experiment, 98 undergraduate students, who had previously completed the Raven's Advanced Progressive Matrices (RPM) intelligence test, were asked to spend 30 minutes "to contribute to an ostensibly real wiki-style encyclopedia being created by the Department of Communication. Participants were told that the wiki would serve as a repository of information for incoming first-year students and that it would contain entries related to campus life, culture, and academics [...] The wiki resembled Wikipedia and contained a collection of preliminary articles." 38 of the participants were told their contributions would remain anonymous, whereas another 40 "were photographed and told that their photo would be placed next to their contribution", and their names were included with their contribution. (Curiously, the paper doesn't specify the treatment of the remaining 20 participants.) "The quality of all participants' contributions was rated by four undergraduate research assistants who were blind to hypotheses and experimental conditions. [...] The research assistants also judged the contributors' intelligence relative to other participants using a 7-point Likert-type scale (1 Much dumber than average, 7 Much smarter than average)".
The researchers "found that as individuals' scores on Ravens Progressive Matrices (RPM) increased, participants were judged to have written better quality articles, but only when identifiable and not when anonymous. Further, the effect of RPM scores on inferred intelligence was mediated by article quality, but only when signalers were identifiable." They note that their results leave several "important questions" still open, e.g. that "it remains unclear what benefits are gained by signalers who contribute to information pools." Citing previous research, they "doubt a direct relationship to reproductive success for altruism in signaling g in information pools. Technical abilities are not particularly sexually attractive (Kaufman et al., 2014), so it is likely that g mediates indirect fitness benefits in such contexts." It might be worth noting that the study's convenience sample likely differs in its demographics from those of Wikipedia editors, e.g. only 28 of the 98 participating students were male, whereas males are well known to form the vast majority of Wikipedia contributors.
The article is an important contribution to the existing body of literature on Wikipedia editors' motivations to contribute, even if it appears to be curiously unaware of it (none of the cited references contain "Wikipedia" or "wiki" in their title).
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
From the abstract:
"we release Wikipedia Citations, a comprehensive dataset of citations extracted from Wikipedia. A total of 29.3M citations were extracted from 6.1M English Wikipedia articles as of May 2020, and classified as being to books, journal articles or Web contents. We were thus able to extract 4.0M citations to scholarly publications with known identifiers -- including DOI, PMC, PMID, and ISBN -- and further labeled an extra 261K citations with DOIs from Crossref. As a result, we find that 6.7% of Wikipedia articles cite at least one journal article with an associated DOI. Scientific articles cited from Wikipedia correspond to 3.5% of all articles with a DOI currently indexed in the Web of Science."
From the abstract:
"... the sample was reduced to 847 512 references made by 193 802 Wikipedia articles to 598 746 scientific articles belonging to 14 149 journals indexed in Scopus. As highlighted results we found a significative presence of 'Medicine' and 'Biochemistry, Genetics and Molecular Biology' papers and that the most important journals are multidisciplinary in nature, suggesting also that high-impact factor journals were more likely to be cited. Furthermore, only 13.44% of Wikipedia citations are to Open Access journals."
See also earlier work by some of the same authors: "Mapping the backbone of the Humanities through the eyes of Wikipedia"
From the abstract:
"... we built client-side instrumentation for logging all interactions with links leading from English Wikipedia articles to cited references during one month, and conducted the first analysis of readers’ interactions with citations. We find that overall engagement with citations is low: about one in 300 page views results in a reference click (0.29% overall; 0.56% on desktop; 0.13% on mobile). [...] clicks occur more frequently on shorter pages and on pages of lower quality, suggesting that references are consulted more commonly when Wikipedia itself does not contain the information sought by the user. Moreover, we observe that recent content, open access sources, and references about life events (births, deaths, marriages, etc.) are particularly popular."
From the abstract and paper:
"... [We] surveyed [Portuguese Wikipedia] community members and collected secondary data. After excluding outliers, we obtained a final sample with 212 participants. We applied exploratory factor analysis and structural equation modeling, which resulted in a model with satisfactory fit indices. The results indicate that effort influences active contributions, and attitude, altruism by reputation, and altruism by identification influence effort. None of the proposed factors are directly related to active contributions. Experience directly influences self-efficacy while it positively moderates the relation between effort and active contributions. [...] To reach [editors registered on Portuguese Wikipedia], we sent questionnaires to Wikimedia Brasil’s e-mail lists, made an announcement in Wikipedia’s notice section, and sent private messages to members through the platform itself."
From the abstract:
"We examine pages with geotagged content in English Wikipedia in four categories, places with Indigenous majorities (of any size), Rural places, Urban Clusters, and Urban areas. We find significant differences in quality and editor attention for articles about places with Native American majorities, as compared to other places."
This article describes the automatic generation of a Taboo-like game (where players have to describe a word while avoiding a given set of other words), also released as a free mobile app for Android and iOS. From the abstract:
"We present Tabouid, a word-guessing game automatically generated from Wikipedia. Tabouid contains 10,000 (virtual) cards in English, and as many in French, covering not only words and linguistic expressions but also a variety of topics including artists, historical events or scientific concepts. Each card corresponds to a Wikipedia article, and conversely, any article could be turned into a card. A range of relatively simple NLP and machine-learning techniques are effectively integrated into a two-stage process."
From the abstract:
"In this thesis, we [...] develop novel machine learning-based vandalism detectors to reduce the manual reviewing effort [on Wikidata]. To this end, we carefully develop large-scale vandalism corpora, vandalism detectors with high predictive performance, and vandalism detectors with low bias against certain groups of editors. We extensively evaluate our vandalism detectors in a number of settings, and we compare them to the state of the art represented by the Wikidata Abuse Filter and the Objective Revision Evaluation Service by the Wikimedia Foundation. Our best vandalism detector achieves an area under the curve of the receiver operating characteristics of 0.991, significantly outperforming the state of the art; our fairest vandalism detector achieves a bias ratio of only 5.6 compared to values of up to 310.7 of previous vandalism detectors. Overall, our vandalism detectors enable a conscious trade-off between predictive performance and bias and they might play an important role towards a more accurate and welcoming web in times of fake news and biased AI systems."
From the abstract:
"We introduce a trie-based method that can efficiently learn and represent property set probabilities in RDF graphs. [...] We investigate how the captured structure can be employed for property recommendation, analogously to the Wikidata PropertySuggester. We evaluate our approach on the full Wikidata dataset and compare its performance to the state-of-the-art Wikidata PropertySuggester, outperforming it in all evaluated metrics. Notably we could reduce the average rank of the first relevant recommendation by 71%."
From the abstract:
"This article asks to what degree Wikipedia articles in three languages --- Hindi, Urdu, and English --- achieve Wikipedia's mission of making neutrally-presented, reliable information on a polarizing, controversial topic available to people around the globe. We chose the topic of the recent revocation of Article 370 of the Constitution of India, which, along with other recent events in and concerning the region of Jammu and Kashmir, has drawn attention to related articles on Wikipedia. This work focuses on the English Wikipedia, being the preeminent language edition of the project, as well as the Hindi and Urdu editions. [...] We analyzed page view and revision data for three Wikipedia articles [on the English Wikipedia, these were Kashmir conflict, Article 370 of the Constitution of India, and Insurgency in Jammu and Kashmir]. Additionally, we interviewed editors from all three Wikipedias to learn differences in editing processes and motivations. [...] In Hindi and Urdu, as well as English, editors predominantly adhere to the principle of neutral point of view (NPOV), and these editors quash attempts by other editors to push political agendas."
See also the authors' conference poster