A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
A paper in The Economic Journal, titled "Public Good Superstars: a Lab-in-the-Field Study of Wikipedia",[1] presents results from a nine-year (2011–2020) study of the motivations and contributions of English Wikipedia editors. From the abstract:
"Over 9 consecutive years, we study the relationship between social preferences – reciprocity, altruism, and social image – and field cooperation. Wikipedia editors are quite prosocial on average, and superstars even more so. But while reciprocal and social image preferences strongly relate to contribution quantity among casual editors, only social image concerns continue to predict differences in contribution levels between superstars. In addition, we find that social image driven editors – both casual and superstars – contribute lower quality content on average. Evidence points to a perverse social incentive effect, as quantity is more readily observable than quality on Wikipedia."
The study operationalizes these concepts using data from several sources. The sample consists of 730 English Wikipedia editors who volunteered to participate in a 2011 online survey and experiment designed to gauge their reciprocity and altruism. Participants were classified as "free-rider", "weak reciprocator", "reciprocator" or "altruist" according to their decisions in a public goods game. A topline result indicates, perhaps unsurprisingly, that among Wikipedians there are fewer free-riders and more altruists than in typical experimental subject pools:
[...] the overwhelming majority of our subjects behave either as full or weak reciprocators (38 and 47%, respectively). The proportion of free-riders (about 7% in our data) does appear lower than the proportion of 20-30% usually obtained with more standard subject pools, however. Similarly, more subjects behave as pure altruists in our data (about 9%).
Furthermore, the paper uses the concept of "superstar contributors", defined generally as "highly regarded community members with impressive contribution records", and operationalized in the case of Wikipedia as editors who have received a barnstar. Among these, the editors who chose to display at least one such award on their user page are classified as "social signalers." (More precisely, the authors try to control for the fact that editors who contribute more may be more likely to display a barnstar simply because they are more likely to have received one – e.g. by taking into account the size of the editor's user page and the total number of barnstars received.)
The authors had already used this data in some earlier publications, which we covered here back in 2013 ("What drives people to contribute to Wikipedia? Experiment suggests reciprocity and social image motivations"). In the new paper, they also look at these 730 editors' contributions over the period from 2011 to 2020, specifically
how likely editors are to delete (i.e., “revert”) the contributions of others without providing an explanation [...] Wikipedia contributors typically consider non justified reverts as highly uncooperative and harmful to the project.
Among other results, the authors
uncovered a surprising negative correlation between our measures of contribution quantity and quality at the editor level. Namely, the social signalers in our data, if they contribute significantly more content to Wikipedia, also contribute lower quality material on average. In practice, this means that, as vetted by their peers, social signalers contribute content that persists about 38% less revisions on average.
Two of several "interesting patterns" highlighted by the authors concern editors' age and education level (two of the demographic variables from the 2011 survey):
older editors appear more cooperative by two of our measures: (i) they tend to contribute significantly more content [...], and (ii) they are less likely to leave their reverts unexplained [...]
[..] editors’ level of education is strongly associated with the quality of their edits [...]. Out of an 8-points scale, each additional degree level yields an average increase of 6% in content persistence. This represents a sizeable number: all else equal, an editor moving from the lowest education level in our data (i.e., who did not complete high school), to the highest (i.e., earned a PhD), would thus see the persistence of their contributions increase by 48% on average.
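As a quick sanity check on that last figure (our arithmetic, not the paper's, and assuming the move from the lowest to the highest education level spans eight degree levels on the scale):

\[ 8 \text{ degree levels} \times 6\% \text{ per level} = 48\% \]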
A second publication offers guidance for academic mathematicians on creating and editing Wikipedia's mathematics articles. From the article:[2]
"In this overview, we will discuss how to go about creating or editing an article on a mathematical subject. [...] We will also discuss biographies of mathematicians, articles on mathematical books, and the social dynamics of the Wikipedia editor community."
The authors (all experienced Wikipedia editors) aptly cover various misunderstandings and pitfalls that academic mathematicians might encounter when contributing to Wikipedia. For example, the "Writing About Your Own Work" section advises: "Rather than advertising their own super-specialization, experts can make themselves useful by explaining the prerequisites to understanding it. What articles would a student read in order to understand the background and broader context of your research?" Somewhat ironically, the paper's first paragraph illustrates one such tension between the conventions of academia and Wikipedia:
This essay incorporates with permission material from our pseudonymous colleague XOR'easter,[supp 1] who also contributed many suggestions during the writing process. By the extent of XOR’easter’s contributions, they would normally be credited as an author. However it was not possible in time to find a way to strictly preserve anonymity and assign legal copyright. All four contributors disagree with this exclusion. I regret its necessity — Ed.
The paper's title includes a rather cringe-y pun referring to the Principia Mathematica.
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
From the abstract:[3]
"we introduce the SustainPedia dataset, which compiles data from over 40K Wikipedia articles, including each article's sustainable success label and more than 300 explanatory features such as edit history, user experience, and team composition. Using this dataset, we develop machine learning models to predict the sustainable success of Wikipedia articles. Our best-performing model achieves a high AU-ROC score of 0.88 on average. Our analysis reveals important insights. For example, we find that the longer an article takes to be recognized as high-quality, the more likely it is to maintain that status over time (i.e., be sustainable). Additionally, user experience emerged as the most critical predictor of sustainability."
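For readers curious what such a prediction task looks like in practice, here is a minimal sketch (not the authors' code; the file name, column names, choice of gradient boosting, and 5-fold setup are all assumptions on our part) of training a classifier on per-article features and reporting an average AU-ROC, the metric quoted above:

```python
# Illustrative sketch only -- not the SustainPedia authors' pipeline.
# Assumes a CSV with one row per Wikipedia article: numeric feature columns
# (edit history, user experience, team composition, ...) plus a binary
# "sustainable" label. The file and column names here are hypothetical.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

df = pd.read_csv("sustainpedia_features.csv")   # hypothetical file name
X = df.drop(columns=["sustainable"])            # explanatory features
y = df["sustainable"]                           # 1 = article kept its high-quality status

# Average AU-ROC over 5 cross-validation folds, comparable in spirit
# to the 0.88 average reported in the abstract.
model = GradientBoostingClassifier()
scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
print(f"mean AU-ROC: {scores.mean():.2f}")
```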
From the abstract:[4]
"Using a large-scale examination of publicly available data, we assessed whether species across 6 taxonomic groups received more page views on Wikipedia when the species was named after a celebrity than when it was not. We conducted our analysis for 4 increasingly strict thresholds of how many average daily Wikipedia page views a celebrity had (1, 10, 100, or 1000 views). Overall, we found a high probability (0.96–0.98) that species named after celebrities had more page views than their closest relatives that were not named after celebrities, irrespective of the celebrity threshold."
From the abstract:[5]
"This qualitative discourse analysis of editors' debates around climate change on Wikipedia argues that their hesitancy to 'declare crisis' is not a conscious editorial choice as much as an outcome of a friction between the folk philosophy of science Wikipedia is built upon, editors' own sense of urgency, and their anticipations about audience uptake of their writing. This friction shapes a group style that fosters temporal ambiguity. Hence, the findings suggest that in the [English] Wikipedia entry on climate change, platform affordances and contestation of expertise foreclose a declaration of climate crisis."