The Signpost

Recent research

Gender gap and conflict aversion; collaboration on breaking news; effects of leadership on participation; legacy of Public Policy Initiative

By Tilman Bayer, Piotr Konieczny, Jodi.a.schneider, Hfordsa and Dario Taraborelli
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, edited jointly with the Wikimedia Research Committee and republished as the Wikimedia Research Newsletter.

Wikipedia research at CSCW 2012

The annual ACM conference on computer-supported cooperative work, in its 15th edition (CSCW 2012), featured two sessions about Wikipedia studies. The first, titled "Scaling our Everest" (in amusing contrast to an earlier metaphor for the role of Wikipedia in that field of research: "the fruit fly of social software"), covered four papers. A second session likewise comprised four papers and notes. Below are some of the highlights from these two sessions.

Gender gap connected to conflict aversion and lower confidence among women

The Gender Gap hub on Meta.

Since January 2011, Wikipedia's "gender gap" has received much attention from Wikimedians, researchers and the media, triggered by a New York Times article that cited the estimate that only 12.64% of Wikipedia contributors are female. That figure came from the 2010 UNU-MERIT study, which was based on the first global, general survey of Wikipedia users, conducted in 2008 with 176,192 respondents. The survey's methodology had raised some questions (e.g. about sample bias and selection bias), but other studies found similarly low ratios. A new paper titled "Conflict, Confidence, or Criticism: An Empirical Examination of the Gender Gap in Wikipedia"[1] has now delved further into the data of the UNU-MERIT study, examining the responses to questions such as "Why don't you contribute to Wikipedia?" and "Why did you stop contributing to Wikipedia?", and finding strong support for the following three hypotheses:

A fourth hypothesis likewise tested a conjecture that has been brought up several times in discussion about Wikipedia's gender gap:

However, the paper's authors argued that this conjecture was not borne out by the data, instead finding that "men are 19% more likely to select 'I didn't have time to go on' as a reason for no longer contributing."

Making sense of NPOV

A paper titled "From Individual Minds to Social Structures: The Structuring of an Online Community as a Collective–Sensemaking Process" [2] looks at how Wikipedia editors talked about the Neutral point of view (NPOV) policy in the period of July 2005 to January 29, 2006, using Karl Weick's model of sensemaking and Anthony Giddens' theory of structuration for its theoretical approach. The paper's focus was on "how individual sensemaking efforts turn into interacts"; in other words, trying to understand how editors came to understand the NPOV policy through examining their posts. Editors' posts were differentiated into three types of questions (asking clarificatory questions, asking about behavior and the rules, and using questions as rhetorical devices) and answers (offering interpretation, explanation to others, and explanation to oneself).

Public Policy Initiative motivated students to become Wikipedians

Wikimedia Director Sue Gardner shows 11 packs of printer paper: the equivalent volume of content produced by students in the Wikipedia Public Policy Initiative.

In a paper titled "Classroom Wikipedia participation effects on future intentions to contribute"[3] (presentation slides), five Michigan-based researchers looked at a sample of over 400 students who were involved in a pilot of the WMF education initiative (87% of whom were native speakers of English), and asked how likely the student-editors were to become real editors after the end of their class projects, and what the relevant factors in such conversions are. They find that the student retention ratio is higher than the average editor retention ratio (while only 0.0002% of editors who make one edit become regulars, about 4% of students have made edits after their course ended). About 75% of the students preferred the Wikipedia assignment to a regular one, and major reasons for their enjoyment included the level of engagement in class, an appreciation of global visibility of the article, and the exposure to social media.

In related news, Erik Olin Wright, president of the American Sociological Association (ASA), who last year announced the organization's "Wikipedia Initiative", posted an overview[4] of a graduate seminar he conducted with a Wikipedia component. The students had to review a book, and use their newly gained knowledge to expand a relevant article on Wikipedia. In his assessment, Wright called the activity a "great success" and encouraged others to engage in similar activities.

High-tempo contributions: Who edits breaking news articles?

Editor-focused and article-focused interactions emerging from a study of high-tempo collaboration in the English Wikipedia, Keegan et al. (2012).[5]

A team based at Northwestern University studied how topics of a specific nature find matching contributors in Wikipedia, or more precisely: "how editors with particular skills self-organize around articles requiring different forms of collaboration". The study[5] focused on the case of co-authorship in the context of breaking news articles. The authors note that such articles pose an interesting paradox: those that undergo a high-tempo editing cycle involving multiple contributors at once typically manifest quality issues, as the increased cost of interaction inhibits quality improvement work, yet in the unique case of breaking news articles, quality tends to remain very high despite multiple contributors attempting to make simultaneous edits with incomplete information or poor coordination.

The study uses revision data describing 58,500 contributions from 14,292 editors to 249 English Wikipedia articles about commercial airline disasters and represents them as a bipartite network characterized as article and editor nodes. A statistical model (p*/ERGM) is applied to estimate the likelihood of the creation of a link between a pair of nodes as a function of specific network properties or node attributes. The analysis focuses both on attributes of each set of nodes (e.g. whether an article is "breaking news", or the number of editor contributions) as well as properties of article-editor pairs as illustrated in the figure. Some of the main results of the study were:
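As an aside on method rather than results, the bipartite representation described in this section can be sketched in a few lines of Python. This is only an illustration of the data structure: the revision records, editor names and article titles below are hypothetical, and the p*/ERGM estimation itself is not shown.

```python
from collections import defaultdict

# Hypothetical revision records: one (editor, article) pair per contribution.
revisions = [
    ("EditorA", "Flight 1 crash"),
    ("EditorA", "Flight 2 crash"),
    ("EditorB", "Flight 1 crash"),
    ("EditorC", "Flight 1 crash"),
    ("EditorC", "Flight 1 crash"),  # repeated contribution to the same article
]


def build_bipartite(records):
    """Return (editor -> set of articles, article -> set of editors)."""
    editor_to_articles = defaultdict(set)
    article_to_editors = defaultdict(set)
    for editor, article in records:
        # Links only ever join an editor node to an article node,
        # which is what makes the network bipartite.
        editor_to_articles[editor].add(article)
        article_to_editors[article].add(editor)
    return dict(editor_to_articles), dict(article_to_editors)


def density(editor_to_articles, article_to_editors):
    """Share of all possible editor-article links that are present."""
    links = sum(len(arts) for arts in editor_to_articles.values())
    possible = len(editor_to_articles) * len(article_to_editors)
    return links / possible if possible else 0.0


editors, articles = build_bipartite(revisions)
print(sorted(articles["Flight 1 crash"]))
print(round(density(editors, articles), 3))
```

A model of the kind the study describes would then ask how the presence of each editor-article link depends on node attributes (for example, whether the article is breaking news) and on the surrounding network structure.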

How different kinds of leadership messages increase or decrease participation

Centralized or shared leadership?

Three social computing researchers from Carnegie Mellon University measured the "Effectiveness of Shared Leadership"[6] on the English Wikipedia – a model where leadership is not restricted to a few community members in a specialized role, but rather distributed among many. In an earlier paper (reviewed in a previous report), they had found evidence for shared leadership from an analysis of four million user talk page messages from a January 2008 dump of the English Wikipedia, classifying them (using machine learning) into four kinds of behavior indicating different kinds of "leadership": "transactional leadership" (positive feedback), "aversive leadership" (negative feedback), "directive leadership" (providing instructions) and "person-focused leadership" (indicated by "greeting words and smiley emoticons").

Based on this data, the present paper examines whether these four forms of messages increase or decrease the edit frequency of the user who receives them, also taking into account whether the message comes from an administrator or a non-administrator. Their first conclusion is that messages sent by both kinds of editors "significantly influenced other members’ motivation", and secondly, they found that "transactional leaders and person-focused leaders were effective in motivating others, whereas aversive leaders' transactional and person-based leadership had the strongest effects, suggesting that interfaces and mechanisms that make it easier for editors to connect with, reward, and express their appreciation for each other may have the greatest benefits." (The sample predates the introduction of the "WikiLove" software extension which has exactly this goal.)

Addressing a common objection by active Wikipedians in defense of warning messages, they acknowledge that "[p]eople may argue that reducing the activity of harmful editors is a positive impact of aversive leadership. However, considering the fact that there is much work to be accomplished in Wikipedia and the recent downward trend of active editors, pure aversive leadership should be avoided." The paper did not attempt to measure the quality of the work of the message recipients.

The researchers had to use a technique called propensity score matching to address the difficulty that true experimentation – for instance, separating users into treatment and control groups – was not possible in this purely observational approach. However, they separately examined the case of Betacommandbot, which had sent "more than half of the messages categorized as aversive leadership" in the sample, warning users who had uploaded a non-free image without a valid fair use rationale. Because these messages had been sent to editors regardless of whether their contributions were in violation of policy at the time they were made, "the Betacommandbot warning was a natural experiment, like a change in speeding laws, that was not induced by recipients’ behavior". The effect of this warning was to decrease the recipients' edits by more than 10%.
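For readers unfamiliar with propensity score matching, the matching step can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes the propensity scores (the estimated probability of receiving a message, given an editor's observed characteristics) have already been computed, for instance by logistic regression, and all unit names, scores and edit counts are hypothetical.

```python
def match_nearest(treated, controls, caliper=0.05):
    """Pair each treated unit with the closest unused control unit.

    treated, controls: lists of (unit_id, propensity_score, outcome).
    Returns a list of (treated_outcome, control_outcome) pairs.
    Treated units with no control within `caliper` stay unmatched.
    """
    available = list(controls)
    pairs = []
    for _, score, outcome in sorted(treated, key=lambda unit: unit[1]):
        if not available:
            break
        best = min(available, key=lambda unit: abs(unit[1] - score))
        if abs(best[1] - score) <= caliper:
            pairs.append((outcome, best[2]))
            available.remove(best)  # match without replacement
    return pairs


def average_effect_on_treated(pairs):
    """Mean outcome difference (treated minus control) over matched pairs."""
    if not pairs:
        raise ValueError("no matched pairs")
    return sum(t - c for t, c in pairs) / len(pairs)


# Hypothetical editors: (id, propensity score, edits in the following period).
received_message = [("t1", 0.30, 8), ("t2", 0.60, 12), ("t3", 0.95, 20)]
no_message = [("c1", 0.31, 10), ("c2", 0.58, 15), ("c3", 0.10, 3), ("c4", 0.66, 11)]

pairs = match_nearest(received_message, no_message)
print(pairs)
print(average_effect_on_treated(pairs))
```

The idea is that editors with similar scores were about equally likely to receive a message, so comparing their later activity approximates the comparison a randomized experiment would have made. The caliper discards treated editors (here the hypothetical "t3") for whom no comparable control exists.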

Other CSCW 2012 contributions

Wikipedia discourse on Europe analyzed

A master's thesis by Dušan Miletić on Europe According to English Wikipedia: Open-sourcing the Discourse on Europe[12] looks at the nature of the discourse on Europe in the English Wikipedia, employing Foucauldian discourse analysis, which focuses on analyzing the power in relationships as expressed through language. The thesis notes that "changes to the statements defining what Europe is, which hold the cardinal role in the discourse, had much more significance than others." In other words, the editors who succeeded in changing the definition of Europe were subsequently able to have their points of view better represented in the remainder of the article. Another finding suggests that the definition of European culture was much more difficult to arrive at, and spawned many more revisions throughout the article, than the discussion of the geography of Europe. Another aspect discussed in the thesis is the blurry boundary between Europe and the European Union. The thesis concludes that the borders of European culture are not the same as the borders of geographical Europe, and hence, that the difficult task of defining Europe – and revising the Wikipedia article – is bound to continue.

The significance of the first edit

A paper titled "Enrolled Since the Beginning: Assessing Wikipedia Contributors' Behavior by Their First Contribution"[13] by researchers at Telecom Bretagne looks at an editor's first contribution as an indicator of her future level of involvement in the project. After having discovered Wikipedia, the sooner one makes their first edit, the higher the likelihood they will continue editing. Reasons for the first edit matter, as those who just want to see how a wiki works are less likely to keep editing than those who want to share (improve) something specific, content-wise. Making a minor edit is much less likely to result in a highly active editor; those who will become very active are often those whose very first edit required a large investment of time. As the authors note, "it seems that those who will become the core editors of the community have a clearly defined purpose since the beginning of their participation and don’t waste their time with minor improvements on existing articles". Finally, the authors find that having a real life contact who shows one how to edit Wikipedia is much more likely to result in that person becoming a regular Wikipedia contributor, compared to people who learn how to edit by themselves.

Given enough eyeballs, do articles become neutral?

Building on their previously reviewed research, Greenstein and Zhu ask[14] "will enough eyeballs eliminate or decrease the amount of bias when information is controversial, subjective, and unverifiable?" Their research calls this into question, by taking a statistical approach to measuring bias in Wikipedia articles about US political topics, which uses Linus’ Law ("Given enough eyeballs, all bugs are shallow") as a null hypothesis.

They rely on a slant index previously developed for studying news media bias, which specifies certain code words as indicating Republican or Democratic bias. Within their sample of 28,382 articles relating to American politics, they find that the category and vintage of an article are most predictive of bias. "Topics of articles with the most Democrat words are civil rights, gun control, and homeland security. Those with the most Republican words are abortion, foreign policy, trade, tax reform, and taxation. ... [T]he slant and bias are most pronounced for articles born in 2002 and 2003". While they do not find a neutral point of view within each article or topic, across articles, Wikipedia balances Democratic and Republican points of view.
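To give a rough sense of how a phrase-based slant measure works, here is a toy sketch. It is a deliberate simplification and not the index used in the paper: the phrase lists are hypothetical placeholders, and the normalized difference computed by `slant` is an assumption made for illustration.

```python
import re

# Hypothetical code phrases for illustration only; not the published lists.
REPUBLICAN_PHRASES = ["death tax", "illegal immigration"]
DEMOCRAT_PHRASES = ["estate tax", "civil rights"]


def count_phrases(text, phrases):
    """Count whole-phrase, case-insensitive occurrences of each phrase."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
        for phrase in phrases
    )


def slant(text):
    """Return (R - D) / (R + D), or None if no code phrase occurs.

    Positive values lean Republican and negative values lean Democrat
    under this toy definition.
    """
    rep = count_phrases(text, REPUBLICAN_PHRASES)
    dem = count_phrases(text, DEMOCRAT_PHRASES)
    total = rep + dem
    if total == 0:
        return None
    return (rep - dem) / total


sample = "Critics of the death tax clashed with defenders of the estate tax and of civil rights."
print(slant(sample))
```

Returning `None` for texts without any code phrase keeps "no signal" distinct from "balanced", which matters when scores are averaged across many articles.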

Yet answering "Why did Wikipedia become less biased over time?" is more challenging. They classify explanatory variables into three groups: attention and editing; dispersion of contributions; and article features. The narrow interpretation of Linus' Law would make attention and editing the only relevant feature (not supported by their data), while a broader interpretation would also take dispersion into account (weak support from their data). While both the number of revisions and the number of editor usernames are statistically significant, they work in opposite directions. Pageviews, while also statistically significant, are unavailable before February 2007. They also suggest questions for further work, including improvements to their revision sampling (they "divide [each article's] revisions into ten revisions of equal length") and overall sampling method (which uses the same techniques as their earlier work).

Navigating conceptual maps of Wikipedia language editions

A paper from this year’s Conference on Human Factors in Computing Systems (CHI 2012) entitled "Omnipedia: Bridging the Wikipedia Language Gap"[15] presents the features of Omnipedia, a system that enables readers to analyse up to 25 language editions of Wikipedia simultaneously. The study also includes a review of the challenges that the architects faced in building the Omnipedia system, as well as the results of initial user testing. According to the authors, language barriers produce a silo effect across the encyclopedias, preventing users from being able to access content unique to different language editions. Omnipedia, they write, reduces the silo effect by enabling users to navigate different concepts (over 7.5 million of them) from up to 25 language editions of Wikipedia, highlighting similarities and differences in an interactive visualization that shows which concepts different editions mention and how each of those topics is discussed.

The authors provide the example of the English Wikipedia article on conspiracy theory, showing how it discusses many topics – from “Moon landing” to “Kennedy assassination”. Other language editions contain articles on the same concept, including Verschwörungstheorie in the German Wikipedia and teoria conspirativa in the Spanish Wikipedia. Omnipedia consolidates these articles into a single "multilingual article" on conspiracy theories, showing which topics are discussed in only one language edition and which are discussed in multiple language editions.
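The consolidation idea can be sketched with plain sets. This is not Omnipedia's implementation: it assumes the hard part, aligning the same concept across languages, has already been done, and which edition mentions which topic below is invented for illustration (including the two placeholder topics).

```python
# Hypothetical input: topics mentioned by each edition's article on one concept,
# already aligned across languages.
topics_by_edition = {
    "en": {"Moon landing", "Kennedy assassination", "Placeholder topic A"},
    "de": {"Moon landing", "Kennedy assassination"},
    "es": {"Moon landing", "Placeholder topic B"},
}


def consolidate(topics_by_edition):
    """Map each topic to the sorted list of editions that mention it."""
    merged = {}
    for edition, topics in topics_by_edition.items():
        for topic in topics:
            merged.setdefault(topic, []).append(edition)
    return {topic: sorted(eds) for topic, eds in merged.items()}


merged = consolidate(topics_by_edition)
single_edition = sorted(t for t, eds in merged.items() if len(eds) == 1)
multi_edition = sorted(t for t, eds in merged.items() if len(eds) > 1)

print(single_edition)
print(multi_edition)
```

Keeping the list of editions per topic, rather than only a count, is what lets a reader see not just that a topic is unique to one edition but which edition it comes from.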

The paper concludes with the results of user testing, showing how the volume of single-language topics was "a revelation to the majority of users" but also how users targeting concepts they thought might reveal differences in perspective (for example on "Climate scepticism" or the "War on the Terror") actually had fewer differences than anticipated. The authors conclude by highlighting their contributions to this area of study, including a system that for the first time allows simultaneous access to large numbers of Wikipedia language editions – powered by several new algorithms that they assert “preserve diversity while solving large-scale data processing issues” – and a demonstration of the value of Omnipedia to user analysis of concepts explored in different language editions.



  1. Collier, B., & Bear, J. (2012). Conflict, criticism, or confidence. Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW ’12 (p. 383). New York, New York, USA: ACM Press. (closed access)
  2. Nagar, Y. (2012). What do you think?: the structuring of an online community as a collective-sensemaking process. Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW ’12. New York, New York, USA: ACM Press. (open access)
  3. Zube, P., Velasquez, A., Ozkaya, E., Lampe, C., & Obar, J. (2012). Classroom Wikipedia participation effects on future intentions to contribute. Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW ’12 (p. 403). New York, New York, USA: ACM Press. (closed access)
  4. Wright, Erik Olin (2012). Writing Wikipedia Articles as a Classroom Assignment. ASA Newsletter (Teaching Sociology), February 2012. (open access)
  5. Keegan, Brian, Darren Gergle, and Noshir Contractor (2012). Do Editors or Articles Drive Collaboration? Multilevel Statistical Network Analysis of Wikipedia Coauthorship. In 2012 ACM Conference on Computer Supported Cooperative Work (CSCW '12). (open access)
  6. Zhu, H., Kraut, R., & Kittur, A. (2012). Effectiveness of shared leadership in online communities. Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW ’12 (p. 407). New York, New York, USA: ACM Press. (open access)
  7. Rzeszotarski, J., & Kittur, A. (2012). Learning from history. Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW ’12 (p. 437). New York, New York, USA: ACM Press. (closed access)
  8. Zhu, H., Kraut, R., & Kittur, A. (2012). Organizing without Formal Organization: Group Identification, Goal Setting and Social Modeling in Directing Online Production. Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW ’12 (p. 935). New York, New York, USA: ACM Press. (open access)
  9. Solomon, J., & Wash, R. (2012). Bootstrapping wikis. Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW ’12 (p. 261). New York, New York, USA: ACM Press. (open access)
  10. Antin, J., Cheshire, C., & Nov, O. (2012). Technology-mediated contributions. Editing Behaviors Among New Wikipedians. Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work - CSCW ’12 (p. 373). New York, New York, USA: ACM Press. (open access)
  11. Keegan, Brian C. (2012). Breaking news on Wikipedia: Dynamics, structures, and roles in high-tempo collaboration. In Proceedings of the ACM 2012 conference on Computer Supported Cooperative Work Companion - CSCW '12. New York, New York, USA: ACM Press. (closed access)
  12. Miletic, Dušan (2012). Europe According to English Wikipedia. Open-sourcing the Discourse on Europe. Masters Thesis, Jagiellonian University. (open access)
  13. Dejean, Sylvain, and Nicolas Jullien (2012). Enrolled Since the Beginning: Assessing Wikipedia Contributors' Behavior by Their First Contribution. SSRN Electronic Journal. (open access)
  14. Zhu, Feng, and Shane Greenstein (2012). Collective Intelligence and Neutral Point of View: The Case of Wikipedia. (open access)
  15. Bao, Patti, Brent Hecht, Samuel Carton, Mahmood Quaderi, Michael Horn, and Darren Gergle (2012). Omnipedia: Bridging the Wikipedia Language Gap. In: Proc. CHI 2012. (open access)
  16. Chen, Mike (2011). Taxonomy Extraction from Wikipedia. Masters Thesis, Ohio University. (open access)
  17. García, Renato Domínguez, Philipp Scholl, and Christoph Rensing (2011). Supporting Resource-based Learning on the Web using automatically extracted Large-scale Taxonomies from multiple Wikipedia versions. Advances in Web-based learning - ICWL 2011, LNCS 7048. (open access)
  18. Haralambous, Yannis, and Vitaly Klyuev (2012). Wikipedia Arborification and Stratified Explicit Semantic Analysis. Computation and Language (January 30, 2012): 13. (open access)
  19. Yasseri, Taha; Sumi, Robert; Rung, András; Kornai, András; Kertész, János (2012). Dynamics of conflicts in Wikipedia. Physics and Society; Data Analysis, Statistics and Probability. ArXiV (February 16, 2012). (open access)
  20. Park, Namkee, Hyun Sook Oh, and Naewon Kang (2012). Factors influencing intention to upload content on Wikipedia in South Korea: The effects of social norms and individual differences. Computers in Human Behavior 28(3), May 2012: 898-905. (closed access)
  21. Skaggs, Bradley Alan (2011). Topic Modeling for Wikipedia Link Disambiguation. Masters Thesis, University of Maryland. (open access)
  22. Rothfels, John, Brennan Saeta, and Emin Topalovic (2011). A recommendation engine for Wikipedia articles based on constrained training data. (open access)

Discuss this story


Interesting report again this month. Thanks. Pine (talk) 04:43, 28 February 2012 (UTC)

Are there automated bots working currently which are similar to Betacommand? Also, is there a way that we can communicate to the people who are using all these automated tools with negative messages? Or maybe restructure the bots themselves to make them more gentle? II | (t - c) 04:20, 4 March 2012 (UTC)

Comment. Concerning "Gender gap connected to conflict aversion and lower confidence among women". The first reason listed: "Female Wikipedia editors are less likely to contribute to Wikipedia due to the high level of conflict involved in the editing, debating, and defending process." I believe this follows along with the recent Village Pump discussion I initiated concerning the creation of a separate noticeboard for dealing with admin misconduct. It has now finished. It is enlightening to read the full discussion. Admins do little to stop conflict. In fact many admins create or escalate conflict due to their misconduct. Wikipedia is not researching this from the top down. For a summary and a link to the discussion: User:Timeshifter/Unchecked admin misconduct. --Timeshifter (talk) 05:40, 5 March 2012 (UTC)

Comment response: I do agree that there is surely conflict instigated by people in Wikipedia with power roles. That's what power can do to people - corrupt, as the old saying goes. While not all admins are like that, people can abuse their powers. As a female editor I surely do avoid certain areas of Wikipedia because of fear of conflict. I've surely become paranoid about my contributions, thanks to Wikipedia. So I can only imagine how others with less experience than I feel. SarahStierch (talk) 14:14, 5 March 2012 (UTC)


The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0