The Signpost

Recent research

Napoleon, Michael Jackson and Srebrenica across cultures, 90% of Wikipedia better than Britannica, WikiSym preview

By Taha Yasseri, Han-Teng Liao, Piotr Konieczny, Jonathan Morgan and Tilman Bayer

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Multilingual ranking analysis: Napoleon and Michael Jackson as Wikipedia's "global heroes"

An arXiv preprint titled "Highlighting entanglement of cultures via ranking of multilingual Wikipedia articles"[1], authored by a group of physicists from France, examines Wikipedia articles about individuals and their position in the hyperlink network of each Wikipedia language edition. Nine language editions are studied. The authors try to locate the most "important" individuals ("heroes") in each language edition by calculating link-based ranking scores: PageRank, CheiRank, and 2DRank, which combines the two. After compiling lists of the 30 highest-ranked individuals in each language edition, they investigate the overlaps between the lists and introduce the notions of local and global "heroes". The lists of "global heroes" are topped by Napoleon for PageRank and Michael Jackson for 2DRank. The paper shows that both local and global heroes exist: global heroes gain their central position in the network through links from multiple other central nodes, whereas local heroes are notable mostly because of the large number of links pointing directly to them. Finally, based on the nationality (language of origin) of the highly ranked individuals, a network of languages is constructed, and the position of each language in this network is analysed by calculating rank scores. The authors also analyzed the fields of activity of these individuals and found that politicians and scientists are quite often among the most important ones.
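For readers unfamiliar with the two basic scores, the following is a minimal sketch, not the authors' code. It assumes the usual textbook definitions: PageRank computed by power iteration with a damping factor of 0.85, and CheiRank computed as PageRank on the same network with every link reversed. The four-node graph and the function names are hypothetical and chosen only for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank on a dict {node: [nodes it links to]}."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives the same "random jump" share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Split this node's score evenly among its outgoing links.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node without outgoing links spreads its score over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank


def reverse_links(links):
    """Return the same graph with the direction of every link inverted."""
    reversed_links = {}
    for source, targets in links.items():
        reversed_links.setdefault(source, [])
        for target in targets:
            reversed_links.setdefault(target, []).append(source)
    return reversed_links


def cheirank(links, **kwargs):
    """CheiRank, taken here as PageRank of the link-reversed graph."""
    return pagerank(reverse_links(links), **kwargs)


# Hypothetical toy network: A links out a lot, D is linked to a lot.
toy_links = {"A": ["B", "C", "D"], "B": ["D"], "C": ["D"], "D": []}
pr = pagerank(toy_links)
cr = cheirank(toy_links)
print("Top PageRank node:", max(pr, key=pr.get))
print("Top CheiRank node:", max(cr, key=cr.get))
```

In this toy network the heavily linked-to node D tops the PageRank list, while the node with the most outgoing links, A, tops the CheiRank list. This mirrors the intuition that PageRank rewards incoming links and CheiRank rewards outgoing ones.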

Art: Image-sharing relationship between 154 language versions of Wikipedia (from the DMI Summer School 2013)

Wikipedia as Cultural Reference: Srebrenica Massacre, Art and Menstruation

Art: Concept-sharing relationship between eight selected language versions of Wikipedia (from the DMI Summer School 2013)
Editor's note: the contributing editor of this section, Han-Teng Liao, participated in the DMI Summer School 2013, but is not affiliated with the DMI or the University of Amsterdam.

The chapter "Wikipedia as Cultural Reference" in Richard A. Rogers' book "Digital Methods"[2] can be read as an example of "digital methods" applied to Wikipedia, or as a contribution to the emerging literature that compares the same or similar encyclopedia articles across the language versions and cultures of the global Wikipedia projects. Not to be confused with "big methods", "virtual methods", etc.[3], the Digital Methods Initiative (DMI) is a school of Internet researchers at the University of Amsterdam, led by Rogers, that aims to 'create a platform to display the tools and methods to perform research that ... take advantage of "web epistemology"'. So far the DMI has built some basic Wikipedia research tools that help social scientists analyze cross-lingual images, anonymous edits, tables of contents, etc. Thus, as part of Rogers' research agenda of advocating "digital methods", the Wikipedia projects become both a data set and a set of analytical devices that can be repurposed for social research: "as a cultural reference, a vigilant community, a scandal machine and a controversy diagnostic machine"[4].

Self-defined as "cultural research with Wikipedia", this chapter compares the Srebrenica articles (The Fall of Srebrenica, the Srebrenica Massacre, and the Srebrenica Genocide) across six language versions: Dutch, English, Bosnian, Croatian, Serbian, and Serbo-Croatian. Using various kinds of data, ranging from creation dates, edits by interlanguage article editors and the top ten editors to the numbers of victims, tables of contents, referenced websites and images used, the findings show that the principle of neutral point of view does not automatically make Wikipedia articles universal (or at least similar) across language versions. The differences, especially those specific to the wiki medium, can be used for cultural analysis of the selected topics. The resulting content is found to reflect the dynamics between power editors who defend their sources and content using Wikipedia policies. Among these "umbrella articles", the English version is highly contested among many interlanguage editors, while the Serbo-Croatian version, which has very few editors, is much softer and more unifying.

A visualisation of the Wikipedia-related images on menstruation articles across different language editions (from the DMI Summer School 2013)

Adopting and extending the digital methods, two groups of participants at the DMI summer school 2013 examined the cross-language-version differences on two topics: art and menstruation. The "Cross Lingual Art Spaces on Wikipedia" project (by Sangeet Kumar, Garance Coggins, Sarah Mc Monagle, Stephan Schlögl, Han-Teng Liao, Michael Stevenson, Federica Bardelli, and Anat Ben-David) sought to find the universal and specific articulations of the concept of art through (1) images and (2) concepts (i.e. strongly related articles), producing an image network visualization for 154 language versions and a concept network visualization for eight selected language versions. A Wikidata scraping tool was developed to identify different names for the same content for the process called "concept reference disambiguation".

The second project, "Menstruation Across Cultures Online" (by Astrid Bigoni, Loes Bogers, Zuzana Karascakova, Emily Stacey and Sarah Mc Monagle) looked at the cultural differences of Wikipedia images and Google autocomplete suggestions to find associated images and search queries. In addition, the English version of the article on menstruation was compared with other English-language sources such as Urban Dictionary and Twitter, producing an interesting cross-platform comparative tag cloud. While not full research articles, the research outcomes of the two projects nonetheless demonstrated the potential directions for cross-cultural and cross-platform comparison, when Wikipedia projects are compared among themselves or with other online platforms that contain user-generated content and/or activities.

Decline of adminship candidatures on Polish Wikipedia

A conference paper titled "Does the Acquaintance Relation Close up the Administrator Community of Polish Wikipedia?"[5] investigates why the Polish Wikipedia's community of administrators is growing more slowly than expected, as measured by a decrease in successful RfAs. The paper presents a useful literature review of related academic work on RfA, and is a welcome study of the under-researched population of editors at non-English Wikipedias. It seems to focus on the computer science dimension, with a developed statistics section but little discussion of theory. In this reviewer's opinion, it would have been stronger if the authors had engaged with more social science theory, such as the iron law of oligarchy.

The authors suggest at first that such a decline may occur because administrators are chosen on the basis of acquaintance, thus creating a closed group which people lacking the right connections cannot join. Later, they conclude that this is unlikely, instead pointing to growing expectations of new candidates. Both of those would be valid hypotheses, but neither is clearly tied to any theory or previous study. The authors' analysis of the data is problematic; at one point they contradict themselves, noting that "[One of the observed phenomena] could indicate, however, that the community is closing up after all", although their conclusion later states: "Our conclusion is that it cannot be claimed with certainty that the Polish Wikipedia community is closing up."

The authors also misunderstand how the WP:RFA process works on English Wikipedia, noting that one of the key differences between Polish and English Wikipedia is voting, as in "in the case of English version of Wikipedia, new administrators are elected not by voting, but by discussion". That the authors are ready to take such policy claims at face value does cast a little doubt on the applicability of their findings.

Overall, the paper presents some interesting statistical data on trends in an understudied community, and contributes to our understanding of the governance of Wikipedia. The analysis of the data is, however, rather lacking, particularly in its weak ties to the literature on leadership, volunteer motivation and related social science areas.

90% of Wikipedia articles have "equivalent or better quality than their Britannica counterparts" in blind expert review

A Portuguese-language dissertation at the University of Évora, titled "Colaboração em Massa ou Amadorismo em Massa?" ("Mass collaboration or mass amateurism?")[6] compared the quality of English Wikipedia with that of Encyclopaedia Britannica. As summarized in English on the author's blog, a representative random sample of 245 article pairs from both encyclopedias was generated, and "reformatted to hide [their] source and then graded by an expert in its subject area using a five-point scale. We asked experts to concentrate only on some [...] intrinsic aspects of the articles' quality, namely accuracy and objectivity, and discard the contextual, representational and accessibility aspects. Whenever possible, the experts invited to participate in the study are University teachers, because they are used to grading students' work not using the reputation of the source." They rated "90% of the Wikipedia articles ... as having equivalent or better quality than their Britannica counterparts".

First WikiSym 2013 papers available

The annual WikiSym research conference is taking place in Hong Kong from August 5 to 7. Since June, the organizers have been featuring the abstracts of the conference's papers on the conference blog, with online publication of full texts planned for August 5. But several authors have already made their papers available elsewhere:

Survey participation bias analysis: More Wikipedia editors are female, married or parents than previously assumed

The fact that Wikipedia's editing community has a huge gender gap (with vastly more male than female editors contributing to the encyclopedia) was first brought to wider attention by a 2008 survey of Wikipedia readers and editors, whose results were published by UNU-MERIT and the Wikimedia Foundation in 2010. It found that only 17.8% of US-based editors were female, and 12.7% globally. As reported in the Signpost at the time, some concerns were voiced about the possible impact of participation bias on the results (an effect which is frequent in volunteer web surveys), for example because the survey had also found a gender gap in Wikipedia readers (39.9% female in the US), in contrast to other research which estimated the gender ratio among readers closer to 50%.

A new PLoS ONE paper titled "The Wikipedia Gender Gap Revisited: Characterizing Survey Response Bias with Propensity Score Estimation"[12] has made it possible for the first time to quantify this participation bias for the subset of US-based editors. Using a method of propensity adjustment for web surveys first published in a 2011 statistical paper, the authors compare the 2008 survey with Pew Research data from around the same time, which is assumed to be free of the same kind of bias because it was based on a different methodology (a phone survey), and which had found 49.0% of US Wikipedia readers to be female. The authors write: "We estimate that the proportion of female US adult editors was 27.5% higher than the original study reported (22.7%, versus 17.8%), and that the total proportion of female editors was 26.8% higher (16.1%, versus 12.7%)." Likewise, they find evidence that the proportion of editors who are "married, or parents, [had] been underestimated, while the proportions of immigrants and students [had] been overestimated."
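Note that the "27.5% higher" and "26.8% higher" figures in the quotation are relative increases over the originally reported shares, not percentage-point differences. A small check of that arithmetic, using only the four percentages quoted above (the function name is ours, for illustration):

```python
def relative_increase(original, adjusted):
    """Percentage by which `adjusted` exceeds `original`."""
    return (adjusted - original) / original * 100.0


# Shares of female editors as quoted from the paper (in percent).
us_increase = relative_increase(original=17.8, adjusted=22.7)
global_increase = relative_increase(original=12.7, adjusted=16.1)

print(f"US editors: {us_increase:.1f}% higher")       # prints "US editors: 27.5% higher"
print(f"All editors: {global_increase:.1f}% higher")  # prints "All editors: 26.8% higher"
```

In percentage points, the adjustments are smaller: 4.9 points for US editors and 3.4 points overall.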

The authors emphasize that their results do not negate the existence of the gender gap in general ("the basic takeaways in regards to the underrepresentation of women in the WMF/UNU-MERIT survey remain intact"), and actually call for "the Wikimedia Foundation's strategic goal to increase female editorship to 25% [...] to be raised in light of these adjusted estimates." They observe that their method is not applicable to the three subsequent editor surveys conducted by the Wikimedia Foundation in 2011/12 (the most recent one by this reviewer), because they focused solely on editors, and therefore the necessary reader comparison data (e.g. the data from Pew Research surveys) is not available. Still, the paper's results will definitely have a positive impact on the research efforts by the Foundation and others to better understand the demographics of the Wikipedia editing community.



  1. ^ Young-Ho Eom, Dima L. Shepelyansky: Highlighting entanglement of cultures via ranking of multilingual Wikipedia articles
  2. ^ Rogers, Richard A. (2013). "Wikipedia as Cultural Reference". Digital methods. Cambridge, Massachusetts; London: The MIT Press. pp. 165–202. ISBN 9780262018838. (Note: a previous, freely accessible version of this chapter was a conference paper for the Wikipedia Academy Deutschland 2012.)
  3. ^ For the five methodological views on the implications of digitization for social research, see Marres, Noortje (2012). "The redistribution of methods: on intervention in digital social research, broadly conceived". The Sociological Review. 60: 139–165. doi:10.1111/j.1467-954X.2012.02121.x. ISSN 1467-954X. Retrieved 2013-07-28. (Note: a PDF file can be accessed via the author's university website.)
  4. ^ See a slideshow for the DMI 2013 summer school by Erik Borra on Repurposing Wikipedia
  5. ^ Justyna Spychała, Piotr Turek, Mateusz Adamczyk: "Does the Acquaintance Relation Close up the Administrator Community of Polish Wikipedia? Analysing Polish Wikipedia Administrator Community with use of Multidimensional Behavioural Social Network"
  6. ^ Fernando Silvério Nifrário Rodrigues: Colaboração em Massa ou Amadorismo em Massa? Um Estudo Comparativo da Qualidade da Informação Científica Produzida Utilizando os Conceitos e Ferramentas Wiki. Universidade de Évora, 2012 English synopsis
  7. ^ Kwan Hui Lim, Amitava Datta and Michael Wise: A Preliminary Study on the Effects of Barnstars on Wikipedia Editing. PDF WikiSym '13, Aug 05-07 2013, Hong Kong
  8. ^ Brian C. Keegan: A History of Newswork on Wikipedia: WikiSym '13, Aug 05-07 2013, Hong Kong
  9. ^ Brian C. Keegan, Arber Ceni, Marc A. Smith: "Analyzing Multi-Dimensional Networks within MediaWikis". PDF. WikiSym '13, Aug 05-07 2013, Hong Kong
  10. ^ Morten Warncke-Wang, Dan Cosley, John Riedl: Tell Me More: An Actionable Quality Model for Wikipedia. WikiSym '13, Aug 05-07 2013, Hong Kong PDF
  11. ^ Taha Yasseri, Giovanni Quattrone, Afra Mashhadi: Temporal Analysis of Activity Patterns of Editors in Collaborative Mapping Project of OpenStreetMap. WikiSym '13, Aug 05-07 2013, Hong Kong PDF
  12. ^ Benjamin Mako Hill, Aaron Shaw: "The Wikipedia Gender Gap Revisited: Characterizing Survey Response Bias with Propensity Score Estimation" PLoS ONE Volume: 8, Issue: 6, DOI:10.1371/journal.pone.0065782
  13. ^ Taraborelli, Dario (July 30, 2013). "Researching collaboration for a better world: John T. Riedl (1962 – 2013)". Wikimedia Blog. Retrieved July 31, 2013.
  14. ^ Aaltonen, Aleksi; Kallinikos, Jannis (2012). "Coordination and Learning in Wikipedia: Revisiting the dynamics of exploitation and exploration" (PDF). Research in the Sociology of Organizations. Emerald Group Publishing Limited. Retrieved July 31, 2013.
  15. ^ Liao, Han-Teng (July 30, 2013). "Chinese conditions on user-generated content and online encyclopedias: press-friendly background materials". Oxford Internet Institute Blog. Retrieved July 31, 2013.
  16. ^ De Rosnay, Melanie (2013). Peer production online community infrastructures. First Conference on Internet Science. The FP7 European Network of Excellence in Internet Science. Retrieved July 31, 2013.
  17. ^ Donna Lind Infeld and William C. Adams: Wikipedia as a Tool for Teaching Policy Analysis and Improving Public Policy Content Online. Journal of Public Affairs Education / JPAE 19 (3), 445–459 (Summer 2013) PDF
  18. ^ Sarah Hernandez, Natalie Rector: Becoming Digital Citizens: Using Wikipedia to Enhance the Classroom PDF
  19. ^ Amy Roth, Rochelle Davis, Brian Carver: Assigning Wikipedia editing: Triangulation toward understanding university student engagement
  20. ^ Lars Bremer and Kalman Graffi: "Symbiotic Coupling of P2P and Cloud Systems: The Wikipedia Case". In: IEEE ICC'13: Proc. of the IEEE International Conference on Communications. PDF
  21. ^ Andrew G. West: Damage detection and mitigation in open collaboration applications. Dissertation in Computer and Information Science, University of Pennsylvania, May 2013
  22. ^ Maik Anderka: Analyzing and Predicting Quality Flaws in User-generated Content: The Case of Wikipedia
  23. ^ Jing-Woei Li et al.: The NGS WikiBook: a dynamic collaborative online training effort with long-term sustainability. Briefings in Bioinformatics, doi:10.1093/bib/bbt045
  24. ^ Xavier Ochoa, Gladys Carrillo, Ana Casali, Claudia Deco, Valeria Gerling: Analysis of Existing Technological Platforms for the Collaborative Production of Open Textbooks
  25. ^ Chandra Sekhar Bhagavatula, Thanapon Noraset, Doug Downey: Methods for Exploring and Mining Tables on Wikipedia. IDEA’13, August 11th, 2013, Chicago, IL, USA. PDF
  26. ^ M Matuschek, CM Meyer, I Gurevych: "Multilingual Knowledge in Aligned Wiktionary and OmegaWiki for Translation Applications"
  27. ^ Heather Ford: Onymous, pseudonymous, neither or both? Ethnography Matters blog, June 27, 2013
  28. ^ Claudia Müller-Birn, Leonhard Dobusch, James D. Herbsleb: "Work-to-Rule: The Emergence of Algorithmic Governance in Wikipedia" C&T '13 June 29 - July 02 2013, Munich, Germany PDF


Discuss this story



Wavelength (talk) 15:47, 2 August 2013 (UTC)
  • >>The authors' analysis of the data is problematic; at one point they contradict themselves, noting that "[One of the observed phenomena] could indicate, however, that the community is closing up after all" although later their conclusion states "Our conclusion is that it cannot be claimed with certainty that the Polish Wikipedia community is closing up.".<< How is this a contradiction? It looks simply like the authors first make a hypothesis and then later come to the conclusion that it was false. --MF-W 23:36, 2 August 2013 (UTC)


The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0