The Signpost

Recent research

How Wikipedia built governance capability; readability of plastic surgery articles

By Piotr Konieczny, Leeza Rodriguez and Tilman Bayer

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

How Wikipedia built governance capability, 2001–2009

This paper[1] looks at the topic of Wikipedia governance in the context of online social production, which is contrasted with traditional, contract-bound, hierarchical production models that characterize most organizational settings. Building on the dynamic capabilities theory, the authors introduce a new concept, "collective governance capability", which they define as "the capability of a collective arrangement to steer a production process and an associated interaction system". The authors ask the research question, "How does a collective governance capability to create and maintain value emerge and evolve in online social production?"

Figure from the paper: "The number of monthly contributors and the number of contributor clusters in the English Wikipedia from January 2001 to December 2009."
  1. Quantitative analysis: The authors processed a dump of the full history of the English Wikipedia's first nine years. For each of the 108 months from January 2001 to December 2009 and for each editor, that editor's activity was described by the following numbers: "the number of edits and pages edited, median [Levenshtein] edit distance and article length change, the number of reverted edits, and reverts done [..., in] four namespaces: encyclopedia articles, article talk pages, policies and guidelines, and policies and guidelines talk pages". A cluster analysis was then performed for each month to group editors with similar editing behavior. The authors report:
    "we identify a slow initiation period followed by a period of extremely rapid growth, and, finally, levelling out and a slight decline. In the first phase, there is only a minimal differentiation of contributors into clusters. The second phase of exponential growth is characterized by increasing differentiation of contributors, while the number of clusters stabilizes in the third phase. The statistics provide only a very rough depiction of a complex system, but they certainly suggest that, whatever governance mechanisms have been in place, they have had to deal with dramatically different circumstances over the years."
  2. Qualitative analysis: Building on the three phases identified via descriptive statistics, the authors construct a "theoretical narrative ... [using] a highly selective representation of empirical material that advances the plot of capability-building". It covers the history of policies, processes and events such as IAR, 3RR, FAR, the bot policy, flagged revisions, the 2005 Nature study comparing Wikipedia's quality with Britannica's, and the Seigenthaler affair of the same year.
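
The per-month, per-editor activity description in step 1 can be sketched roughly as follows. This is an illustration only, not the authors' pipeline: the `Edit` record, its field names, and the `monthly_features` helper are hypothetical, only a subset of the listed features is computed, and the clustering step is left out because the summary above does not say which algorithm the paper used.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import median


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


@dataclass
class Edit:
    """Hypothetical edit record; not the actual schema of a Wikipedia dump."""
    editor: str
    month: str       # e.g. "2005-11"
    namespace: str   # e.g. "article", "article_talk", "policy", "policy_talk"
    page: str
    old_text: str
    new_text: str
    was_reverted: bool = False


def monthly_features(edits):
    """Group edits by (month, editor, namespace) and summarise each group."""
    groups = defaultdict(list)
    for e in edits:
        groups[(e.month, e.editor, e.namespace)].append(e)
    features = {}
    for key, group in groups.items():
        features[key] = {
            "edits": len(group),
            "pages": len({e.page for e in group}),
            "median_edit_distance": median(
                levenshtein(e.old_text, e.new_text) for e in group),
            "median_length_change": median(
                len(e.new_text) - len(e.old_text) for e in group),
            "reverted_edits": sum(e.was_reverted for e in group),
        }
    return features


# Made-up sample data, purely for demonstration.
sample = [
    Edit("Alice", "2005-11", "article", "Kitten", "kitten", "sitting"),
    Edit("Alice", "2005-11", "article", "Cat", "cat", "cats", was_reverted=True),
    Edit("Bob", "2005-11", "policy", "IAR", "", "Ignore all rules"),
]
features = monthly_features(sample)
print(features[("2005-11", "Alice", "article")])
```

Each resulting feature dictionary could then serve as one input row for whatever per-month clustering method is chosen.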

The researchers note that Wikipedia governance has changed significantly over the years, becoming less open and more codified, which they seem to regard as a positive change. The authors' main conclusions are as follows. First, governance can itself be a dynamic, evolving process. Second, new kinds of governance mechanisms make it possible to create significant value by harnessing knowledge resources that would be very difficult to seize through a market or corporate system. Third, the lack of a contractually sanctioned governance framework means that people have to learn to deal directly with each other through peer-based interaction and informal agreements, which in turn creates opportunities for self-improvement through learning. Fourth, these new governance models are constantly evolving, so their structure is fluid and difficult to describe; they may be better understood as changing combinations of different, semi-independent governance mechanisms that complement one another. Finally, the authors stress the importance of technology in making these new models of governance possible.

Readability of plastic surgery articles examined

The readability of online patient materials on plastic surgery topics was recently assessed by teams from Beth Israel Medical Center at the Harvard Medical School. Readability scores are generally expressed as a grade level: higher grade levels indicate that the content is more difficult to read. According to the authors, "nearly half of American adults have poor or marginal health literacy skills and the NIH (National Institute of Health) and AMA (American Medical Association) have recommended that patient information should be written at the sixth grade level". The aim of their research was to calculate readability scores for the most popular web pages displaying procedure information and to compare the results with the sixth-grade reading level recommendation.


The core author group published two papers, "Online Patient Resources for Liposuction"[2], in Annals of Plastic Surgery, and "Assessment of Online Patient Materials for Breast Reconstruction"[3], in the Journal of Surgical Research. The first paper concentrated on the topics of "liposuction" and "tattoo information", while the second focused solely on "breast reconstruction". Readability scores were assessed in both papers, but the breast reconstruction paper added an analysis of ‘complexity’ and ‘suitability’ to evaluate reading level more comprehensively.

For each search term, the websites selected for analysis were the top 10 links returned by a Google search query. The authors treated these as the 10 most common websites for that term.

Illustration from the liposuction article

Results and conclusions

The authors concluded that online patient information for ‘liposuction’ and ‘breast reconstruction’ is ‘too difficult’ for many patients to read, as the readability scores of all 20 websites (10 for each term) far exceed a 6th-grade reading level. The average score for the most popular ‘liposuction’ websites was at the 13.6 grade level. By comparison, ‘tattoo information’ scored at the 7.8 grade level.

Health care information available at the most popular websites for ‘breast reconstruction’ had an average readability score of 13.4, with 100% of the top 10 websites providing content far above the recommended 6th-grade reading level. […] readability scores aligned at the higher end of the range for both terms, with scores above the 14th-grade level for ‘liposuction’ and above grade 15 for ‘breast reconstruction’.

When other metrics such as ‘complexity’ and ‘suitability’ were applied to the breast reconstruction websites, the content appeared to be friendlier to less educated readers. Complexity analysis using PMOSE/iKIRSCH yielded an average score at the 8th–12th grade level. To capture how images and typography enhance readability, the breast reconstruction paper also employed the SAM ‘suitability’ formula, which gives weight to the contribution that images, bulleted lists, subheadings, and video make to the readability of content. By this metric, 50% of the websites were ‘adequate’. […] was found to be ‘unsuitable’, along with […]

In conjunction with the ‘readability score’, the PMOSE and SAM metrics helped to achieve a more comprehensive view of patients' ability to read and comprehend the breast reconstruction material.

Liposuction paper methodology

After articles from the 10 websites with liposuction content were stripped of images and videos, the plain text content was analyzed using ten established readability formulas: Coleman–Liau, Flesch–Kincaid, Flesch reading ease, FORCAST, Fry graph, Gunning fog, New Dale–Chall, New Fog count, Raygor estimate, and SMOG. All readability formulas in this paper relied on some combination of word length, syllable count, word complexity, and sentence length. Longer words and longer sentences yield higher reading levels. Similarly, words of three or more syllables increase the grade-level readability scores. These text-based readability scores do not account for the impact that images or graphics have on readers.
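
To illustrate how such formulas combine sentence length and syllable counts, here is a minimal sketch of two of the ten, the Flesch–Kincaid grade level and the Gunning fog index, using their standard published coefficients. It is not the software used in the paper. The syllable counter is a crude vowel-group heuristic (an assumption made for this example; real tools typically use dictionaries or more careful rules), the "complex word" test simply means three or more heuristic syllables, and the two sample sentences are invented.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Treat a trailing silent "e" as non-syllabic ("home" -> 1),
    # but keep a final "-le" ("table" -> 2).
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str):
    """Return (sentences, words, syllables, words with 3+ syllables)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = [count_syllables(w) for w in words]
    complex_words = sum(1 for s in syllables if s >= 3)
    return len(sentences), len(words), sum(syllables), complex_words


def flesch_kincaid_grade(text: str) -> float:
    """0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59"""
    n_sent, n_words, n_syll, _ = text_stats(text)
    return 0.39 * n_words / n_sent + 11.8 * n_syll / n_words - 15.59


def gunning_fog(text: str) -> float:
    """0.4 * ((words / sentences) + 100 * (complex words / words))"""
    n_sent, n_words, _, n_complex = text_stats(text)
    return 0.4 * (n_words / n_sent + 100 * n_complex / n_words)


# Invented sample sentences, not taken from the studied websites.
simple = "The doctor will take out fat. You can go home the same day."
technical = ("Tumescent liposuction necessitates subcutaneous infiltration "
             "of anesthetic solution before aspiration.")

for label, text in [("simple", simple), ("technical", technical)]:
    print(label, round(flesch_kincaid_grade(text), 1), round(gunning_fog(text), 1))
```

Because both formulas depend only on counts of sentences, words, and syllables, long technical terms push the score up regardless of how well a reader actually understands them.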

To compare readability scores with those for a procedure ‘similar’ to liposuction, the authors performed the same type of analysis on the term ‘tattoo information’. Not surprisingly, the query for ‘tattoo information’, a simpler procedure, yielded content with an average readability score at the 7.8 grade level.

Based on this wide gap of 5.8 grade levels in readability scores between ‘liposuction’ and ‘tattoo’ literature, the authors pose the question, “So why is this (tattoo) information significantly easier to read than liposuction?” The authors do present good example strategies for rewriting some liposuction content at lower reading levels. However, they do not convincingly explain why the two procedures should have similarly low readability levels. The average education level of the target audiences for "liposuction" and "tattoo information" is not well documented in the paper, and it is questionable whether the two are equal.

According to ASPS statistics, 50% of liposuction patients are over 40 years old. Are 50% of the people seeking tattoos over age 40? While age does not equal reading level, it may certainly give a hint.

Furthermore, the authors downplay the complexity of the liposuction procedure in comparison to tattooing. Liposuction is an invasive procedure performed by a credentialed surgeon and anesthesiologist under IV or general anesthesia in an accredited outpatient surgery center. The names of the tools, equipment, and anesthetics used in the technique are not simple, common words.

Unlike surgeons, tattoo artists do not require any type of formal medical training or certification. The tattoo procedure does not involve the complexities of pre-operative clearance, fat extraction, fluid and electrolyte regulation, anesthesia administration, or vital sign monitoring. As a result, a description of the liposuction procedure is bound to be longer and more technical, and likely requires a higher reading level than a description of tattooing.

One consideration that is not discussed by these and other published authors evaluating online content readability is the fact that Google uses the Dale–Chall and Flesch–Kincaid readability formulas in its Penguin algorithm. However, rather than punishing high (difficult) readability scores, the algorithm is thought to punish low grade-level readability scores. In 2013, the UK analytics company MathSight determined[supp 1] that the Penguin algorithm penalized websites with low grade-level readability scores. After the MathSight finding, many SEO experts concluded that Google favors content written at a higher educational level.

In light of this, and given the typical methodology of obtaining the data set from Google’s top 10 links, one must question whether Google would ever rank a medical content website with a grade 6 readability score higher than a website with a grade 13 readability score. Perhaps even more importantly, most website publishers want what Google wants. Competition is fierce for a spot in the top 10 links. Therefore, as long as online content publishers believe that Google favors well-written, well-researched, sophisticated content, it might be a tough sell to persuade medical content publishers to simplify their content to a sixth-grade reading level.


Other recent publications

A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.


  1. ^ Aaltonen, Aleksi; Lanzara, Giovan Francesco (2015-06-09). "Building Governance Capability in Online Social Production: Insights from Wikipedia". Organization Studies. 36 (12): 1649–1673. doi:10.1177/0170840615584459. ISSN 1741-3044. S2CID 1629212.
  2. ^ Vargas, Christina R.; Ricci, Joseph A.; Chuang, Danielle J.; Lee, Bernard T. (February 2015). "Online Patient Resources for Liposuction: A Comparative Analysis of Readability". Annals of Plastic Surgery. 76 (3): 349–354. doi:10.1097/SAP.0000000000000438. ISSN 0148-7043. PMID 25695442. S2CID 6726621.
  3. ^ Vargas, Christina R.; Kantak, Neelesh A.; Chuang, Danielle J.; Koolen, Pieter G.; Lee, Bernard T. (2015). "Assessment of Online Patient Materials for Breast Reconstruction". Journal of Surgical Research. 199 (1): 280–286. doi:10.1016/j.jss.2015.04.072. ISSN 0022-4804. PMID 26088084.
  4. ^ Hara, Noriko; Doney, Jylisa (2015-05-19). "Social construction of knowledge in Wikipedia". First Monday. 20 (6). doi:10.5210/fm.v20i6.5869. ISSN 1396-0466.
  5. ^ Heilman, James M; West, Andrew G (2015-03-04). "Wikipedia and Medicine: Quantifying Readership, Editors, and the Significance of Natural Language". Journal of Medical Internet Research. 17 (3): e62. doi:10.2196/jmir.4069. ISSN 1438-8871. PMID 25739399.
  6. ^ John Thomas Oliver (2015-02-09). "One-shot Wikipedia: an edit-sprint toward information literacy". Reference Services Review. 43: 81–97. doi:10.1108/RSR-10-2014-0043. ISSN 0090-7324.
  7. ^ Alexander Hogue, Joel Nothman and James R. Curran. 2014. Unsupervised biographical event extraction using wikipedia traffic. In Proceedings of Australasian Language Technology Association Workshop, pages 41–49.
  8. ^ Egle, Jonathan P.; Smeenge, David M.; Kassem, Kamal M.; Mittal, Vijay K. (April 2015). "The Internet School of Medicine: Use of electronic resources by medical trainees and the reliability of those resources". Journal of Surgical Education. 72 (2): 316–320. doi:10.1016/j.jsurg.2014.08.005. ISSN 1878-7452. PMID 25487347.
  9. ^ Jankowski-Lorek, Michal; Ostrowski, Lukasz; Turek, Piotr; Wierzbicki, Adam (2014). "Wikipedia Knowledge Community Modeling". In Reda Alhajj; Jon Rokne (eds.). Encyclopedia of Social Network Analysis and Mining. Springer New York. pp. 2410–2420. doi:10.1007/978-1-4614-6170-8_269. ISBN 978-1-4614-6169-2.
  10. ^ Sajadi, Armin; Milios, Evangelos E.; Kešelj, Vlado; Janssen, Jeannette C. M. (2015). "Domain-Specific Semantic Relatedness from Wikipedia Structure: A Case Study in Biomedical Text". In Alexander Gelbukh (ed.). Computational Linguistics and Intelligent Text Processing. Lecture Notes in Computer Science. Vol. 9041. Springer International Publishing. pp. 347–360. doi:10.1007/978-3-319-18111-0_26. ISBN 978-3-319-18110-3.
  11. ^ Herbert, Verena G.; Frings, Andreas; Rehatschek, Herwig; Richard, Gisbert; Leithner, Andreas (2015-03-06). "Wikipedia – challenges and new horizons in enhancing medical education". BMC Medical Education. 15 (1): 32. doi:10.1186/s12909-015-0309-2. ISSN 1472-6920. PMC 4384304. PMID 25879421.
  12. ^ Yasseri, Taha (25 April 2015). "Coverage of European parties in European language Wikipedia editions". Can social data be used to predict elections?.
  13. ^ Tran, Khoi-Nguyen; Christen, Peter; Sanner, Scott; Xie, Lexing (2015-05-19). "Context-Aware Detection of Sneaky Vandalism on Wikipedia Across Multiple Languages". In Tru Cao; Ee-Peng Lim; Zhi-Hua Zhou; Tu-Bao Ho; David Cheung; Hiroshi Motoda (eds.). Advances in Knowledge Discovery and Data Mining. Lecture Notes in Computer Science. Vol. 9077. Springer International Publishing. pp. 380–391. doi:10.1007/978-3-319-18038-0_30. ISBN 978-3-319-18037-3.
  14. ^ Alonso, Elisa (2015-02-13). "Google and Wikipedia in the professional translation process: a qualitative work". Procedia – Social and Behavioral Sciences. 32nd International Conference of the Spanish Association of Applied Linguistics (AESLA): Language Industries and Social Change. 3–5 April 2014, Seville, SPAIN. 173: 312–317. doi:10.1016/j.sbspro.2015.02.071. ISSN 1877-0428.
  15. ^ Romero, Daniel M.; Huttenlocher, Dan; Kleinberg, Jon (2015-03-25). "Coordination and efficiency in decentralized collaboration". arXiv:1503.07431 [cs.SI].
Supplementary references and notes:


The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0