This week's issue of the Signpost introduces an irregular section dedicated to summarizing recent academic research about Wikipedia and Wikimedia.
Meta-research: Trying to survey existing research literature on Wikipedia
Last month, Wikipedia researcher Finn Årup Nielsen published a draft of a survey paper titled "Wikipedia research and tools: Review and comments". He notes that "well over 1,000 reports have been published in the field" by now, making a complete review impossible, but still provides an extensive overview of publications regarding many different fields of Wikipedia research. A review on the blog of researcher Paolo Massa lists the covered fields and calls Nielsen's draft "a very useful 56-pages resource highlighting key areas of research for Wikipedia (with citations to relevant work already published). ... The cited papers (with annotations!) are 236! Even if this is draft paper, it is a super valuable resource!"
Also in March, a research group from Concordia University in Montreal, Canada announced that they were "conducting a systematic literature review on Wikipedia-related peer-reviewed academic studies published in the English language", starting with a database search that had "identified over 2,100 peer-reviewed studies that have 'wikipedia', 'wikipedian' or 'wikipedians' in their title, abstract or keywords. As this number of studies is far too large for conducting a review synthesis, we have decided to focus only on peer-reviewed journal publications and doctoral theses; we identified 625 such studies. In addition, we identified around 1,500 peer-reviewed conference articles". They updated the page Wikipedia:Academic studies of Wikipedia accordingly (bringing it to almost 1 MB in text, while a separate list of conference papers weighs 1.5 MB).
On the Wiki-research-l mailing list, the announcement gave rise to discussion about a possible shared database for Wikipedia literature review, for example using Acawiki or Zotero. It was pointed out that there had been earlier attempts that failed.
Pharmacological study criticizes reliability of Wikipedia articles about the top 20 drugs
A study titled "Reliability of Wikipedia as a medication information source for pharmacy students" (abstract) in this month's issue of the journal Currents in Pharmacy Teaching and Learning found the quality of Wikipedia articles on the 20 most frequently prescribed drugs lacking, concluding:
Wikipedia does not provide consistently accurate, complete, and referenced medication information. Pharmacy faculty should actively recommend against our students' use of Wikipedia for medication information and urge them to consult more credible drug information resources.
As in an earlier study, part of the criticism was based on differing expectations about what information should be included in such articles ("Categories most frequently absent were drug interactions and medication use in breastfeeding"). The article even explicitly acknowledged that one of the information categories whose absence it criticized, namely dosage information, was discouraged by the Wikipedia:Manual of Style (medicine-related articles). It nevertheless turned the fact that half of the articles fulfilled the study's requirement in that respect into additional criticism: "... our finding that 10 of the 20 articles included dosing information provides evidence for the lack of regulation of content on Wikipedia."
However, the paper's critical conclusion was also based on factual inaccuracies and the finding that "referencing was poor across all articles, with seven of the 20 articles not supported by any references." (As pointed out by WhatamIdoing, all of the articles currently contain multiple reliable sources. The quoted claim may have been intended to refer only to the parts of the articles that concerned the 20 information categories studied. Also, the paper does not state which versions of the Wikipedia articles were judged, apart from noting that they "were accessed on a single day". The above-mentioned Manual of Style page is cited using a permalink to a May 2007 version, to describe Wikipedia regulations "at the time this analysis was performed".)
The accuracy of information on Wikipedia was judged based on whether it agreed with package inserts, or, if the Wikipedia information was not present there, with certain databases. As an example of "inaccurate information that could lead to inappropriate use of medications and potential patient harm", it named the fact that the article on the diabetes drug metformin listed "lung disease as a contraindication, which is inaccurate per the Glucophage (Bristol-Myers Squibb, New York, NY) package insert. This inaccuracy could prompt a pharmacist to inappropriately recommend against the use of metformin, a medication shown to reduce mortality in the treatment of diabetes, in patients with asthma or chronic obstructive pulmonary disease. The metformin Wikipedia article also lists a higher serum creatinine for defining the contraindication of kidney disease compared with the package insert. This inconsistency could result in a recommendation to use metformin in a patient where it is contraindicated."
We could complain that Wikipedia articles are not written to give "patient information", and that therefore the study was misguided in comparing our articles "with information found in the manufacturer's package insert". ... Since there is only partial overlap between the purpose of an encyclopaedia and the purpose of patient or professional publications, any such comparison should take care to eliminate unreasonable expectations. ...
However, I can't disagree with the conclusion. I wouldn't want my builder consulting Wikipedia for mixing mortar, never mind my pharmacist using a source any fool can edit. Our drug articles are generally poor. The ratio of knowledgeable active editors to the number of drug articles is simply too small. ... A bigger project might be expected to target its activities at vital articles, but with the numbers we have, we can only really expect editors to make a decent fist of a topic that personally interests them. We need more editors.
Classifying newbies and veterans as experts, gnomes, vandal fighters or social networkers
In February, a paper titled "Finding social roles in Wikipedia" (abstract, earlier, incomplete online draft) won a Best Paper award at iConference 2011, an annual gathering of US information scholars and practitioners. The seven researchers from Cornell University and other institutions first use a qualitative approach to identify an initial set of potential social roles of Wikipedia contributors, and then aim to characterize these by "quantitative signatures", derived from a dump of the English Wikipedia comprising edits until October 2006. They arrive at four "key roles" (not meant to be exhaustive or mutually exclusive). Relying on an initial sample of 40 hand-picked and hand-classified editors, they propose quantitative criteria that are based on the distribution of edits across different namespaces (divided into six categories: "content" [i.e. articles and images], "content talk", "user", "user talk", "wikipedia", and "infrastructure" [the rest, e.g. templates or categories]):
"Substantive experts" who "contribute by providing substantive content to article pages" and "invest time in fact checking and article talk to discuss details of articles". Characterized by having between 30 and 80% content edits, and the non-content edits being "<30% to Infrastructure, >45% in content talk and Wiki combined; and >25% content talk".
"Technical editors" who focus on small fixes and improvements, e.g. "spelling, grammar, hyperlink format, out of date facts, links to other language editions of Wikipedia," or categorization, similar to what Wikipedians call WikiGnomes. Characterized by having more than 60% content edits, and the non-content edits being ">45% in Wiki and Infrastructure combined, and <25% content talk."
"Counter vandalism editors" who "find vandalized articles, correct them, and sanction vandals." Characterized by having more than 60% content edits, and the non-content edits being "<25% content talk, >30% user and usertalk combined, and >20% Wiki pages." The "surprisingly high rates of edits to the User and User Talk namespaces" is explained by the practice of blocking admins to place a message on the blocked user's page.
"Social networkers" who "build strong ties with other users through channels other than article collaboration". They "create elaborate profiles that showcase their Wikipedia personalities", often containing "many Userboxes, small snippets of self-identifying information including interests, group membership, and personal characteristics. Social networkers often participate in projects that can be seen as community-building", e.g. the Birthday Committee, the Welcoming committee and "parts of the now defunct 'Esperanza' project whose goal was to strengthen the Wikipedia community." Characterized by having less than 45% content edits, and the non-content edits being in "content talk less than 25%, greater than 45% user and user talk combined, greater than 25% wiki pages."
The study then uses these formal criteria (admitting that they "are quite primitive and imprecise") to classify two larger samples of editors, one consisting of 1954 "long-term dedicated" Wikipedians (defined as having made edits both in or before January 2004, and in January 2005), and the other of "new" editors, defined as all 5839 users who created an account and made at least one edit in January 2005. The share of Social networkers was very small in the "new" cohort and even smaller among the "dedicated" editors (1% vs. 0.5%), a finding the authors explain by the fact that "using Wikipedia for social networking was actually a relative new development in 2006". The other three roles were all found somewhat more often among the "dedicated" editors (32% vs. 28% for Substantive experts, 11% vs. 10% for Technical editors, 7% vs. 5% for Vandal fighters). Addressing concerns about the sustainability of Wikipedia's community voiced in 2005 by Eric Goldman (cf. recent Signpost coverage), the authors state "it seems that potential role players are arriving and developing at a rate that is more than sufficient to supplement and grow the current population", a clear indication that the paper's underlying data is somewhat outdated when compared, for example, to the WMF's recent Editor Trends Study.
Another section draws some informal conclusions from users' social graphs, as defined by their edits of other users' talk pages. Example: "At the most general level, technical editors and vandal fighters have similarly sparse local networks, while the social networkers and substantive experts’ networks show larger community structures". However, social networkers differ from substantive experts in that the former "are likely to develop user talk networks that only include friends who are similar to themselves, or other folks that they run into in the backstage". Also, technical editors and counter vandalism editors were said to share some social network attributes with what has been called "answer people" in a study of Usenet participants, while social networkers were similar to "discussion people".
Overview of the BLP problem features analysis of subject's participation in deletion discussion
Presented at the same conference was a paper titled "Handling Flammable Materials: Wikipedia Biographies of Living Persons as Contentious Objects". It gives a rich overview of the history of the controversies about BLPs (conceptualizing them as "contentious objects") on the English Wikipedia, from the 2005 Seigenthaler affair and the Daniel Brandt controversies (Signpost coverage) to more recent community discussions about BLPs, such as when "Users Scott MacDonald and Lar began a campaign in January 2010 to delete unsourced and inadequately sourced BLP articles" (citing this diff as evidence for the "consternation of other Wikipedia editors" that it caused), and the introduction of "sticky prod" (proposed deletions) for BLPs soon afterwards, noting that the latter involved "470 editors contribut[ing] over 200,000 words of discussion". The paper names four different ways in which organizations can manage risk in general, and classifies Wikipedia's response to the BLP problem according to them:
Wikipedia's notability guideline is cited as an example of the second kind of strategy, risk minimization.
The third approach, threat management, is exemplified by the BLP policy itself (which the authors describe as "clear guidelines for editors and writers of BLP articles [written] with such a threatening attitude that they will feel compelled to follow them", noting that its introduction "is written in forceful imperative tones"), and its subpage containing advice for BLP subjects.
Lastly, there is impact containment, defined as "the development of procedures to minimize the damage once conflict occurs", for which the authors examine the AfD process for BLPs.
A statistical analysis of "257 nominations of articles for deletion where the subject of the article expressed an interest in whether or not the article is kept" (in 190 cases preferring deletion, in 63 preferring that the article should be kept, and voicing ambiguous opinions in 4 cases) found that "if the subject’s interests are stated in the nomination for the article deletion, whether or not the expressed interest is for deletion or retention of the article, [...] the article is 78% more likely to be kept". However, "the subject’s open vote in the discussion concerning whether or not to keep the article in Wikipedia, does not have an impact on the outcome of the AfD process [...], suggesting that stating one’s preference without arguing about it and, therefore without creating open conflict, lets the Wikipedia editorial community address the AfD through more minor threat reduction methods." The authors interpret this as follows: "The subject of an article who gets directly involved in the AfD discussion realizes the Contentious Object potential and forces the community to turn to conflict containment strategies that are more defensive and more conforming to community policy and less reflective of the subject’s desires."
Muhammad cartoons debate dominated by appeals to precedent, impact and relevance
A third paper from iConference 2011, titled "Lifting the veil: the expression of values in online communities" (abstract), contains "a case study of a polarized talk page debate", namely the controversy about whether the article Jyllands-Posten Muhammad cartoons controversy should be illustrated with an image of the cartoons themselves. The three authors from the University of Washington applied the hierarchy of values framework to a sample of 314 discussion threads, containing 2785 individual postings, randomly selected from 6094 postings made on the article's talk page from January 28th, 2006 to February 25th, 2006. Computer-mediated discourse analysis (CMDA) was used to classify "the stance expressed by the post author (at the post level)", i.e. whether they argued for the inclusion of the cartoons (55%), against it (13%) or for some kind of compromise (24%), and the "types of appeals the [Wikipedia] author uses to argue their case (at the sentence or utterance level)", from a list of ten such types, e.g. "Policy: cartoons should be retained or removed based on the explicit policies of Wikipedia", or legal arguments. The three most frequently used appeals were to impact (effect of the inclusion of the cartoons on Wikipedia and elsewhere, used in 20% of the postings), precedent (both on Wikipedia and elsewhere, e.g. illustrations in the Muhammad and Piss Christ articles, or the decision of some newspapers in Arab countries to reprint the cartoons and of CNN not to reproduce them; 18%), and the relevance of the cartoons to the article (also 18%).
To the authors, this suggests that the participants in the debate "in general recognized a common set of values for Wikipedia article content", although there was disagreement about the relative priority of these (the impact appeal was number one among "against" and "compromise" postings, but only fourth among "for" postings, behind the appeal to "the stated or implied identity, mission or purpose of Wikipedia"), and "that on Wikipedia, making the correct type of appeal is crucial both to persuading other editors to agree to a decision and to enforcing that decision".
The authors' own stance in the debate becomes apparent in the introduction and the conclusions, where they argue that the decision to include the images went against the goal of "multicultural inclusivity" which they see implied in "Wikipedia’s stated ideological commitment to equal access and global empowerment", but also (somewhat contradictorily) criticize "invocations of Wikipedia’s core values" in the cartoons debate because they "only served to increase polarization and defeat attempts at compromise." Venturing beyond their actual empirical findings, they warn that "without additional mechanisms for resolving cultural controversies, Wikipedia risks losing access to the valuable knowledge assets of a potentially large number of contributors and may also have trouble succeeding in its mission of being a true 'encyclopedia for everyone.'"
How high school students assess credibility of Wikipedia articles: A student essay from the University of Twente ("Appraise this, appraise that : everyday Wikipedia credibility assessments of high school students and university students") contains the results of a think-aloud study. According to the abstract (the full text does not seem to have been published), "we found that the three most important features for high school students are text, pictures and appearance of the articles." In addition, the high school students' credibility assessments were found to differ from those of older students (as observed in an earlier study at the same university), both generally and with regard to the attention paid to the criterion of references. Last year, another student thesis from the University of Twente had also examined high school students' judgements of Wikipedia articles, with similar methodology and conclusions.