A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
CSCW '14 retrospective
The 17th ACM Conference on Computer-Supported Cooperative Work and Social Computing (CSCW '14) took place this month in Baltimore, Maryland.[supp 1] The conference brought together more than 500 researchers and practitioners from industry and academia presenting research on "the design and use of technologies that affect groups, organizations, communities, and networks." Research on Wikipedia and wiki-based collaboration has been a major focus of CSCW in the past. This year, three papers on Wikipedia were presented:
The rise of alt.projects in Wikipedia. Jonathan Morgan from the Wikimedia Foundation and collaborators from the University of Washington analyzed the nature of collaboration in alternative WikiProjects, i.e. projects that the authors identify as not following "the conventional pattern of coordinating a loosely defined range of article creation and curation-related activities within a well defined topic area" (examples of such alternative WikiProjects include the Guild of Copy Editors or WikiProject Dispute Resolution). The authors present an analysis of editing activity by members of these projects that are not focused on topic content editing. The paper also reports data on the number of contributors involved in WikiProjects over time: while the number of editors participating in conventional projects decreased by 51% between 2007 and 2012, participation in alternative projects only declined by 13% in the same period and saw an overall 57% increase in the raw number of contributions.
Categorizing barnstars via Mechanical Turk. Paul Andre and collaborators from Carnegie Mellon University presented a study showing how to effectively crowdsource a complex categorization task by assigning it to users with no prior knowledge or domain expertise. The authors selected a corpus of Wikipedia barnstars and showed how different task designs can produce crowdsourced judgments where Mechanical Turk workers accurately match expert categorization. Expert categorization was obtained by recruiting two Wikipedians with substantial editing activity as independent raters.
Understanding donor behavior through email. A team of researchers from Yahoo! Research, the Qatar Computing Research Institute and UC Berkeley analyzed two months of anonymized email logs to understand the demographics, personal interests and donation behavior of individuals responding to different fundraising campaigns. The data include donation emails from the Wikimedia Foundation, and the results indicate that, among the campaigns studied, email from a wikimedia.org domain had the highest ratio of messages tagged as spam to total messages read, which the authors attribute to spoofing. The paper also indicates that the Wikimedia fundraiser tends to attract slightly more male than female donors.
Building on the research streams of rating editors by content persistence and algorithmically finding cliques of editors, Nakamura, Suzuki and Ishikawa propose a sophisticated tweak to find like- and disparate-minded editors, and test it against the Japanese Wikipedia. The method works by finding cliques in a weighted graph connecting all editors of an article, with edges weighted by the agreement or disagreement between editors. To measure the agreement between two editors, they iterate through the full edit history and apply the content persistence axioms, interpreting edits that leave text unchanged as agreement and edits that delete text as disagreement. Because leaving text unchanged is not always a strong indication of agreement, they normalize each action by its frequency for both the source editor and the target editor. That is, the method accounts for the propensity of an editor to change text, and the propensity of editors to have their text changed.
To verify their method, the authors compare its results to a simplified weighting scheme, random clustering, and human-clustered results on seven articles in the Japanese Wikipedia. In six out of seven articles, the proposed technique beats simplified weighting. One example they present is the detection of pro- and anti-nuclear editors on the Nuclear Power Plant article. One possible application of such detection would be a gadget that colours the text of an article depending on which editor group wrote it.
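The frequency-normalized agreement weighting described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the input format (a list of `(source, target, action)` observations), the function name `edge_weights`, and the specific normalization (dividing each action by the product of the source editor's and the target editor's relative frequency for that action) are assumptions made here for illustration, and the subsequent clique search is not shown.

```python
from collections import Counter, defaultdict


def edge_weights(interactions):
    """Hypothetical sketch: signed, frequency-normalized editor-pair weights.

    interactions: iterable of (source, target, action) tuples, where the
    source editor either kept ("keep") or deleted ("delete") text written
    by the target editor. Source and target are assumed to differ.
    """
    interactions = list(interactions)

    # How often each editor performs, and receives, each kind of action.
    performed = Counter()
    performed_total = Counter()
    received = Counter()
    received_total = Counter()
    for source, target, action in interactions:
        performed[(source, action)] += 1
        performed_total[source] += 1
        received[(target, action)] += 1
        received_total[target] += 1

    weights = defaultdict(float)
    for source, target, action in interactions:
        # Relative frequency of this action for the source and the target.
        p_source = performed[(source, action)] / performed_total[source]
        p_target = received[(target, action)] / received_total[target]
        # Assumed normalization: actions that are rare for either editor
        # count more. Keeping text is agreement (+), deleting it is
        # disagreement (-).
        sign = 1.0 if action == "keep" else -1.0
        pair = tuple(sorted((source, target)))
        weights[pair] += sign / (p_source * p_target)
    return dict(weights)
```

In this sketch, a deletion by an editor who rarely deletes, aimed at an editor whose text is rarely deleted, produces a strongly negative edge, while routine keeps contribute only mildly positive weight.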
Monthly research showcase launched
The Wikimedia Foundation's Research & Data team announced its first public showcase, a monthly review of work conducted by researchers at the Foundation. Aaron Halfaker presented a study of trends in newcomer article creation across 10 languages with a focus on the English and German Wikipedias (slides). The study indicates that in wikis where anonymous users can create articles, their articles are less likely to be deleted than articles created by newly registered editors. Oliver Keyes presented an analysis of how readers access Wikipedia on mobile devices and reviewed methods to identify the typical duration of a mobile browsing session (slides). The showcase is hosted at the Wikimedia Foundation every third Wednesday of the month and live streamed on YouTube.
Study of AfD debates: Did the SOPA protests mellow deletionists?
A paper titled "What influences online deliberation? A wikipedia [sic] study" studies rationales used by participants in deletion discussions, in the larger context of democratic online deliberation. The authors reviewed in detail deletion discussions for a total of 229 articles, listed for deletion on three dates, one of them being January 15th, 2012, three days before the English Wikipedia's global blackout as part of the Wikipedia:SOPA initiative. The authors looked into whether this event would influence rationales of the deletion discussions and their outcome. They also reviewed, in less detail, a number of other deletions from around the time of the SOPA protest. The authors display a good knowledge of relevant literature, including that in the field of Wikipedia studies, presenting an informative literature review section.
Overall, the authors find that the quality of the discussions is high, as most of the participants display knowledge of Wikipedia's policies, particularly on the notability and credibility (or what we would more likely refer to as reliability) of the articles whose deletion is considered. Among the rationales, notability far outweighs the second most frequent one, credibility (reliability). They confirm that the deletion system works as intended, with decisions following the majority of voters.
Interestingly, the authors find that certain topics did tend to trigger more deletion outcomes, said topics being articles about people, for-profit organizations, and definitions. In turn, they observe that "locations or events are more likely to be kept than expected, and articles about nonprofit organizations and media are more likely to be suggested for other options (e.g., merge, redirect, etc.) than expected". Discussions about people and for-profit organizations were more likely to be unanimous than expected, whereas articles about nonprofit organizations, certain locations, or events were more likely to lead to a non-unanimous discussion. Regarding the SOPA protests' influence on deletion debates, the authors find a small and short-lived increase in keep decisions following the period of community mobilization and discussion about the issue, and tentatively attribute this to editors being impacted by the idea of Internet freedom and consequently allowing free(er) Internet publishing.
The authors sum up those observations, noting that "the community members of Wikipedia have clear standards for judging the acceptability of a biography or commercial organization article; and such standards are missing or less clear when it comes to the topics on location, event, or nonprofit organization ... Thus, one suggestion to the Wikipedia community is to make the criteria of judging these topics more clear or specific with examples, so it will alleviate the ambiguity of the situation". This reviewer, as a participant of a not insignificant number of deletion discussions as well as those about the associated policies, agrees with said statement. With regards to the wider scheme, the authors conclude that the AfD process is an example of "a democratic deliberation process interested in maintaining information quality in Wikipedia".
Word frequency analysis identifies "four conceptualisations of femininity on Wikipedia"
In a linguistics student paper at Lund University, the author reviews the linguistic conceptualisation of femininity on (English) Wikipedia, with regard to whether language used to refer to women differs depending on the type of articles it is used in. Specifically, the author analyzed the use of five lexemes (a term which in the context of this study means words): ladylike, girly, girlish, feminine and womanly. The findings confirm that the usage of those terms is non-accidental. The word feminine, most commonly used of the five studied, correlates primarily to the topics of fashion, sexuality, and to a lesser extent, culture, society and female historical biographies. The second most popular is the word womanly, which in turn correlates with topics of female artists, religion and history. Girlish, the fourth most popular word, correlates most strongly with the biographies of males, as well as with the articles on movies and TV, female entertainers, literature and music. Finally, girly and ladylike, respectively 3rd and 5th in terms of popularity, cluster together and correlate to topics such as movies and TV (animated), Japanese culture, art, tobacco and female athletes. Later, the author also suggests that there is a not insignificant overlap in usage between the cluster for girlish and the combined cluster for girly and ladylike. He concludes that there are three or four different conceptualisations of femininity on Wikipedia, which in simpler terms means, to quote the author, that "people do indeed represent women in different ways when talking about different things [on Wikipedia]", with "girly and girlish having a somewhat frivolous undertone and womanly, feminine and ladylike being of a more serious and reserved nature".
The study does suffer from a few issues: the literature review could be more comprehensive (the paper cites only six works, and not a single one of them from the field of Wikipedia studies), and this reviewer did not find sufficient justification for why the author limited himself to the analysis of only 500 occurrences (total) of the five lexemes studied. A further discussion of how the said 500 cases were selected would likely strengthen the paper.
Wikipedia and the development of academic language
Ursula Reutner’s article “Wikipedia und der Wandel der Wissenschaftssprache” discusses Wikipedia's linguistic norms and style as a case study of the development of academic language.
The article is divided into three main sections. After providing some historical context about Wikipedia and the history of encyclopedias (section 1), the article focuses on linguistic norms in Wikipedia and their relation to linguistic norms in academic language (section 2). Reutner identifies five crucial linguistic norms in Wikipedia: (1) non-personal language such as the avoidance of first- and second-person pronouns, (2) neutral language as expressed in the policy of a “neutral point of view”, (3) avoidance of redundancies, (4) avoidance of unnecessarily complex wording, and (5) focus on simple syntax and the use of short independent clauses. Although Reutner mentions many well-known differences between Wikipedia and traditional forms of academic writing (e.g. the dynamic, collaborative, and partly non-academic character of Wikipedia), she stresses that the policies of Wikipedia largely follow traditional norms of academic writing.
The third section focuses on case studies of Wikipedia articles (mostly fr:Euro and it:Euro) and finds a large variety of norm violations that suggest a gap between linguistic norms and actual style in Wikipedia. Reutner's examples of biased, clumsy, and long-winded formulations hardly come as a surprise as these quality issues are well-known topics in Wikipedia research[supp 2]. However, Reutner's analysis is not limited to quality problems but also addresses further interesting features of Wikipedia articles. For example, she points out that Wikipedia differs from many print encyclopedias in Romance languages such as the Grande Dizionario Enciclopedico (1964) or the Enciclopedia Treccani (2010) through a focus on accessibility as illustrated by the use of copular sentences at the beginning of articles and the repetition of crucial ideas and terms. Furthermore, Reutner argues that Wikipedia differs from other forms of academic writing through narrative elements and a generous use of space.
Reutner's findings raise general questions regarding the relation between Wikipedia and the development of academic language and her short conclusion makes three suggestions: First, Wikipedia's policies largely follow traditional norms of academic writing. Second, the digital, collaborative, and partly non-academic character of Wikipedia leads to “emotional and dialogic elements that are surprising in the tradition of encyclopedias” (p.17). Third, the focus on accessibility follows an Anglo-American tradition of academic writing (even in the Italian and French language versions). Although Reutner's conclusions seem well-justified, they leave the question open whether Wikipedia reflects or even influences the general development of academic language. For example, one may argue that many of Reutner's findings are effects of the partly non-academic character of Wikipedia and therefore not representative of the development of academic language. Other linguistic features are arguably effects of collaborative text production and it would be interesting to compare Reutner's findings with other collaborative and non-collaborative forms of academic writing. Finally, one may worry that some of Reutner's findings are artifacts of a small and biased sample. For example, Reutner only considers articles (de:Euro, en:Euro, es:Euro, fr:Euro, and it:Euro) that are created by large and diverse author groups but does not discuss more specialized articles that usually only have one or two main authors. However, it is well-known that the style and quality of Wikipedia articles depend on variables such as group size and group composition[supp 3] and diverse forms of collaboration patterns[supp 4]. It would therefore be interesting to discuss Reutner's linguistic findings in the context of a more diverse sample of Wikipedia articles.
Wikipedia's assessability. A paper to be presented at the upcoming Conference on Human Factors in Computing Systems (CHI '14) by Forte, Andalibi, Park, and Willever-Farr introduces a vocabulary for "assessable design". Their framework considers social and technological approaches to information literacy in combination with consumption and production. From interviewing Wikipedians, librarians, and novices about their understanding of Wikipedia articles, the authors identify two important concepts of assessable design: provenance and stewardship. The authors then test these concepts in an experiment, finding that exposing readers to these can have large effects on their assessment of not only articles but Wikipedia as a whole. Considering whether their framework can be generalized to the assessability of content on other informational websites, the authors caution that "Wikipedia is a remarkably conservative resource given its reputation as a renegade reference. Policies surrounding citation defer to well-established publishing processes like scientific peer review and traditional journalism and prohibit the production of personalized content."
"Finding missing cross-language links in Wikipedia" is the title of a paper in the Journal of Information and Data Management. Using a combination of feature extraction and a decision tree classifier, the authors seek to discover missing inter-language links (ILL) between the English and Portuguese Wikipedia editions. The authors hypothesise that there are roughly 165,000 missing ILLs in each of the Wikipedias, but do not appear to take previous research on the overlap of Wikipedia content into consideration.[supp 5] Two novel features are introduced: category linking and ILL transitivity. Performance is evaluated using a dataset of known connected and disconnected articles where the French, Italian, and Spanish Wikipedias are used as intermediate languages for discovering link transitivity. Category linking is identified as a useful way of discovering candidate articles for linking, while link transitivity is the key feature for correctly identifying links. Today, Wikidata's central repository of ILLs makes link transitivity mostly a moot problem, but that is not addressed by the authors.
"Spillovers in Networks of User Generated Content". A discussion paper by economists at the Centre for European Economics Research (ZEW) reports an analysis of content curation and consumption under spikes of attention. The authors analyzed 23 examples of pages that underwent a sudden surge of attention, either because they were featured on the main page of the German Wikipedia, or because of a real-world news event (e.g. earthquakes). The result is that an increased exposure predictably leads to increase of both consumption and curation on neighbouring pages, as measured in terms of page requests (for consumption) and edits (for curation), though the author reports that content generation is small in absolute terms.
New papers on the use of Wikipedia in education, by practitioners. In a Portuguese-language conference paper, Brazilian Wikipedian and professor Juliana Bastos Marques "presents an experience with critical reading and edition of Portuguese Wikipedia articles in the university, in extension activities, conducted at the Federal University of Rio de Janeiro State (Unirio), in 2012", according to the English abstract. In an essay for the sociology journal Contexts, Wikipedian and sociologist (and contributor to other parts of this research newsletter) Piotr Konieczny, who has also made Wikipedia the subject of his own teaching, discusses the benefits of Wikipedia use in academia, citing the view that "a primary reason for academic reservations about Wikipedia is [a] philosophy of knowledge based on the control and management of intellectual capital".
"World’s largest study on Wikipedia: Better than its reputation" is the title of the Helsinki Times' English-language summary of a study of the Finnish Wikipedia's reliability, carried out by journalists and published in the Finnish newspaper Helsingin Sanomat.. Participating researcher Arto Lanamäki explained on the Wiki-research-l mailing list that the superlative referred to the fact that the study had "the biggest sample of articles (134) of all studies that have assessed Wikipedia content quality/credibility." Not too dissimilar to the approach of the landmark Nature study from 2005, the authors recruited "a university-level researcher with knowledge on the subject matter to be an evaluator" for each article in their sample. As summarized by the Helsinki Times, the result was that "the Finnish Wikipedia is largely error-free. The lack of errors is the area in which Wikipedia clearly got its best score. ... No less than 70 per cent of the articles were judged to be good (4) or excellent (5) with respect to lack of errors. According to the indicative evaluation scale a four means that the article has only 'scattered small errors, no big ones'." (See also earlier coverage of studies that systematically evaluate the reliability of Wikipedia articles: "Pilot study about Wikipedia's quality compared to other encyclopedias", "90% of Wikipedia articles have 'equivalent or better quality than their Britannica counterparts' in blind expert review")
^ e.g. Arazy, O., Nov, O., Patterson, R., & Yeo, L. (2011). Information quality in Wikipedia: The effects of group composition and task conflict. Journal of Management Information Systems, 27(4), 71-98.
^ Liu, J., & Ram, S. (2009, December). Who does what: Collaboration patterns in the Wikipedia and their impact on data quality. In 19th Workshop on Information Technologies and Systems (pp. 175-180).