"Employing Wikipedia for good not evil" in education; using eyetracking to find out how readers read articles: Current research about Wikimedia projects.
This paper[1] is a good example of how to write articles for the "teaching with Wikipedia" field. The authors report their positive experiences with several undergraduate and postgraduate classes at the University of Sydney, which developed articles such as pregnancy vegetarianism, Cleo (magazine) and Slave Labour (mural). They describe a number of assignments and assessment criteria in some detail, and discuss the benefits that their Wikipedia assignments have for the community (improving valuable and underrepresented content) and for the students themselves (improving their writing, research and collaboration skills). The paper could benefit from a more comprehensive literature review, however. While it describes a useful set of educational activities, and rather well at that, these are not groundbreaking: practically all of the activities discussed in this paper have been covered in the peer-reviewed literature by others. Unfortunately, the authors fail to cite many of these related works (I count only about five citations to other peer-reviewed works from the much larger field of teaching with Wikipedia). Furthermore, the authors seem unaware of the Wikipedia:Education Program. None of their courses so far appear to have been registered on Wikipedia; sadly, they have no on-wiki homepage allowing identification of all edited articles or participating students, and it is also unclear whether the instructors themselves have Wikipedia accounts. This suggests a failure both on the part of the researchers, who spent years reading about, researching and engaging with the teaching-with-Wikipedia approach without realizing there is a major support infrastructure in place to assist them, and on the part of the Wikipedia community and the Education Program itself, which is clearly still neither visible enough nor active enough to identify and reach out to educators who have been engaged in several years of ongoing teaching on Wikipedia.
Hopefully in the future we can integrate those and other educators into our framework better.
Using eyetracking to find out how Wikipedia articles are being read
Screenshot of eyetracking software (not from the papers discussed here)
Researchers from the University of Regensburg in Germany have used eyetracking methods to find out which article elements readers focus on while searching for information on Wikipedia, depending on the nature of the search task (factual information lookup, learning, or casual reading—a classification taken from a 2006 article[supp 1] about exploratory search in general).
In two 2012 articles[2][3] the researchers summarized the methodology and results of one of their lab experiments with 28 participants, which besides eyetracking also incorporated data from survey questionnaires, browser logs and electromyography for two facial muscles that indicate emotional reactions (the corrugator and the zygomaticus major). Among the results of this first study (see also a related paper in English with illustrations explaining the various article elements[4]):
During lookup tasks, tables and graphical representations were preferred, but illustrative/decorative images were almost never looked at (as the authors point out, their test question, about the number of passengers on the Titanic, focused on textual information). On the other hand, "in 'learn' tasks users concentrate more on the introduction and lists. In the 'casual leisure' area, many different content elements are used." [this and other quotes have been translated from German]
Users tend to skim the article during lookup tasks, but read more text parts in the other tasks.
According to a post-task survey, user satisfaction in both the lookup and learn tasks was independent of the number of images.
A subsequent German-language PhD thesis[5] (see also a 2012 conference poster) contains much more detail, e.g. reporting that in "lookup" tasks readers spend more than 45% of their time scanning the table of contents and lists in the article, while in "learn" tasks these elements account for less than 10% of the time.
A second PhD thesis, covered in a brief paper[6] last year, examined for example which elements readers look at first within an article (from an experiment involving 163 German Wikipedia articles and 90 participants who were asked to prepare themselves for a course on the history of Bavaria in the 20th century, i.e. a "learning" overview task): The table of contents was the most frequent entry point (36%), followed by the lead section (31%) and the text body itself. The author further observes that "the article heading and images serve less often as entry point. The text heading [presumably the first section heading after the lead] and image captions very rarely occur as points of first contact".
Another publication[7] by the same author focused on "users' interaction with pictorial and textual contents ...[ The spread] of information within the articles and the relation between text and images are analyzed. ... By now 30 articles have been analyzed according to this scheme. [Within these, there] are 639 contact points leading to images. Results show that 39% of all contact points lead from image to image, in mutual directions (previous or next). All text contact points [e.g. citations] sum up to a total of 37%. In 5% of all cases, an introduction triggers a saccade to an image. The remaining types of contact points occur rather rarely."
A later overview article[8] summarizes other aspects in less detail, e.g.:
More experienced readers used the table of contents less often.
Overall, search strategies did not differ a lot between the "learning" and casual reading ("non-work-based") tasks. But there were statistically significant differences to the information seeking behavior in fact lookup tasks. The largest differences concerned the consumption of text, images and TOC (cf. above). Readers also spent a larger ratio of time navigating compared to analyzing content.
(For an overview of other new data sources shedding light on how readers navigate within articles, see also this reviewer's recent tech talk at the Wikimedia Foundation, and a research overview page on Meta about the question "Which parts of an article do readers read?")
Other recent publications
An analysis used Wikipedia to rank Jimi Hendrix as the most influential rock guitarist
A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.
"Political Advertising on the Wikipedia Market Place of Information"[9] From the abstract: "Wikipedia’s popularity and reputation give politicians incentives to use it for enhancing their online appearance effectively and tailored towards their constituency. [...] we assemble data covering editing activity for articles on all 1,100 members of the German parliament (MPs) for the three last legislatures. We find editing to be a persistent phenomenon that is practiced by a substantial amount of MPs and is growing throughout election years."
"Identifying missing dictionary entries with frequency-conserving context models"[10] From the abstract: "Upon training our model with the Wiktionary—an extensive, online, collaborative, and open-source dictionary that contains over 100,000 phrasal-definitions—we develop highly effective filters for the identification of meaningful, missing phrase-entries. With our predictions we then engage the editorial community of the Wiktionary and propose short lists of potential missing entries for definition, developing a breakthrough, lexical extraction technique, and expanding our knowledge of the defined English lexicon of phrases."
"Population automation: An interview with Wikipedia bot pioneer Ram-Man"[11] From the abstract: ".... an in-depth interview with Wikipedia user Ram-Man, [...] creator of the rambot, the first mass-editing bot. Topics discussed include the social and technical climate of early Wikipedia, the creation of bot policies and bureaucracy, and the legacy of rambot and Ram-Man's work."
"Mining Wikipedia to Rank Rock Guitarists"[12][predatory publisher] From the abstract: "The influence of a guitarist was estimated by the number of guitarists citing him/her as an influence and the influence of the latter. [...] The results are most interesting and provide a quantitative foundation to the idea that most of the contemporary rock guitarists are influenced by early blues guitarists. Although no direct comparison exist, the list was still validated against a number of other best-of lists available online and found to be mostly compatible."
Predicting tennis players' Wikipedia popularity from tournament performance: From the abstract of a paper titled "Untangling Performance from Success":[13] "We show that a predictive model, relying only on a tennis player's performance in tournaments, can accurately predict an athlete's popularity [as measured by Wikipedia pageviews], both during a player's active years and after retirement."
"Request for Adminship (RFA) within Wikipedia: How Do User Contributions Instill Community Trust?"[14] From the abstract: "... we examine the impact of different forms of contribution made by adminship candidates on the community's overall decision as to whether to promote the candidate to administrator. To do so, we collected data on 754 RFA cases and used logistic regression to test four hypotheses. Our results supported the role of total contribution, and clarification of contribution in RFA success while the impacts of social contribution was partially supported and the role of content contribution was not supported. Also, both control variables (tenure and number of attempts) showed significant relationships with RFA success."
"Wikidata: A platform for data integration and dissemination for the life sciences and beyond"[15] From the abstract: "Our group is [...] populating Wikidata with the seeds of a foundational semantic network linking genes, drugs and diseases. Using this content, we are enhancing Wikipedia articles to both increase their quality and recruit human editors to expand and improve the underlying data. We encourage the community to join us as we collaboratively create what can become the most used and most central semantic data resource for the life sciences and beyond."
"A matter of words: NLP for quality evaluation of Wikipedia medical articles"[16] From the abstract: "We prove the effectiveness of our approach by classifying the medical articles of the Wikipedia Medicine Portal, which have been previously manually labeled by the Wiki Project team. The results of our experiments confirm that, by considering domain-oriented features, it is possible to obtain sensible improvements with respect to existing solutions, mainly for those articles that other approaches have less correctly classified."
^ Göbel, Sascha; Munzert, Simon (2016-01-22). Political Advertising on the Wikipedia Market Place of Information. Rochester, NY: Social Science Research Network. SSRN 2720141.
^ Williams, Jake Ryland; Clark, Eric M.; Bagrow, James P.; Danforth, Christopher M.; Dodds, Peter Sheridan (2015-03-06). "Identifying missing dictionary entries with frequency-conserving context models". arXiv:1503.02120.
^ Cozza, Vittoria; Petrocchi, Marinella; Spognardi, Angelo (2016-03-07). "A matter of words: NLP for quality evaluation of Wikipedia medical articles". arXiv:1603.01987 [cs.IR].
Discuss this story
I also had two of my own academic articles on Wikipedia published in March ([3], [4]), but for obvious COI reasons I am not reviewing them. If anyone enjoying this newsletter would, however, like to return the favor and review my works (feel free to be critical), I'd appreciate it :) --Piotr Konieczny aka Prokonsul Piotrus| reply here 04:29, 4 April 2016 (UTC)
I hope that the following responses address some of your concerns, which centre on the inclusion of only five works from the larger corpus of “teaching with Wikipedia” literature, the fact that the paper does not report on any “groundbreaking” activities, and that the authors don’t seem to have connections with major support structures, don’t seem to have a Wikipedia account, and so forth.
As our purpose was to promote the efficacy of using Wikipedia to assess students' performance, we wrote our article for scholars interested in higher education assessment pedagogy, and so our focus was quite narrow. The article passed peer review and was accepted by a respected higher education journal that has only ever published one article on the use of Wikipedia in education: ours. A keyword search shows that of the 6 others, 3 mentioned Wikipedia twice, 3 mentioned it 4 times, and only one of those did not reinforce common negative perceptions about Wikipedia.
Though perhaps not groundbreaking, it was nonetheless a breakthrough for such a high-ranking journal in the field to accept our article for publication, particularly as we were advocating something that does not appeal to many conservative academics. To be accepted by such a high-ranking journal, it was essential for us to show the ways our practices hinge on the theories of scholars renowned in the field of higher education assessment pedagogy. We were also limited by a word count that included the bibliography, and therefore could not afford to expand our literature review to cover scholarship in the broader field outside our area of focus.
Lastly, I have been a Wikipedia editor since 2012, and listed my courses on the Wikipedia:Education noticeboard [5] in May 2013, and in September and October 2013 (See for example [6]). I am listed as one of the University of Sydney Contacts on the Wikipedia Education Program’s page for Australia [7] and have worked on initiatives with Wikimedia AU, and received support from Wikipedia volunteers and Wikipedians at my institution. My coauthor, Rebecca Johinke, developed an interest in teaching with Wikipedia in 2013 and has used it in teaching since 2014. To be fair to the Wikimedia Foundation, their support has been invaluable, as has the support of our local chapter, Wikimedia AU. Frances Di Lauro 08:54, 4 April 2016 (UTC)