"Employing Wikipedia for good not evil" in education; using eyetracking to find out how readers read articles: Current research about Wikimedia projects.
This paper[1] is a good example of how to write articles for the "teaching with Wikipedia" field. The authors report their positive experiences with several undergraduate and postgraduate classes at the University of Sydney, which developed articles such as pregnancy vegetarianism, Cleo (magazine) and Slave Labour (mural). They describe a number of assignments and assessment criteria in some detail, and discuss the benefits that their Wikipedia assignments have for the community (improving valuable and underrepresented content) and for the students themselves (improving their writing, research and collaboration skills). The paper could benefit from a more comprehensive literature review, however. While it describes a useful set of educational activities, and rather well at that, these are not groundbreaking: practically all of the activities discussed in this paper have been covered in the peer-reviewed literature by others. Unfortunately, the authors fail to cite many of these related works (I count only about five citations to other peer-reviewed works from the much larger field of teaching with Wikipedia). Furthermore, the authors seem unaware of the Wikipedia:Education Program. None of their courses so far appear to have been registered on Wikipedia; sadly, they have no on-wiki homepage allowing identification of all edited articles or participating students, and it is also unclear whether the instructors themselves have Wikipedia accounts. This suggests a failure both on the part of the researchers, who spent years reading about, researching and engaging with the teaching-with-Wikipedia approach without realizing there is a major support infrastructure in place to assist them, and on the part of the Wikipedia community and the Education Program itself, which is clearly still neither visible enough nor active enough to identify and reach out to educators who have been engaged in several years of ongoing teaching on Wikipedia.
Hopefully in the future we can integrate those and other educators into our framework better.
Using eyetracking to find out how Wikipedia articles are being read
Screenshot of eyetracking software (not from the papers discussed here)
Researchers from the University of Regensburg in Germany have used eyetracking methods to find out which article elements readers focus on while searching for information on Wikipedia, depending on the nature of the search task (factual information lookup, learning, or casual reading—a classification taken from a 2006 article[supp 1] about exploratory search in general).
In two 2012 articles[2][3] the researchers summarized the methodology and results of one of their lab experiments with 28 participants, which besides eyetracking also incorporated data from survey questionnaires, browser logs and electromyography for two facial muscles that indicate emotional reactions (the corrugator and the zygomaticus major). Among the results of this first study (see also a related paper in English with illustrations explaining the various article elements[4]):
During lookup tasks, tables and graphical representations were preferred, but illustrative/decorative images were almost never looked at (as the authors point out, their test question, about the number of passengers on the Titanic, focused on textual information). On the other hand, "in 'learn' tasks users concentrate more on the introduction and lists. In the 'casual leisure' area, many different content elements are used." [this and other quotes have been translated from German]
Users tend to skim the article during lookup tasks, but read more text parts in the other tasks.
According to a post-task survey, user satisfaction in both the lookup and learn tasks was independent of the number of images.
A subsequent German-language PhD thesis[5] (see also a 2012 conference poster) contains much more detail, e.g. reporting that in "lookup" tasks readers spend more than 45% of their time scanning the table of contents and lists in the article, while in "learn" tasks these elements account for less than 10% of the time.
A second PhD thesis, covered in a brief paper[6] last year, examined for example which elements readers look at first within an article (from an experiment involving 163 German Wikipedia articles and 90 participants who were asked to prepare themselves for a course on the history of Bavaria in the 20th century, i.e. a "learning" overview task): The table of contents was the most frequent entry point (36%), followed by the lead section (31%) and the text body itself. The author further observes that "the article heading and images serve less often as entry point. The text heading [presumably the first section heading after the lead] and image captions very rarely occur as points of first contact".
Another publication[7] by the same author focused on "users' interaction with pictorial and textual contents ...[ The spread] of information within the articles and the relation between text and images are analyzed. ... By now 30 articles have been analyzed according to this scheme. [Within these, there] are 639 contact points leading to images. Results show that 39% of all contact points lead from image to image, in mutual directions (previous or next). All text contact points [e.g. citations] sum up to a total of 37%. In 5% of all cases, an introduction triggers a saccade to an image. The remaining types of contact points occur rather rarely."
A later overview article[8] summarizes other aspects in less detail, e.g.:
More experienced readers used the table of contents less often.
Overall, search strategies did not differ a lot between the "learning" and casual reading ("non-work-based") tasks. But there were statistically significant differences to the information seeking behavior in fact lookup tasks. The largest differences concerned the consumption of text, images and TOC (cf. above). Readers also spent a larger ratio of time navigating compared to analyzing content.
(For an overview of other new data sources shedding light on how readers navigate within articles, see also this reviewer's recent tech talk at the Wikimedia Foundation, and a research overview page on Meta about the question "Which parts of an article do readers read?")
Other recent publications
An analysis used Wikipedia to rank Jimi Hendrix as the most influential rock guitarist
A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.
"Political Advertising on the Wikipedia Market Place of Information"[9] From the abstract: "Wikipedia’s popularity and reputation give politicians incentives to use it for enhancing their online appearance effectively and tailored towards their constituency. [...] we assemble data covering editing activity for articles on all 1,100 members of the German parliament (MPs) for the three last legislatures. We find editing to be a persistent phenomenon that is practiced by a substantial amount of MPs and is growing throughout election years."
"Identifying missing dictionary entries with frequency-conserving context models"[10] From the abstract: "Upon training our model with the Wiktionary—an extensive, online, collaborative, and open-source dictionary that contains over 100,000 phrasal-definitions—we develop highly effective filters for the identification of meaningful, missing phrase-entries. With our predictions we then engage the editorial community of the Wiktionary and propose short lists of potential missing entries for definition, developing a breakthrough, lexical extraction technique, and expanding our knowledge of the defined English lexicon of phrases."
"Population automation: An interview with Wikipedia bot pioneer Ram-Man"[11] From the abstract: ".... an in-depth interview with Wikipedia user Ram-Man, [...] creator of the rambot, the first mass-editing bot. Topics discussed include the social and technical climate of early Wikipedia, the creation of bot policies and bureaucracy, and the legacy of rambot and Ram-Man's work."
"Mining Wikipedia to Rank Rock Guitarists"[12][predatory publisher] From the abstract: "The influence of a guitarist was estimated by the number of guitarists citing him/her as an influence and the influence of the latter. [...] The results are most interesting and provide a quantitative foundation to the idea that most of the contemporary rock guitarists are influenced by early blues guitarists. Although no direct comparison exist, the list was still validated against a number of other best-of lists available online and found to be mostly compatible."
Predicting tennis players' Wikipedia popularity from tournament performance: From the abstract of a paper titled "Untangling Performance from Success":[13] "We show that a predictive model, relying only on a tennis player's performance in tournaments, can accurately predict an athlete's popularity [as measured by Wikipedia pageviews], both during a player's active years and after retirement."
"Request for Adminship (RFA) within Wikipedia: How Do User Contributions Instill Community Trust?"[14] From the abstract: "... we examine the impact of different forms of contribution made by adminship candidates on the community's overall decision as to whether to promote the candidate to administrator. To do so, we collected data on 754 RFA cases and used logistic regression to test four hypotheses. Our results supported the role of total contribution, and clarification of contribution in RFA success while the impacts of social contribution was partially supported and the role of content contribution was not supported. Also, both control variables (tenure and number of attempts) showed significant relationships with RFA success."
"Wikidata: A platform for data integration and dissemination for the life sciences and beyond"[15] From the abstract: "Our group is [...] populating Wikidata with the seeds of a foundational semantic network linking genes, drugs and diseases. Using this content, we are enhancing Wikipedia articles to both increase their quality and recruit human editors to expand and improve the underlying data. We encourage the community to join us as we collaboratively create what can become the most used and most central semantic data resource for the life sciences and beyond."
"A matter of words: NLP for quality evaluation of Wikipedia medical articles"[16] From the abstract: "We prove the effectiveness of our approach by classifying the medical articles of the Wikipedia Medicine Portal, which have been previously manually labeled by the Wiki Project team. The results of our experiments confirm that, by considering domain-oriented features, it is possible to obtain sensible improvements with respect to existing solutions, mainly for those articles that other approaches have less correctly classified."
^ Göbel, Sascha; Munzert, Simon (2016-01-22). Political Advertising on the Wikipedia Market Place of Information. Rochester, NY: Social Science Research Network. SSRN 2720141.
^ Williams, Jake Ryland; Clark, Eric M.; Bagrow, James P.; Danforth, Christopher M.; Dodds, Peter Sheridan (2015-03-06). "Identifying missing dictionary entries with frequency-conserving context models". arXiv:1503.02120.
^ Cozza, Vittoria; Petrocchi, Marinella; Spognardi, Angelo (2016-03-07). "A matter of words: NLP for quality evaluation of Wikipedia medical articles". arXiv:1603.01987 [cs.IR].
Discuss this story
I also had two of my own academic articles on Wikipedia published in March ([3], [4]), but for obvious COI reasons I am not reviewing them. If anyone enjoying this newsletter would, however, like to return the favor and review my works (feel free to be critical), I'd appreciate it :) --Piotr Konieczny aka Prokonsul Piotrus| reply here 04:29, 4 April 2016 (UTC)
I hope that the following responses address some of your concerns, which centre on the inclusion of only five works from the larger corpus of “teaching with Wikipedia” literature, the fact that the paper does not report on any “groundbreaking” activities, and that the authors don’t seem to have connections with major support structures, don’t seem to have a Wikipedia account, and so forth.
As our purpose was to promote the efficacy of using Wikipedia to assess students' performance, we wrote our article for scholars interested in higher education assessment pedagogy, and so our focus was quite narrow. The article passed peer review and was accepted by a respected higher education journal that has only ever published one article on the use of Wikipedia in education: ours. A keyword search shows that of the 6 others, 3 mentioned Wikipedia twice, 3 mentioned it 4 times, and only one of those did not reinforce common negative perceptions about Wikipedia.
Though perhaps not groundbreaking, it was nonetheless a breakthrough for such a high-ranking journal in the field to accept our article for publication, particularly as we were advocating something that does not appeal to many conservative academics. To be accepted by such a high-ranking journal, it was essential for us to show the ways our practices hinge on the theories of scholars renowned in the field of higher education assessment pedagogy. We were also limited by a word count that included the bibliography, and therefore could not afford to expand our literature review to cover scholarship in the broader field outside our area of focus.
Lastly, I have been a Wikipedia editor since 2012, and listed my courses on the Wikipedia:Education noticeboard [5] in May 2013, and in September and October 2013 (See for example [6]). I am listed as one of the University of Sydney Contacts on the Wikipedia Education Program’s page for Australia [7] and have worked on initiatives with Wikimedia AU, and received support from Wikipedia volunteers and Wikipedians at my institution. My coauthor, Rebecca Johinke, developed an interest in teaching with Wikipedia in 2013 and has used it in teaching since 2014. To be fair to the Wikimedia Foundation, their support has been invaluable, as has the support of our local chapter, Wikimedia AU. Frances Di Lauro 08:54, 4 April 2016 (UTC)