The Signpost

Recent research

How censorship can backfire and conversations can go awry

By Tilman Bayer, Bri, Barbara Page, and Maik Stührenberg

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

"On the Self-similarity of Wikipedia Talks: a Combined Discourse-analytical and Quantitative Approach"

Reviewed by Maik Stührenberg

This thoroughly structured paper[1] combines the theory of web genres with dialogue theory to examine Wikipedia talk pages. Since Wikipedia is a web genre, "Wikicussions" (as the authors call talk page discussions) form a subgenre. In this context, the authors also examine the quality of cooperation between Wikipedia users, which can be linked to social differentiation in the roles and statuses of Wikipedians (content- vs. administration-related users). These group-related processes can be seen as a mediating layer between external parameters (system requirements for Wikipedia's user community) and the structure and dynamics of Wikipedia's subgenres.

The authors argue that, unlike face-to-face dialogue, Wikicussions stand out due to a publicly available common ground (a concept derived from dialogue theory), which may explain the structures they found.

The paper is enriched with a number of high-quality figures that support and underpin the findings.

Graph between November 2000 and November 2015 clearly demonstrating that most posts come from registered users
Frequency distribution of talk posts over time within the German Wikipedia (blue: registered users; red: anonymous users; green: bots; black: all users). Unsigned posts (without timestamps) are excluded. Posts dated by posters outside of the valid time-frame (before the date of creation of the discussion or after the date of its download) are also excluded. (Figure 7 from the paper)

"How Sudden Censorship Can Increase Access to Information"

Reviewed by Bri and Tilman Bayer

Our intuition might tell us that government censorship causes reduced access to online information. But recent research indicates that the effect can be exactly the opposite. Using data gathered from Wikipedia page views and other sources, researchers William Hobbs and Margaret Roberts found that:

Specifically, the authors studied how China's block of Instagram on September 29, 2014, following protests in Hong Kong, affected views of Chinese Wikipedia pages that were already blocked in the country. (This predates the 2015 total block of the Chinese Wikipedia and the switch of all Wikimedia sites to full encryption with HTTPS around the same time, which made such per-page blocking impossible.) The list of censored Chinese Wikipedia pages with the largest increase in views "shows that new viewers accessed pages that had long been censored including those related to the 1989 Tiananmen Square protests",[2]: 12 i.e. "viewing patterns that would be more typical of new users who had just jumped the firewall, rather than of old VPN users who had presumably consumed this information long ago."[2]: 11 Here is an excerpt of the full list examined in the research, the top 10 for the second day of the block, linked here to their English Wikipedia equivalents:

  1. People's Republic of China blocked websites list
  2. Jiang Zemin
  3. Radio Australia
  4. Hu Jintao
  5. Zeng Qing
  6. Wang Weilin (Tank Man)
  7. Li Peng
  8. Tiananmen Square Incident
  9. Zhou Yongkang
  10. Wu'erkaixi (June 4 leader)

The researchers propose to name this phenomenon the "gateway effect", a "mechanism through which repression can backfire inadvertently, without political or strategic motivation",[2]: 3  because it incentivizes people to learn how to evade censorship and thus "have more, not less, access to information and begin engaging in conversations, social media sites, and networks that have long been off-limits to them."[2]: 15  They distinguish it from the Streisand effect, where individuals specifically seek out information that is being hidden.

The second author of the study, Margaret Roberts, is also the author of Censored: Distraction and Diversion Inside China's Great Firewall (Princeton University Press, 2018; print ISBN 978-0-691-17886-8, e-book 978-1-400-89005-7).

Marketing, social media, and Wikipedia

Reviewed by Barbara Page

This study was able to "characterize" the interests of Wikipedia editors and the editors' social media activity on Twitter to facilitate:

A marriage between editors' editing topics and their activity on Twitter (and possibly Facebook) will result in targeted marketing tailored just for you!

Conferences and events

See the community-curated research events page on Meta-wiki for other upcoming conferences and events, including submission deadlines.

WMF research showcase

Recent presentations at the monthly Research showcase hosted by the Wikimedia Foundation included the following:

"Conversations Gone Awry: Detecting Early Signs of Conversational Failure"
Presentation slides (video)

Antisocial behavior, including harassment and personal attacks, can occur in online social systems. A new paper[4] by seven researchers from Cornell University, Jigsaw, and the Wikimedia Foundation describes how early signs of conversational failure may be detectable at the start of a conversation, so that an undesirable exchange could be predicted in time to prevent the discussion from deteriorating. One of the authors also gave an interview published on the Wikimedia Foundation's blog,[supp 1] and the paper was covered in popular media; see In the media § In brief.

Case studies in the appropriation of ORES

From the announcement (by Aaron Halfaker):

Presentation slides about the use of the ORES platform (video)

The presentation covered "three key tools that Wikipedians have developed that make use of ORES": Wikidata's damage detection models, exposed through Recent Changes; Spanish Wikipedia's PatruBOT; and WikiEdu tools from User:Ragesoss that incorporate article quality models.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

Compiled by Tilman Bayer


  1. ^ Mehler, Alexander; Gleim, Rüdiger; Lücking, Andy; Uslu, Tolga; Stegbauer, Christian (January 30, 2018). "On the Self-similarity of Wikipedia Talks: a Combined Discourse-analytical and Quantitative Approach" (PDF). Glottometrics. 40. RAM-Verlag (published January 2018): 1–45. ISSN 1617-8351. OCLC 7493144471. Archived (PDF) from the original on June 28, 2018. Retrieved June 28, 2018 – via ResearchGate.
  2. ^ a b c d e Hobbs, William R.; Roberts, Margaret E. (April 2, 2018). "How Sudden Censorship Can Increase Access to Information". American Political Science Review. Cambridge University Press: 1–16. doi:10.1017/S0003055418000084. eISSN 1537-5943. ISSN 0003-0554. OCLC 7435466814.
  3. ^ Torrero, Christian; Caprini, Carlo; Miorandi, Daniele (April 9, 2018). "A Wikipedia-based approach to profiling activities on social media". p. 1. arXiv:1804.02245v2 [cs.IR].
  4. ^ Zhang, Justine; Chang, Jonathan P.; Danescu-Niculescu-Mizil, Cristian; Dixon, Lucas; Yiqing, Hua; Thain, Nithum; Taraborelli, Dario (May 14, 2018). "Conversations Gone Awry: Detecting Early Signs of Conversational Failure". arXiv:1805.05345v1 [cs.CL].
  5. ^ Klapper, Helge; Reitzig, Markus (May 7, 2018). "On the Effects of Authority on Peer Motivation: Learning from Wikipedia" (PDF). Strategic Management Journal. John Wiley & Sons. doi:10.1002/smj.2909. eISSN 1097-0266. OCLC 7586436764. Retrieved June 28, 2018.
  6. ^ Shang, Wenyi (March 15, 2018). "A Comparison of the Historical Entries in Wikipedia and Baidu Baike". In Chowdhury, Gobinda; McLeod, Julie; Gillet, Val; Willett, Peter (eds.). Transforming Digital Worlds. International Conference on Information (iConference 2018; March 25–28 at Sheffield, United Kingdom). Lecture Notes in Computer Science. Vol. 10766 (Online ed.). Cham, Switzerland: Springer International Publishing AG. pp. 74–80. doi:10.1007/978-3-319-78105-1_9. ISBN 978-3-319-78105-1. OCLC 7357407865.
  7. ^ Xiao, Lu; Sitaula, Niraj (March 15, 2018). "Sentiments in Wikipedia Articles for Deletion Discussions". In Chowdhury, Gobinda; McLeod, Julie; Gillet, Val; Willett, Peter (eds.). Transforming Digital Worlds. International Conference on Information (iConference 2018; March 25–28 at Sheffield, United Kingdom). Lecture Notes in Computer Science. Vol. 10766 (Online ed.). Cham, Switzerland: Springer International Publishing AG. pp. 81–86. doi:10.1007/978-3-319-78105-1_10. ISBN 978-3-319-78105-1. OCLC 7357407963.
  8. ^ Pentzold, Christian (May 3, 2017). "'What are these researchers doing in my Wikipedia?': ethical premises and practical judgment in internet-based ethnography" (PDF). Ethics and Information Technology. 19 (2). Springer Science+Business Media (published May 5, 2017): 143–155. doi:10.1007/s10676-017-9423-7. eISSN 1572-8439. ISSN 1388-1957. OCLC 7039749181. Archived (PDF) from the original on June 28, 2018. Retrieved June 28, 2018.
  9. ^ Pentzold, Christian; Weltevrede, Esther; Mauri, Michele; Laniado, David; Kaltenbrunner, Andreas; Borra, Erik (March 13, 2017). Scopigno, Roberto (ed.). "Digging Wikipedia: The Online Encyclopedia as a Digital Cultural Heritage Gateway and Site" (PDF). Journal on Computing and Cultural Heritage. Special Issue on Digital Infrastructure for Cultural Heritage, Part 1. 10 (1). New York: Association for Computing Machinery (published April 14, 2017): 5:1–5:19. doi:10.1145/3012285. eISSN 1556-4711. ISSN 1556-4673. OCLC 7006965721. Retrieved June 28, 2018 – via ResearchGate.
  10. ^ Kelly, Elizabeth Joan (November 28, 2017). "Use of Louisiana's Digital Cultural Heritage by Wikipedians". Practical Communication. Journal of Web Librarianship. 12 (2). Taylor & Francis: 85–106. doi:10.1080/19322909.2017.1391733. eISSN 1932-2917. ISSN 1932-2909. OCLC 7566358637.
  11. ^ Yamada, Shohei (December 29, 2017). "The Conceptual Correspondence between the Encyclopaedia and Wikipedia". Journal of Japan Society of Library and Information Science. 63 (4). Japan Society of Library and Information Science: 181–195. doi:10.20651/jslis.63.4_181. eISSN 2432-4027. ISSN 1344-8668. OCLC 7261862873.
  12. ^ Matei, Sorin Adam; Britt, Brian C. (September 21, 2017). "Analytic Investigation of a Structural Differentiation Model for Social Media Production Groups". In Alhajj, Reda; Glässer, Uwe (eds.). Structural Differentiation in Social Media: Adhocracy, Entropy, and the '1 % Effect'. Lecture Notes in Social Networks (1st ed.). Cham, Switzerland: Springer Nature. pp. 73, 75. doi:10.1007/978-3-319-64425-7_5. eISSN 2190-5436. ISBN 978-3-319-64424-0. ISSN 2190-5436. LCCN 2017948031. OCLC 7138124671.
Supplementary references:
  1. ^ Zhang, Justine; Chang, Jonathan (June 13, 2018). "'Conversations gone awry'—the researchers figuring out when online conversations get out of hand". Wikimedia Blog (Interview). Interviewed by Melody Kramer; Dario Taraborelli. Wikimedia Foundation. Archived from the original on June 28, 2018. Retrieved June 28, 2018.
  2. ^ Bush, Jim (November 6, 2017). "Results of Wikipedia study may surprise". Purdue News Service and Agricultural Communications (Press release). West Lafayette, Indiana: Purdue University. OCLC 7177119166. Archived from the original on June 28, 2018. Retrieved June 28, 2018.

Discuss this story

The sample size of RfA is so small in recent years (since 2012) that it would not produce any usable results. The only major change in that time is that the RfAs have slowly warped into yet another platform for a lot of discussion about the process and adminship in general. RfA remains the Wild West of Wikipedia. Kudpung กุดผึ้ง (talk) 00:35, 3 July 2018 (UTC)
What other result would be possible except that positive expressions correlate with desires to keep? What classes of arguments for keep are there except that the subject is notable/the article is good/that it does meet policy? Or, for delete, that the subject is not notable/the article is not good/the article does not meet policy? I don't see how any of this could affect judging the quality of closes, especially considering closes aren't supposed to be a mere numerical count of votes. It would identify those closes where the close did not match the sentiments most expressed, but that's not an indication that the close is bad--in fact, it's the usual situation for AfDs contaminated by single purpose accounts. (and similarly for RfAs) DGG ( talk ) 00:32, 6 July 2018 (UTC)
One of the many reasons why I think it'd be a Bad Idea™ to do so. Regardless, even though it's unsurprising, it's noteworthy that they can actually detect a difference. ~ Amory (utc) 01:01, 6 July 2018 (UTC)
Also, I thought a Wikicussion was what you get from beating your head against the wall arguing with someone who just doesn't get it. EEng 14:47, 5 July 2018 (UTC)


The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0