A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
Many of the more active Wikipedia contributors are multilingual. In the April 2011 Wikipedia Editors Survey,[supp 1] 72% of respondents said they read Wikipedia content in more than one language, and 51% said they contributed to multiple Wikipedias. Research has estimated that approximately 15% of active Wikipedians are multilingual.[supp 2] These contributors are important as they can enable knowledge transfer between different language editions of Wikipedia, yet little is known about who they are and what they do.
A recent paper published in PLOS ONE by researchers at KAIST and OII, titled "Understanding Editing Behaviors in Multilingual Wikipedia"[1], adds to our knowledge of multilingual contributors by investigating their engagement level, topic interests, and language proficiency. The paper uses a dataset spanning a month of Wikipedia contributions in July–August 2013 and defines a multilingual editor as one who makes contributions to multiple language editions. Overall, the dataset contains 12,577 multilingual editors, of whom 77.3% are bilingual, 11.4% trilingual, and 4.1% quadrilingual.
Out of Wikipedia's (now) 288 language editions, the paper focuses on three: English, German, and Spanish. These three languages were chosen because the paper uses natural language processing to estimate language proficiency, and the necessary tools are sufficiently developed for these languages. The multilingual editors are divided into two groups: primary editors, consisting of the contributors who make most of their edits to a given language edition, and non-primary editors. These two groups are then compared in terms of their engagement, topic interests, and language proficiency.
To measure editor engagement, consecutive edits by the same editor to the same article are collapsed into edit sessions.[supp 3] T-tests are used to compare primary and non-primary editors on several measures: number of edits per session, session length, amount of content added (number of characters or tokens such as words), and whether non-visible changes are made. The results show that primary editors are more engaged as they commit more edits, have longer sessions, add more content, and are more likely to make visible edits compared to non-primary editors.
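To make the sessionization step concrete, here is a minimal sketch of how such collapsing might be implemented (the field names and the one-hour inactivity cutoff are assumptions chosen for illustration, not details taken from the paper):

```python
from collections import defaultdict
from datetime import timedelta

# A minimal sketch of session collapsing: within each editor's time-ordered
# edit stream, consecutive edits to the same article are grouped into one
# session. The one-hour inactivity cutoff is an assumed illustration value,
# not a threshold taken from the paper.
SESSION_GAP = timedelta(hours=1)

def collapse_sessions(edits):
    """edits: iterable of dicts with 'editor', 'article' and 'timestamp' (datetime)."""
    by_editor = defaultdict(list)
    for edit in edits:
        by_editor[edit["editor"]].append(edit)

    sessions = []
    for editor_edits in by_editor.values():
        editor_edits.sort(key=lambda e: e["timestamp"])
        current = [editor_edits[0]]
        for edit in editor_edits[1:]:
            same_article = edit["article"] == current[-1]["article"]
            within_gap = edit["timestamp"] - current[-1]["timestamp"] <= SESSION_GAP
            if same_article and within_gap:
                current.append(edit)
            else:
                sessions.append(current)
                current = [edit]
        sessions.append(current)
    return sessions
```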
Editor interests are identified using a combination of LDA and DBSCAN to create a set of 20 topic clusters for each language. These topic clusters are then labelled by humans, resulting in cluster labels such as “Science” and “Global Sports”. Primary and non-primary editors are found to be generally interested in the same topics, but some significant differences show up: for instance, non-primary editors contribute more to articles about cities in English, soccer in German, and plants in Spanish, while primary editors are more interested in, for example, computers in English and German, and politicians and entertainment in Spanish.
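As a rough illustration of this kind of pipeline (not the authors' actual code; the vectorizer settings, clustering parameters, and other values below are placeholder assumptions), per-article topic distributions could be derived with LDA and then grouped with DBSCAN, for instance using scikit-learn:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.cluster import DBSCAN

# Illustrative sketch of an LDA + DBSCAN topic-clustering pipeline.
# All parameter values are placeholder assumptions, not the paper's settings.
def topic_clusters(article_texts, n_topics=20):
    vectorizer = CountVectorizer(max_features=10000, stop_words="english")
    counts = vectorizer.fit_transform(article_texts)

    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    doc_topics = lda.fit_transform(counts)  # per-article topic distributions

    # Group articles whose topic distributions lie close together.
    clustering = DBSCAN(eps=0.1, min_samples=5).fit(doc_topics)
    return clustering.labels_  # -1 marks articles left unclustered (noise)
```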
Lastly, the paper studies the language complexity of contributions by primary and non-primary editors. Several measures of language complexity from the literature are used, for example the entropy of part-of-speech unigrams, bigrams, and trigrams, as well as whether articles (in English: the, a, an) are used correctly. Because different topics use language differently – for instance, fact-oriented topics such as sports show lower language complexity than more conceptual topics such as history – both intra-topic and inter-topic complexity are controlled for. Primary editors are found to use more diverse terms and to edit more complex parts of articles than non-primary editors across all three languages. However, English differs from German and Spanish when it comes to the linguistic proficiency of the edits made: in German and Spanish, primary editors display higher linguistic proficiency than non-primary editors, whereas in English there is no noticeable difference.
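For readers curious about the entropy measure, here is a rough sketch of how the Shannon entropy of part-of-speech n-grams can be computed (NLTK's default tagger is used purely for illustration; the paper's actual toolchain and preprocessing are not reproduced here):

```python
import math
from collections import Counter

import nltk  # requires the 'punkt' and 'averaged_perceptron_tagger' data packages

# Illustrative sketch: Shannon entropy of part-of-speech n-grams in a text.
# The tagger and preprocessing are assumptions, not the paper's toolchain.
def pos_ngram_entropy(text, n=2):
    tokens = nltk.word_tokenize(text)
    tags = [tag for _, tag in nltk.pos_tag(tokens)]
    ngrams = list(nltk.ngrams(tags, n))
    counts = Counter(ngrams)
    total = sum(counts.values())
    # H = -sum(p * log2 p) over the observed n-grams
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(pos_ngram_entropy("The quick brown fox jumps over the lazy dog.", n=2))
```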
Taken together, the results indicate that language continues to be a barrier to entry, as non-primary editors are less engaged and make less complex contributions. The findings also point to how English continues to be a hub language in Wikipedia: it has the lowest proportion of primary editors at 32.9%, compared to German’s 49.9%. (In this context, the authors mention a 2012 WikiSym paper[supp 4], co-authored by this reviewer, which found that English was by far the most-used language to translate from – as measured by translation template usage – and discussed how the English Wikipedia could thereby be used as a hub.) At the same time, multilingual Wikipedians are important in helping move content across languages, as exemplified by the Wikimedia Foundation’s development of a tool to recommend articles for translation.[supp 5] As mentioned in the paper’s conclusion, many questions about multilingual Wikipedians remain open, although this paper makes significant contributions by answering some of them.
This article[2] is a report on one component of a longitudinal study of how "rationales" are used by Wikipedians in articles for deletion (AfD) discussions to direct collaboration. In order to arrive at conclusions about the role of rationales in decision-making processes, the author has approached the research object from a number of angles. Previously the researcher had conducted an exploratory content analysis of rationales, subsequently followed by interviews with Wikipedians. The current research describes the process of developing an algorithmic tool able to analyze large datasets for "directive rationales". The author acknowledges that AfD discussions lend themselves to this kind of analysis because of the predictable order of comments, each describing an action and a rationale for that action; decision-making of this sort differs substantially from the style of discussion on the rest of Wikipedia's talk pages. Regardless of this limitation, the author concludes that further research into rationales will provide insights into how they function to connect policies with practices. Given the breadth of research methods in the project, it will be interesting to see what conclusions the author comes to when the project concludes.
This paper[3] addresses the area of scientific knowledge creation online, as well as the notion of controversy, by examining the editing history and discussion of the English Wikipedia article on schizophrenia and its sub-article, causes of schizophrenia. The specific controversy the authors focused on is that of the genetic basis of schizophrenia (a topic which the authors note is still debated by scholars and on which there is no consensus). The authors commend the neutrality of the lead of the Wikipedia article ("The causes of schizophrenia have been the subject of much debate, with various factors proposed and discounted or modified...") and ask "How are such statements constructed, or in other words, what is the work which goes into making these claims?" The authors used a dataset spanning August 2006 to October 2011 (20,000 words of talk text and 13,000 words of article text) to investigate how this topic is presented and contested on Wikipedia.
The authors make a number of interesting observations. They observe that editors are not equal, and in addition to the usual admin>user>anon>bot hierarchy, they note that "'who you are' is important when it comes to editing the schizophrenia article...". Many editors self-identified as living with schizophrenia or as medical experts. The talk pages are policed to keep the discussion focused on the article's contents, and anecdotes and personal experience stories are discouraged, or even removed from the pages. WP:V and WP:OR are certainly enforced as well, and Wikipedians will be pleased to note the authors' observation that "Priority is always given to the published scientific literature." However, there are also a number of problems: not all contributors have access to paywalled, quality content, and some seemingly rely only on article abstracts.
Some low-quality references slip through the net, and standards are not enforced consistently: "Attention to the reference list in the schizophrenia article at the time of our study revealed numerous citations that were not reviews" but original research papers about "breakthroughs", mentioned in the context of a talk page argument that "such papers should be avoided until their findings are confirmed". The authors also note that they found at least "one reference to another Wikipedia article and also to a schizophrenia forum discussion". The article's structure is the result of years of minor edits with little attention to the big picture, resulting in an occasionally illogical and incoherent layout with some contradictions and clearly obsolete but not updated sections, which leads the authors to summarize the state of the article as "a rather ad hoc assemblage of resources" and "a chronological patchwork of studies that nonetheless does have the effect of synthesising knowledge". Despite those problems, they conclude that the Wikipedia article, and the creation process behind it, is similar to an academic review article. And despite Wikipedia's claims that it simply describes the state of things rather than creating new arguments or points of view, the authors do think that the Wikipedia article is an active voice in ongoing discussions, noting that some editors on the talk page see the purpose of the article as educating the public as well as some experts.
There are some unfortunate omissions (though to some degree understandable given academic word limits). The authors do not discuss in detail whether some users, such as experts, seem to pull more weight in the discussions, or whether the removal of personal stories affects the friendliness of the discussion. Despite these omissions, the paper is an interesting analysis of knowledge creation on Wikipedia, as well as another contribution to the ongoing discussion about the reliability and quality of Wikipedia. On that note, it is worth noting that schizophrenia is a Featured Article, following a 2003 nomination that by today's FA standards reads more like a joke. Given the criticism of the article's 2011 version voiced by this paper, the community may want to consider a Featured Article Review here.
Co-citation graphs (networks of who cites whom) are frequently used to recommend books and articles, but how well do links between Wikipedia articles work for this purpose? A paper[4] to be published at the upcoming Joint Conference on Digital Libraries evaluates this by comparing the performance of co-citation, with and without proximity analysis, against the commonly used “More Like This” (MLT) text-based approach found in Apache Lucene. The paper’s main finding is that co-citation with proximity analysis (CPA) performs comparably to MLT, but that the two methods have different strengths: MLT is good at identifying closely related articles, while CPA is better at finding broader ones and tends to identify more popular articles that are typically of higher quality. These results suggest a hybrid approach might be best suited for finding related articles in Wikipedia, something the authors plan to study in future work.
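To illustrate the intuition behind CPA (the inverse-distance weighting below is one common formulation and an assumption on our part, not necessarily the exact function used in the paper), relatedness between linked pages can be scored by how close together their links appear within an article:

```python
from collections import defaultdict
from itertools import combinations

# Illustrative sketch of co-citation proximity analysis (CPA): links that
# appear close together in an article contribute more to the relatedness of
# the linked pages. The 1/distance weighting is an assumed formulation.
def cpa_scores(articles):
    """articles: dict mapping article title -> list of link targets,
    in the order they appear in the article text."""
    scores = defaultdict(float)
    for links in articles.values():
        for (i, a), (j, b) in combinations(enumerate(links), 2):
            if a != b:
                pair = tuple(sorted((a, b)))
                scores[pair] += 1.0 / abs(i - j)  # closer links -> higher weight
    return scores

example = {"Article X": ["Berlin", "Germany", "Physics"],
           "Article Y": ["Berlin", "Germany"]}
print(sorted(cpa_scores(example).items(), key=lambda kv: -kv[1]))
```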
This paper[5], published in JASIST in April this year, is a brief opinion piece summarizing perceptions of Wikipedia in academia. It provides a short literature review of works that discuss this subject, summarizes the research on Wikipedia's reliability (still a concern among many scholars), notes the spread of Wikipedia-based teaching assignments in colleges, acknowledges the general widespread use of Wikipedia by the public, and, in the paper's own words, calls "for a peaceful coexistence". A more detailed take on those very subjects appeared in the same journal in March[6] (disclaimer: the latter article is written by this reviewer).
Yet another small university class (fewer than 20 first-year students) has independently tried Wikipedia editing and tells its story.[7]
The students were told to edit an article and succeeded; while doing so, they improved their information literacy, digital literacy, and trust in the Wikipedia system. On the other hand, the exercise itself was not sufficient to make them understand the dynamics and principles of Wikipedia in depth, nor to integrate them into the community.
In the opinion of this reviewer, the article makes for a nice blog post to be shared with university professors in other Nordic countries and in similar disciplines. The experience also confirms that university professors can and should use Wikipedia as a teaching tool, but can improve results if they contact expert Wikimedians (usually via a local Wikimedia chapter) to introduce the students to the spirit and dynamics of the Wikimedia projects.
A short opinion piece from the University of Wisconsin-River Falls supporting the use of Wikipedia as a teaching tool to improve information literacy.[8] Under the guise of a literature review, the author mentions four past experiments with wikis in the classroom, published between 2006 and 2009.
According to this 2012 survey of 800 professors at the Universitat Oberta de Catalunya, professors mostly agree with the use of Wikipedia as an "open repository" to disseminate research, and a growing number of them approve of its use as a teaching tool. At the time of the survey, however, most professors were still waiting to be convinced by their colleagues.[9] See also the longer review of the paper's preprint version[supp 6] in our December 2014 issue: "Use of Wikipedia in higher education influenced by peer opinions and perception of Wikipedia's quality"
A paper[10] to be published at the forthcoming SIGIR 2016 conference as part of its demonstration track describes MultiWiki (demo available online), a tool that calculates similarities and differences between pairs of articles in different Wikipedia languages. The tool then visualises these using a timeline, a map, and a side-by-side display of the article texts. Visualising similarities and differences between Wikipedia languages is not a new idea[supp 7] [supp 8], but this tool is the first to show textual alignment.
A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.
[NPOV] was new to many of them. Some say they are surprised to find that there are so many rules and norms to consider before the text is up to standards. One respondent expressed astonishment that "there are even standards for how to write numbers in percentage!" Others are surprised to find any rules at all, having heard about the inaccuracies and biases of Wikipedia's content: "I used to think anything goes." ... The students were positive about their discovery of the Wikipedia community, which for many changed some of their attitudes to the site. ... For those who mention trust, they related it to one or both of the following factors: (a) to the discovery of the qualifications of many Wikipedians ("lots of educated people") or (b) to the control mechanism available and that there are people who "check the pages" and "remove unwanted content" ... The initial skepticism expressed in the questionnaire has thus changed, leaving Wikipedia "a place I can partly trust on par with other sources, as it is surveilled by a kind of administrators".
Discuss this story
Schizophrenia... does it even exist?
The first thing about an article about schizophrenia and genetics is that the two parts exist. There is a lot of literature, scientific at that, denying that schizophrenia exists, which makes the second part irrelevant. The next and obvious question is: what are we talking about? Thanks, GerardM (talk) 07:14, 29 May 2016 (UTC)
Universitat Oberta de Catalunya paper
This is a minor gripe of mine, but we (probably me) already reviewed this here back in Dec 2014 (this is the very same paper, published not in January THIS year as the newsletter states, but in February LAST year; compare [1] and [2]. We probably reviewed a preprint back then, but any changes, if they exist, are minor). In my relatively comprehensive (or at least I'd like to think so) lit review on the subject from March THIS year, which has yet to be reviewed by the Research Newsletter, I have a note saying "Meseguer Artola et al.'s (2015) study incorporates and builds on an earlier work of its contributors, Eduard (2014) and Lladós, Aibar, Lerga, Meseguer, and Minguillón (2013), using the same data set and arriving at more refined conclusions. For that reason, those works are not reviewed or cited separately." I was wondering if the said authors published yet another remix of their research, but no, it seems to be a mistake in our review. I suggest removing that section. We have plenty of unreviewed research (hint: dear readers, we have a backlog - help!), no need to discuss the same paper twice. --Piotr Konieczny aka Prokonsul Piotrus| reply here 09:08, 29 May 2016 (UTC)
Availability of broadband
I have been told if I pay more I too can have broadband at home. I decided not to spend the money. I have no problem contributing at home, but I have often copied the information from other sources at libraries. At one time this was because I couldn't access the information at home, but now the resource that I used the most is unavailable unless I travel about 30 miles. But the truth is I don't have the patience to wait and wait at home. Only those few sites I spend a lot of time on ever approach the speed that they do at libraries. That first time accessing a site (a long way from actual research) can take a very long time.— Vchimpanzee • talk • contributions • 20:23, 30 May 2016 (UTC)
The difference between information acquisition and learning knowledge