The Signpost

Recent research

French medical articles have "high rate of veracity"; quality comparisons across languages; perceptions of credibility

By Nicolas Jullien, Leila Zia, Tilman Bayer, and FULBERT

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Medical articles on French Wikipedia have "high rate of veracity"

Reviewed by Nicolas Jullien

A doctoral thesis[1] at Aix-Marseille University examined the accuracy of medical articles on the French Wikipedia. From the English abstract: "we selected a sample of 5 items (stroke, colon cancer, diabetes mellitus, vaccination and interruption of pregnancy) which we compare, assertion by assertion, with reference sources to confirm or refute each assertion. Results: Of the 5 articles, we analyzed 868 assertions. Of this total, 82.49% were verified by the referentials, 15.55% not verifiable due to lack of information and 1.96% contradicted by the referentials. Of the contradicted results, 10 corresponded to obsolete notions and 7 to errors, but mainly dealing with epidemiological or statistical data, thus not leading to a major risk when used, not recommended, on health. Conclusion: ... This study of five medical articles finds a high rate of veracity with less than 2% incorrect information and more than 82% of information confirmed by scientific references. These results strongly argue that Wikipedia could be a reliable source of medical information, provided that it does not remain the only source used by people for that purpose."

This medical PhD thesis is a very well documented analysis of the questions raised by the publication of medical information on Wikipedia. Although the findings, summarized in the abstract, will not be new to those who know Wikipedia well, the thesis presents a good review of the literature on medical accuracy, and also of the purpose of Wikipedia (not a professional encyclopedia, but a form of popular science: an introduction, with some links for going further). The thesis is written in French.
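
The percentages quoted in the abstract can be cross-checked against the 868 assertions. The following is a minimal Python sketch, not the thesis author's method: the per-category counts are reconstructed here by rounding, on the assumption that each reported percentage is a share of all 868 assertions. Only the total of 868 and the 17 contradicted assertions (10 obsolete notions plus 7 errors) are stated in the abstract.

```python
# Cross-check of the figures quoted in the abstract.
TOTAL_ASSERTIONS = 868

reported_percentages = {
    "verified": 82.49,
    "not verifiable": 15.55,
    "contradicted": 1.96,
}

# Assumption: each percentage is a share of all 868 assertions, so the
# underlying count is the nearest whole number. These counts are a
# reconstruction, not figures taken from the thesis.
reconstructed_counts = {
    label: round(pct / 100 * TOTAL_ASSERTIONS)
    for label, pct in reported_percentages.items()
}

for label, count in reconstructed_counts.items():
    recomputed = 100 * count / TOTAL_ASSERTIONS
    print(f"{label}: {count} assertions ({recomputed:.2f}%)")

# The abstract gives 10 obsolete notions and 7 errors among the
# contradicted assertions.
contradicted_from_abstract = 10 + 7
print(sum(reconstructed_counts.values()) == TOTAL_ASSERTIONS)
print(reconstructed_counts["contradicted"] == contradicted_from_abstract)
```

Under that assumption, the three reconstructed counts add up to 868 and the contradicted count matches the 17 given in the abstract.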

Assessing article quality and popularity across 44 Wikipedia language versions

Reviewed by Nicolas Jullien
From the paper: Distribution of quality scores in 12 topic areas on English, German and French Wikipedia
Overlaps of the English, German and French Wikipedia's coverage of universities. The authors provide an interactive online tool to generate such Venn diagrams for other topic areas and language combinations.

This is the topic of a paper in the journal Informatics[2]. From the English abstract: "Our research has showed that in language sensitive topics, the quality of information can be relatively better in the relevant language versions. However, in most cases, it is difficult for the Wikipedia readers to determine the language affiliation of the described subject. Additionally, each language edition of Wikipedia can have own rules in the manual assessing of the content’s quality. There are also differences in grading schemes between language versions: some use a 6–8 grade system to assess articles, and some are limited to 2–3. This makes automatic quality comparison of articles between various languages a challenging task, particularly if we take into account a large number of unassessed articles; some of the Wikipedia language editions have over 99% of articles without a quality grade. The paper presents the results of a relative quality and popularity assessment of over 28 million articles in 44 selected language versions. Comparative analysis of the quality and the popularity of articles in popular topics was also conducted. Additionally, the correlation between quality and popularity of Wikipedia articles of selected topics in various languages was investigated. The proposed method allows us to find articles with information of better quality that can be used to automatically enrich other language editions of Wikipedia."

Regarding the quality metrics, I salute the coverage in terms of languages, which makes it possible to go beyond the "official" automated evaluation provided by the Wikimedia Foundation (ORES), which is only available on some of the larger language projects. As the authors explain, this part is mostly based on previously published work, which it substantially extends. The paper also proposes some solutions for comparing quality across languages, and takes into account the differences in perspective between cultures.

It also opens a discussion about the popularity of articles, and how popularity can help decide which language version should serve as the master version when an article exists in several languages. Although this part is only a beginning, the discussion points to the next step for the authors' work.

From the paper: Distribution of various article metrics by quality class on English Wikipedia

"Wikipedia: An opportunity to rethink the links between sources' credibility, trust, and authority"

Reviewed by FULBERT

This theoretical paper[3] explored ambiguous relationships between credibility, trust, and authority in library and information sciences and how they are related to perceived accuracy in information sources. Credibility is linked to trust, necessary when we seek to learn from or convey information between people. This is complicated when the authority of a source is considered, as personal or institutional levels of expertise increase the ability to speak with greater credibility.

The literature about how this works with knowledge and information on the Web is inconsistent, and as a result this work sought to develop a unified approach through a new model. Because credibility, trust, and authority are distinct concepts that are often used together inconsistently, the authors explored them through the way Wikipedia is used and perceived. Although Wikipedia is considered highly accurate, trust in it is average and its credibility is at times suspect.

Sahut and Tricot developed the authority, trust and credibility (ATC) model, where “knowledge institutions confer authority to a source, this authority ensures trust, which ensures the credibility of the information.” As a result, “the credibility of the information builds trust, which builds the authority of the source.” This model can be useful when applied to the citation of sources in Wikipedia, as it helps explain how the practice of providing citations in Wikipedia increases credibility and thus encourages trust, “linking content to existing knowledge sources and institutions.”

The ATC model is a helpful framework for explaining how Wikipedia, with its enormous readership, continues to struggle to be perceived as an authority because of inconsistencies in its article citations and references. The model implies that filling these gaps will increase the authority, and thus the reputation, of Wikipedia itself.

Figure 2 from the paper, on Wikipedia authority, trust and credibility. ("The educational institution can spread a bad reputation on Wikipedia, which decreases its authority, has a negative influence on its trust, which negatively influences the credibility of the information. Conversely, a positive experience of credibility of Wikipedia information increases readers’ trust.")

Conferences and events

Academia and Wikipedia: Critical Perspectives in Education and Research

A call for papers has been published for a conference titled "Academia and Wikipedia: Critical Perspectives in Education and Research", to be held on June 18, 2018, at Maynooth University in the Republic of Ireland. The organizers describe it as "a one-day conference that aims to investigate how researchers and educators use and interrogate Wikipedia. The conference is an opportunity to present research into and from Wikipedia; research about Wikipedia, or research that uses Wikipedia as a data object".

Wiki Workshop 2018

The fifth edition of Wiki Workshop will take place in Lyon, France on April 24, 2018, as part of The Web Conference 2018. Wiki Workshop brings together researchers exploring all aspects of Wikimedia websites, such as Wikipedia, Wikidata, and Wikimedia Commons. The call for papers is now available. The submission deadline for papers to appear in the proceedings of the conference is January 28; for all other papers, it is March 11.

See the research events page on Meta-wiki for other upcoming conferences and events, including submission deadlines.

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

Compiled by Tilman Bayer

OpenSym 2017

Illustration from "Interpolating quality dynamics in Wikipedia and demonstrating the Keilana effect"

See also our earlier coverage of another OpenSym 2017 paper: "Improved article quality predictions with deep learning"

OpenSym 2016

Plot describing the change, from October 2014 to January 2016, in the absolute number of female biography articles (horizontal axis) and their share among all biographies (vertical axis), for various Wikipedia languages (appearing in similar form in the "Monitoring the Gender Gap ..." paper)

See also our earlier coverage of another OpenSym 2016 paper: "Making it easier to navigate within article networks via better wikilinks"

Diverse other papers, relating to structured data

Figure from "Scholia and scientometrics with Wikidata" (screenshot of https://tools.wmflabs.org/scholia/author/Q20980928 )

References

  1. ^ Antonini, Sébastien (2017-06-22). "Étude de la véracité des articles médicaux sur Wikipédia". Aix Marseille Université.
  2. ^ Lewoniewski, Włodzimierz; Węcel, Krzysztof; Abramowicz, Witold (2017-06-22). "Relative Quality and Popularity Evaluation of Multilingual Wikipedia". Informatics. 4 (4): 43. doi:10.3390/informatics4040043.
  3. ^ Sahut, Gilles; Tricot, André (2017-10-31). "Wikipedia: An opportunity to rethink the links between sources' credibility, trust, and authority" (PDF). First Monday. 22 (11). doi:10.5210/fm.v22i11.7108. ISSN 1396-0466.
  4. ^ Piscopo, Alessandro; Vougiouklis, Pavlos; Kaffee, Lucie-Aimée; Phethean, Christopher; Hare, Jonathon; Simperl, Elena (2017). What do Wikidata and Wikipedia have in common?: An analysis of their use of external references (PDF). OpenSym '17. New York, NY, USA: ACM. pp. 1:1–1:10. doi:10.1145/3125433.3125445. ISBN 9781450351874.
  5. ^ Kaffee, Lucie-Aimée; Piscopo, Alessandro; Vougiouklis, Pavlos; Simperl, Elena; Carr, Leslie; Pintscher, Lydia (2017). A glimpse into Babel: An analysis of multilinguality in Wikidata (PDF). OpenSym '17. New York, NY, USA: ACM. pp. 14:1–14:5. doi:10.1145/3125433.3125465. ISBN 9781450351874.
  6. ^ Lanamäki, Arto; Lindman, Juho (2017). Before the sense of 'we': Identity work as a bridge from mass collaboration to group emergence (PDF). OpenSym '17. New York, NY, USA: ACM. pp. 5:1–5:9. doi:10.1145/3125433.3125451. ISBN 9781450351874.
  7. ^ Halfaker, Aaron (2017). "Interpolating quality dynamics in Wikipedia and demonstrating the Keilana effect" (PDF). Proceedings of the 13th International Symposium on Open Collaboration. OpenSym '17. New York, NY, USA: ACM. pp. 19:1–19:9. doi:10.1145/3125433.3125475. ISBN 9781450351874.
  8. ^ Betancourt, Grace Gimon; Segnine, Armando; Trabuco, Carlos; Rezgui, Amira; Jullien, Nicolas (2016). "Mining team characteristics to predict Wikipedia article quality". Proceedings of the 12th International Symposium on Open Collaboration. OpenSym '16. New York, NY, USA: ACM. pp. 15:1–15:9. doi:10.1145/2957792.2971802. ISBN 9781450344517.
  9. ^ Agrawal, Rakshit; de Alfaro, Luca (2016). "Predicting the quality of user contributions via LSTMs" (PDF). Proceedings of the 12th International Symposium on Open Collaboration. OpenSym '16. New York, NY, USA: ACM. pp. 19:1–19:10. doi:10.1145/2957792.2957811. ISBN 9781450344517.
  10. ^ Klein, Maximilian; Konieczny, Piotr; Zhu, Haiyi; Rai, Vivek; Gupta, Harsh (2016). Monitoring the gender gap with Wikidata human gender indicators (PDF). OpenSym 2016. Berlin, Germany. p. 9.
  11. ^ Zangerle, Eva; Gassler, Wolfgang; Pichl, Martin; Steinhauser, Stefan; Specht, Günther (2016). "An empirical evaluation of property recommender systems for Wikidata and collaborative knowledge bases" (PDF). Proceedings of the 12th International Symposium on Open Collaboration. OpenSym '16. New York, NY, USA: ACM. pp. 18:1–18:8. doi:10.1145/2957792.2957804. ISBN 9781450344517.
  12. ^ Tamime, Reham Al; Hall, Wendy; Giordano, Richard (2016). Medical science in Wikipedia: The construction of scientific knowledge in open science projects (PDF). OpenSym '16. New York, NY, USA: ACM. pp. 4:1–4:4. doi:10.1145/2962132.2962141. ISBN 9781450344814. (extended abstract)
  13. ^ Silbernagl, Doris; Krismer, Nikolaus; Specht, Günther (2016). "Comparing OSM area-boundary data to DBpedia" (PDF). Proceedings of the 12th International Symposium on Open Collaboration. OpenSym '16. New York, NY, USA: ACM. pp. 11:1–11:4. doi:10.1145/2957792.2957806. ISBN 9781450344517.
  14. ^ Nielsen, Finn Årup; Mietchen, Daniel; Willighagen, Egon (2017-05-28). "Scholia, Scientometrics and Wikidata". The Semantic Web: ESWC 2017 Satellite Events. European Semantic Web Conference. Lecture Notes in Computer Science. Vol. 10577. Springer, Cham. pp. 237–259. doi:10.1007/978-3-319-70407-4_36. ISBN 9783319704067.
  15. ^ Andra Waagmeester, Egon Willighagen, Núria Queralt Rosinach, Elvira Mitraka, Sebastian Burgstaller-Muehlbacher, Tim E. Putman, Julia Turner, Lynn M Schriml, Paul Pavlidis, Andrew I Su, and Benjamin M Good: Linking Wikidata to the rest of the Semantic Web. Proceedings of the 9th International Conference Semantic Web Applications and Tools for Life Sciences. Amsterdam, The Netherlands, December 5-8, 2016. (conference poster)
  16. ^ Subercaze, Julien (May 2017). Chaudron: Extending DBpedia with measurement. Portoroz, Slovenia: Eva Blomqvist, Diana Maynard, Aldo Gangemi.
  17. ^ Font, Ludovic; Zouaq, Amal; Gagnon, Michel: Assessing and Improving Domain Knowledge Representation in DBpedia
  18. ^ Agathos, Michail; Kalogeros, Eleftherios; Kapidakis, Sarantos (2016-09-05). "A Case Study of Summarizing and Normalizing the Properties of DBpedia Building Instances". In Norbert Fuhr; László Kovács; Thomas Risse; Wolfgang Nejdl (eds.). Research and Advanced Technology for Digital Libraries. Lecture Notes in Computer Science. Springer International Publishing. pp. 398–404. doi:10.1007/978-3-319-43997-6_33. ISBN 9783319439969.
  19. ^ Kejriwal, Mayank; Miranker, Daniel P. (2016). "Experience: Type alignment on DBpedia and Freebase". p. 10. arXiv:1608.04442 [cs.DB].
  20. ^ Bhargava, Preeti; Spasojevic, Nemanja; Hu, Guoning (2017-03-13). "High-Throughput and Language-Agnostic Entity Disambiguation and Linking on User Generated Data". arXiv:1703.04498 [cs.IR].
  21. ^ Prasojo, Radityo Eko; Darari, Fariz; Razniewski, Simon; Nutt, Werner. Managing and Consuming Completeness Information for Wikidata Using COOL-WD (PDF). KRDB, Free University of Bozen-Bolzano, 39100, Italy.
  22. ^ Hernández, Daniel; Hogan, Aidan; Riveros, Cristian; Rojas, Carlos; Zerega, Enzo (2016-10-17). "Querying Wikidata: Comparing SPARQL, Relational and Graph Databases". The Semantic Web – ISWC 2016. International Semantic Web Conference. Lecture Notes in Computer Science. Vol. 9982. Springer, Cham. pp. 88–103. doi:10.1007/978-3-319-46547-0_10. ISBN 9783319465463. (author's preprint)
  23. ^ Hernández, Daniel; Hogan, Aidan; Krötzsch, Markus (2015). Reifying RDF: What Works Well With Wikidata?. Proceedings of the 11th International Workshop on Scalable Semantic Web Knowledge Base Systems. Vol. 1457 of CEUR Workshop Proceedings. pp. 32–47.

Discuss this story

I interpreted the slope of the arrows as showing the degree of concentration on adding female biographies. A high slope (jawiki) means that the added biographies concentrate more on females than the ratio on that wiki beforehand. A flat line would mean the added biographies are of the same ratio as existed beforehand. A downward slope (zhwiki) would mean the added biographies concentrated on men more than was already present on that wiki. Jujutacular (talk) 16:28, 18 December 2017 (UTC)
Yes, I think you're right. So we can match the slope with the rightward placement to get an idea of the composite impact, I guess. Tony (talk) 05:19, 19 December 2017 (UTC)



       

The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0