The Signpost

Recent research

Wehrmacht on Wikipedia, neural networks writing biographies

By Tilman Bayer and Bri

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

The Battle for Wikipedia

[Image: color photograph of protesters in late 20th-century clothing holding signs showing a soldier in Wehrmacht uniform]
Does Wikipedia lack true historical consensus on the actions of the Wehrmacht? Is our story skewed towards apologetic historiography like Lost Victories and the protesters against Wehrmachtsausstellung, shown here?
Reviewed by Bri

The "clean Wehrmacht" battle covered in the past three issues of The Signpost (May, June, July) is reviewed from a historian's perspective in The Journal of Slavic Military Studies.[1] The title of the paper is an allusion to Lost Victories, today generally accepted as an unreliable and apologetic account of the actions of German forces during World War II. The author, David Stahel, who states that he is not a Wikipedia editor, examines the behind-the-scenes mechanisms and debates that result in article content, with the observation that these debates are not consistent with "consensus among serious historians" and "many people (and in my experience students) invest [Wikipedia] with a degree of objectivity and trust that, at least on topics related to the Wehrmacht, can at times be grossly misplaced...articles on the Wehrmacht (in English Wikipedia) might struggle to meet [the standard]". The author describes questionable arguments raised by several of the pro-Wehrmacht editors and concludes their writing "may in some instances reflect extremist views or romantic notions not grounded in the historiography".

Readers prefer summaries written by a neural network over those by Wikipedians 40% of the time – but it still suffers from hallucinations

Reviewed by Tilman Bayer

Several recent publications tackle the problem of taking machine-readable factual statements about a notable person, such as their date of birth from the Wikidata item about them, and creating a biographical summary in natural language.

A paper[2] by three researchers from Australia reports on using an artificial intelligence approach for "the generation of one-sentence Wikipedia biographies from facts derived from Wikidata slot-value pairs". These are modeled after the first sentences of biographical Wikipedia articles, which, the authors argue, are of particular value because they form "clear and concise biographical summaries". The task of generating them involves making decisions about which of the facts to include (e.g. the date of birth or a political party that the subject is a member of), and arranging them into a natural language sentence. To achieve this in an automated fashion, the authors trained a recurrent neural network (RNN) implemented in TensorFlow on a corpus of several hundred thousand introductory sentences extracted from English Wikipedia articles about humans, together with the corresponding Wikidata entries. (Although not mentioned in the paper, such first sentences are the subject of a community guideline on the English Wikipedia, at least some aspects of which one might expect the neural network to reconstruct from the corpus.)
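To make the setup more concrete, here is a minimal sketch (written for this review, not taken from the paper) of how Wikidata-style slot-value pairs might be linearized into the flat token sequence that such an encoder-decoder model consumes; the slot names and values are illustrative only.

```python
# Illustrative sketch: linearize Wikidata-style slot-value pairs into a flat
# token sequence, the kind of input a sequence-to-sequence model is trained to
# map to an article's first sentence. Slots and values are made up for the
# example and are not taken from the paper's data.

def linearize_facts(facts):
    """Turn an ordered list of (slot, value) pairs into input tokens."""
    tokens = []
    for slot, value in facts:
        # Mark each slot with a delimiter token so the encoder can tell
        # which value tokens belong to which property.
        tokens.append("<" + slot.lower().replace(" ", "_") + ">")
        tokens.extend(value.lower().split())
    tokens.append("<end_of_facts>")
    return tokens

facts = [
    ("TITLE", "Robert Charles Cortner"),
    ("SEX OR GENDER", "male"),
    ("DATE OF BIRTH", "16 April 1927"),
    ("DATE OF DEATH", "19 May 1959"),
    ("OCCUPATION", "racing driver"),
    ("COUNTRY OF CITIZENSHIP", "United States of America"),
]

print(" ".join(linearize_facts(facts)))
# <title> robert charles cortner <sex_or_gender> male <date_of_birth> 16 april 1927 ...
```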

An example of the algorithm's output compared to the Wikipedia original (excerpted from Table 5 in the paper):

Wikipedia original: "robert charles cortner ( april 16 , 1927 – may 19 , 1959 ) was an american automobile racing driver from redlands , california ."
Algorithm variant "S2S": "bob cortner ( april 16 , 1927 – may 19 , 2005 ) was an american professional boxer."
Algorithm variant "S2S+AE": "robert cortner ( april 16 , 1927 – may 19 , 1959 ) was an american race-car driver ."

The quality of the algorithm's output (in several variants) was evaluated against the actual human-written sentences from Wikipedia (as the "gold standard") with a standard automated metric (BLEU), but also by human readers recruited from CrowdFlower. This "human preference evaluation suggests the model is nearly as good as the Wikipedia reference", with the consensus of the human raters even preferring the neural network's version 40% of the time. However, those variants of the algorithm that are allowed to infer facts not directly stated in the Wikidata item can suffer from the problem of AI "hallucinations": in the "S2S" output above, for example, the claims that Bob Cortner was a boxer instead of a race-car driver, and that he died in 2005 instead of 1959.
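For readers curious about the automated part of such an evaluation, the sketch below scores a generated sentence against its Wikipedia reference with sentence-level BLEU via NLTK, using the example sentences from the table above; this is an illustration written for this review, not the evaluation script used in the paper.

```python
# Sentence-level BLEU comparison of a generated biography sentence against the
# Wikipedia reference (illustrative only; not the paper's evaluation code).
# Requires: pip install nltk
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ("robert charles cortner ( april 16 , 1927 – may 19 , 1959 ) was "
             "an american automobile racing driver from redlands , california .").split()
candidate = ("robert cortner ( april 16 , 1927 – may 19 , 1959 ) was "
             "an american race-car driver .").split()

# Smoothing avoids a zero score when some higher-order n-grams never match.
smooth = SmoothingFunction().method1
score = sentence_bleu([reference], candidate, smoothing_function=smooth)
print(f"BLEU: {score:.3f}")
```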

Apart from describing and evaluating the algorithm, the paper also provides some results about Wikipedia itself, e.g. showing which biographical facts are most frequently used by Wikipedia editors. Table 1 from the paper lists "the top fifteen slots across entities used for input, and the % of time the value is a substring in the entity’s first sentence" in the examined corpus:

Fact Count %
TITLE (name) 1,011,682 98
SEX OR GENDER 1,007,575 0
DATE OF BIRTH 817,942 88
OCCUPATION 720,080 67
CITIZENSHIP 663,707 52
DATE OF DEATH 346,168 86
PLACE OF BIRTH 298,374 25
EDUCATED AT 141,334 32
SPORTS TEAM 108,222 29
PLACE OF DEATH 107,188 17
POSITION HELD 87,656 75
PARTICIPANT OF 77,795 23
POLITICAL PARTY 74,371 49
AWARD RECEIVED 67,930 44
SPORT 36,950 72
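As a rough illustration of how the percentage column above can be computed (the share of entities whose slot value occurs verbatim in the first sentence), here is a minimal sketch on an invented toy corpus; the paper's actual corpus processing is more involved (tokenization, date formats, name variants).

```python
# Sketch of the substring statistic in the table above: for each slot, the
# share of entities whose value appears verbatim in the entity's first
# sentence. The two-entity corpus is invented for illustration.
from collections import defaultdict

corpus = [
    {"first_sentence": "robert cortner was an american racing driver .",
     "facts": {"OCCUPATION": "racing driver", "CITIZENSHIP": "american"}},
    {"first_sentence": "jane doe is a swedish physicist .",
     "facts": {"OCCUPATION": "physicist", "CITIZENSHIP": "sweden"}},
]

counts = defaultdict(int)   # entities that provide the slot at all
matches = defaultdict(int)  # entities whose value shows up in the sentence

for entity in corpus:
    sentence = entity["first_sentence"].lower()
    for slot, value in entity["facts"].items():
        counts[slot] += 1
        if value.lower() in sentence:
            matches[slot] += 1

for slot in counts:
    print(f"{slot}: {100 * matches[slot] / counts[slot]:.0f}%")
# OCCUPATION: 100%, CITIZENSHIP: 50% ("sweden" is not a substring of "swedish")
```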

The paper's literature review mentions a 2016 paper titled "Neural Text Generation from Structured Data with Application to the Biography Domain"[3] as "the closest work to ours with a similar task using Wikipedia infoboxes in place of Wikidata. They condition an attentional neural language model (NLM) on local and global properties of infobox tables [...] They use 723k sentences from Wikipedia articles with 403k lower-cased words mapping to 1,740 distinct facts".

While the authors of both papers commendably make at least some of their code and data available on GitHub (1, 2), they do not seem to have aimed to make their algorithms into a tool for generating text for use in Wikipedia itself – perhaps wisely so, as previous efforts in this direction have met with community opposition due to quality concerns (e.g. in the case of a paper we covered previously here: "Bot detects theatre play scripts on the web and writes Wikipedia articles about them").

In the third, most recent research effort, covered in several publications,[4][5][6] another group of researchers likewise developed a method to automatically generate summaries of Wikipedia article topics via a neural network, based on structured data from Wikidata (and, in one variant, DBpedia).

They worked directly with community members from two small Wikipedias (Arabic and Esperanto) to evaluate "not only the quality of the generated text, but also the usefulness of our end-system to any underserved Wikipedia version", when extending the existing ArticlePlaceholder feature that is in use on some of these smaller Wikipedias. The result was "that members of the targeted language communities rank our text close to the expected quality standards of Wikipedia, and are likely to consider the generated text as part of Wikipedia. Lastly, we found that the editors are likely to reuse a large portion of the generated summaries [when writing actual Wikipedia articles], thus emphasizing the usefulness of our approach to its intended audience."

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.

Compiled by Tilman Bayer

The Wikipedia Adventure: Beloved but ineffective

[Image: a vaguely anthropomorphic cartoon character]
"The Wikipedia Adventure: Field Evaluation of an Interactive Tutorial for New Users"[7]

From the accompanying blog post: "The system was a gamified tutorial for new Wikipedia editors. Working with the tutorial creators, we conducted both a survey of its users and a randomized field experiment testing its effectiveness in encouraging subsequent contributions. We found that although users loved it, it did not affect subsequent participation rates."

See also: research project page on Meta-wiki, podcast interview, podcast coverage, Wikimedia Research Showcase presentation

Told you so: Hindsight bias in Wikipedia articles about events

Two papers by the same team of researchers explore this topic for Wikipedia editors and readers, respectively:

"Biases in the production and reception of collective knowledge: the case of hindsight bias in Wikipedia"[8]

From the paper:

Study 1: This study investigated whether events in Wikipedia articles are represented as more likely in retrospect. For a total of 33 events, we retrieved article versions from the German Wikipedia that existed prior to the event (foresight) or after the event had happened (hindsight) and assessed indicators of hindsight bias in those articles [...] we determined the number of words of the categories "cause" (containing words such as "hence"), "certainty" (e.g., "always"), tentativeness (e.g., "maybe"), "insight" (e.g., "consider"), and "discrepancy" (e.g., "should"), because the hindsight perspective is assumed to be the result of successful causal modeling [...] There was an increase in the proportion of hindsight-related words across article versions. [...] We investigated whether there is evidence for hindsight distortions in Wikipedia articles or whether Wikipedia’s guidelines effectively prevent hindsight bias to occur. Our study provides empirical evidence for both.
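The word-category counting used as a hindsight indicator in Study 1 is a form of dictionary-based text analysis; the sketch below shows the general idea with invented word lists (it is not the instrument used in the study).

```python
# Dictionary-based word counting of the kind used as a hindsight indicator
# (category word lists below are invented examples, not the study's instrument).
import re

categories = {
    "cause":         {"hence", "because", "therefore"},
    "certainty":     {"always", "certainly", "inevitably"},
    "tentativeness": {"maybe", "perhaps", "possibly"},
}

def category_proportions(text):
    """Return each category's share of all words in the text."""
    words = re.findall(r"[a-zäöüß]+", text.lower())
    total = len(words) or 1
    return {name: sum(word in wordlist for word in words) / total
            for name, wordlist in categories.items()}

foresight = "maybe the reactor will possibly remain safe"
hindsight = "hence , critics argue , the disaster was always going to happen"

print(category_proportions(foresight))   # tentativeness words dominate
print(category_proportions(hindsight))   # cause and certainty words appear
```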

"Cultural Interpretations of Global Information? Hindsight Bias after Reading Wikipedia Articles across Cultures"[9]

From the abstract: "We report two studies with Wikipedia articles and samples from different cultures (Study 1: Germany, Singapore, USA, Vietnam, Japan, Sweden, N = 446; Study 2: USA, Vietnam, N = 144). Participants read one of two article versions (foresight and hindsight) about the Fukushima Nuclear Plant and estimated the likelihood, inevitability, and foreseeability of the nuclear disaster. Reading the hindsight article increased individuals' hindsight bias independently of analytic or holistic thinking style. "

"WikiPassageQA: A Benchmark Collection for Research on Non-factoid Answer Passage Retrieval"

From the abstract:[10] "...we introduce a new Wikipedia based collection specific for non-factoid answer passage retrieval containing thousands of questions with annotated answers and show benchmark results on a variety of state of the art neural architectures and retrieval models."

"Analysis of Wikipedia-based Corpora for Question Answering"

From the abstract:[11] "This paper gives comprehensive analyses of corpora based on Wikipedia for several tasks in question answering. Four recent corpora are collected, WikiQA, SelQA, SQuAD, and InfoQA, and first analyzed intrinsically by contextual similarities, question types, and answer categories. These corpora are then analyzed extrinsically by three question answering tasks, answer retrieval, selection, and triggering."

"Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia"

From the abstract:[12] "We study the task of generating from Wikipedia articles question-answer pairs that cover content beyond a single sentence. We propose a neural network approach that incorporates coreference knowledge via a novel gating mechanism. [...] We apply our system [...] to the 10,000 top-ranking Wikipedia articles and create a corpus of over one million question-answer pairs."

Asking Wikidata questions in natural language

From the abstract:[13] "We first introduce a new approach for translating natural language questions to SPARQL queries. It is able to query several KBs [knowledge bases] simultaneously, in different languages, and can easily be ported to other KBs and languages. In our evaluation, the impact of our approach is proven using 5 different well-known and large KBs: Wikidata, DBpedia, MusicBrainz, DBLP and Freebase as well as 5 different languages namely English, German, French, Italian and Spanish." Online demo: https://wdaqua-frontend.univ-st-etienne.fr/
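For illustration, here is the kind of SPARQL that such a question answering system might produce for a question like "Who are the children of Barack Obama?", run against the public Wikidata Query Service; the query below is hand-written for this review, not output of the authors' system.

```python
# Hand-written example of the kind of SPARQL a natural-language question might
# be translated into, sent to the public Wikidata Query Service.
# Requires: pip install requests
import requests

query = """
SELECT ?childLabel WHERE {
  wd:Q76 wdt:P40 ?child .   # Barack Obama (Q76), property "child" (P40)
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
"""

response = requests.get(
    "https://query.wikidata.org/sparql",
    params={"query": query, "format": "json"},
    headers={"User-Agent": "research-newsletter-example/0.1"},
    timeout=30,
)
for row in response.json()["results"]["bindings"]:
    print(row["childLabel"]["value"])
```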

References

  1. ^ Stahel, David (18 July 2018). "The Battle for Wikipedia: The New Age of 'Lost Victories'?" (PDF). Historical. The Journal of Slavic Military Studies. 31 (3). Routledge: 396–402. doi:10.1080/13518046.2018.1487198. eISSN 1556-3006. ISSN 1351-8046. OCLC 7781539362. Wikidata 55972890. Retrieved 28 August 2018.
  2. ^ Chisholm, Andrew; Radford, Will; Hachey, Ben (3–7 April 2017). "Learning to generate one-sentence biographies from Wikidata" (PDF). In Lapata, Mirella; Blunsom, Phil; Koller, Alexander (eds.). Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers. 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017). Valencia, Spain: Association for Computational Linguistics. pp. 633–642. arXiv:1702.06235v1. doi:10.18653/v1/E17-1060. ISBN 978-1-945626-34-0. ACL Anthology E17-1060. Wikidata 28819478. Archived from the original on 29 August 2018. Retrieved 28 August 2018.
  3. ^ Lebret, Rémi; Grangier, David; Auli, Michael (1–5 November 2016). "Neural Text Generation from Structured Data with Application to the Biography Domain" (PDF). In Su, Jian; Duh, Kevin; Carreras, Xavier (eds.). Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing (EMNLP 2016). Austin, Texas: Association for Computational Linguistics. pp. 1203–1213. doi:10.18653/v1/D16-1128. ISBN 978-1-945626-25-8. ACL Anthology D16-1128. Archived (PDF) from the original on 29 August 2018. Retrieved 29 August 2018.
  4. ^ Kaffee, Lucie-Aimée; Elsahar, Hady; Vougiouklis, Pavlos; Gravier, Christophe; Laforest, Frédérique; Hare, Jonathon; Simperl, Elena (14 February 2018). "Mind the (Language) Gap: Generation of Multilingual Wikipedia Summaries from Wikidata for ArticlePlaceholders". In Gangemi, Aldo; Navigli, Roberto; Vidal, María-Esther; Hitzler, Pascal; Troncy, Raphaël; Hollink, Laura; Alam, Mehwish (eds.). The Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, June 3–7, 2018, Proceedings. Extended Semantic Web Conference (ESWC 2018) (Preprint). Lecture Notes in Computer Science. Vol. 11 (Online ed.). Cham, Switzerland: Springer Science+Business Media (published 3 June 2018). pp. 319–334. doi:10.1007/978-3-319-93417-4_21. eISSN 1611-3349. ISBN 978-3-319-93417-4. ISSN 0302-9743. LCCN 2018946633. OCLC 7667759818. LNCS 10843. Wikidata 50290303. Archived from the original on 29 August 2018. Retrieved 29 August 2018 – via Silvio Peroni.
  5. ^ Kaffee, Lucie-Aimée; Elsahar, Hady; Vougiouklis, Pavlos; Gravier, Christophe; Laforest, Frédérique; Hare, Jonathon; Simperl, Elena (1–6 June 2018). "Learning to Generate Wikipedia Summaries for Underserved Languages from Wikidata". In Walker, Marilyn; Ji, Heng; Stent, Amanda (eds.). Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers). 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL HLT 2018). New Orleans, Louisiana: Association for Computational Linguistics. pp. 640–645. arXiv:1803.07116v2. doi:10.18653/v1/N18-2101. ISBN 978-1-948087-29-2. OCLC 7667759818. ACL Anthology N18-2101. Wikidata 50827579. RG 323905026.
  6. ^ Vougiouklis, Pavlos; Elsahar, Hady; Kaffee, Lucie-Aimée; Gravier, Christophe; Laforest, Frédérique; Hare, Jonathon; Simperl, Elena (30 July 2018). "Neural Wikipedian: Generating Textual Summaries from Knowledge Base Triples" (PDF). Journal of Web Semantics. Elsevier. arXiv:1711.00155. doi:10.1016/j.websem.2018.07.002. eISSN 1873-7749. ISSN 1570-8268. OCLC 7794877956. Wikidata 45322945. Archived (PDF) from the original on 29 August 2018. Retrieved 29 August 2018.
  7. ^ Narayan, Sneha; Orlowitz, Jake; Morgan, Jonathan; Hill, Benjamin Mako; Shaw, Aaron (25 February – 1 March 2017). "The Wikipedia Adventure: Field Evaluation of an Interactive Tutorial for New Users" (PDF). Proceedings of the 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing. 2017 ACM Conference on Computer Supported Cooperative Work and Social Computing (CSCW '17). New York, NY: Association for Computing Machinery. pp. 1785–1799. doi:10.1145/2998181.2998307. ISBN 978-1-4503-4335-0. Wikidata 37816091. Archived (PDF) from the original on 29 August 2018. Retrieved 29 August 2018.
  8. ^ Oeberst, Aileen; Beck, Ina von der; Back, Mitja D.; Cress, Ulrike; Nestler, Steffen (17 April 2017). "Biases in the production and reception of collective knowledge: the case of hindsight bias in Wikipedia" (DOC). Psychological Research (Preprint). Berlin; Heidelberg: Springer Berlin Heidelberg: 1–17. doi:10.1007/s00426-017-0865-7. eISSN 1430-2772. ISSN 0340-0727. OCLC 7016703631. PMID 28417198. Wikidata 29647478. Archived from the original on 29 August 2018. Retrieved 29 August 2018 – via ResearchGate.
  9. ^ Beck, Ina von der; Oeberst, Aileen; Cress, Ulrike; Nestler, Steffen (22 May 2017). "Cultural Interpretations of Global Information? Hindsight Bias after Reading Wikipedia Articles across Cultures". Applied Cognitive Psychology. 31 (3). John Wiley & Sons: 315–325. doi:10.1002/acp.3329. eISSN 1099-0720. ISSN 0888-4080. OCLC 7065844160. Wikidata 30062753.
  10. ^ Cohen, Daniel; Yang, Liu; Croft, W. Bruce (8–12 July 2018). "WikiPassageQA: A Benchmark Collection for Research on Non-factoid Answer Passage Retrieval". SIGIR #41 Proceedings. 41st International ACM SIGIR Conference on Research & Development in Information Retrieval (SIGIR '18). New York, NY: Association for Computing Machinery (published 27 June 2018). pp. 1165–1168. arXiv:1805.03797v1. doi:10.1145/3209978.3210118. ISBN 978-1-4503-5657-2.
  11. ^ Jurczyk, Tomasz; Deshmane, Amit; Choi, Jinho D. (5 February 2018). "Analysis of Wikipedia-based Corpora for Question Answering". arXiv:1801.02073v2 [cs.CL].
  12. ^ Du, Xinya; Cardie, Claire (15–20 July 2018). "Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia" (PDF). In Miyao, Yusuke; Gurevych, Iryna (eds.). Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018). Melbourne, Australia: Association for Computational Linguistics. pp. 1907–1917. arXiv:1805.05942v1. doi:10.18653/v1/P18-1177. ISBN 978-1-948087-32-2. ACL Anthology P18-1177. Archived (PDF) from the original on 29 August 2018. Retrieved 29 August 2018.
  13. ^ Diefenbach, Dennis; Both, Andreas; Singh, Kamal; Maret, Pierre (17 June 2018). Polleres, Alex (ed.). "Towards a Question Answering System over the Semantic Web". Semantic Web – Interoperability, Usability, Applicability (Preprint). 0 (0 [1]). IOS Press. arXiv:1803.00832. eISSN 2210-4968. ISSN 1570-0844. Wikidata 50418915. Archived from the original on 29 August 2018. Retrieved 29 August 2018.

Discuss this story

  • Agree "pro-Wehrmacht" is uncalled for, and is not supported by materials presented in the case which made no such finding regarding any editor. Bri, would you consider removing this particular phrase, or at least indicating a direct attribution to the offwiki author so the term does not appear to be in Wikipedia's (or The Signpost's) voice? -- Euryalus (talk) 10:38, 1 September 2018 (UTC) Probably needless disclaimer - the request that this particular phraseology in the article be reconsidered is my personal opinion and not on behalf of Arbcom. -- Euryalus (talk) 11:20, 1 September 2018 (UTC)[reply]
  • The research doesn't lend itself well to succinct phrasing which is needed in this review. The nearest relevant phrase I can find to the passage in question is "too many of the editors are clearly keen to see articles that reflect their own cherished, or at least uncritical, view of the Wehrmacht and its collaborators." I think "pro-Wehrmacht" is a fair and accurate condensation of that thought. I think it will remain as written. ☆ Bri (talk) 13:16, 1 September 2018 (UTC)[reply]
  • Thanks for the reply. Stahel certainly accuses some editors of being "pro-Wehrmacht," without offering much evidence. Charitably, perhaps that's a consequence of the need for brevity in his piece. I suppose my question is whether the current Signpost wording implies that the Signpost accepts Stahel's accusation as fact, or simply notes that Stahel said it.-- Euryalus (talk) 13:41, 1 September 2018 (UTC)[reply]
  • I agree that this wording is unfair to the editors here. Stahel actually uses the term "pro-Wehrmacht", but only in relation to "the notoriously pro-Wehrmacht J.J. Fedorowicz Publishing" company, not Wikipedia editors. He uses much more convoluted wording to describe the Wikipedia editors he focuses on, but the gist of it is that he sees their views of the German military as being outdated rather than actually advocating on its behalf as the term "pro" implies (e.g. "The problem is as much about what is written as what is left out and sometimes what is removed by editors acting, consciously or unconsciously, to preserve the myth of a ‘clean Wehrmacht’"). More broadly, the term doesn't reflect the findings of the arbitration case, or my years of experience with the editors in question. It should be removed. Nick-D (talk) 08:17, 6 September 2018 (UTC)[reply]
  • Belatedly, agree with Nick-D. The issue is the article's use of "pro-Wehrmacht," in The Signpost's voice, to describe editors named in the referenced article. Stahel certainly implies that accusation, though he doesn't say it explicitly. The Arbcom case outcome goes nowhere near such a claim. Of course it's up to the editorial group on whether to keep or remove the sentence, though as a passing reader of the Signpost I'd urge removal. But if the editorial decision is to keep this sentence, perhaps it could be amended with words like "editors who Stahel implies are pro-Wehrmacht" to make absolutely clear it is merely reporting someone else's views. -- Euryalus (talk) 08:58, 6 September 2018 (UTC)[reply]

FWIW, I would think it should be worth mentioning that a passage Stahel complains about having been cut, at the bottom of page 398, cited his own work as a source? He may have less disinterested motives in writing this than it seems. Daniel Case (talk) 20:51, 2 September 2018 (UTC)[reply]

Moreover, it is largely surprising that, after being so vocal about historical context, there is not even the smallest allusion to the fact that large white-washing campaigns have been undertaken in the past, to allow the recycling of the defeated mass-murderers into murderers with our God on their side. MacArthur protecting Hirohito, Churchill protecting Kesselring, and so on were not isolated facts... but describing this situation as orchestrated by Wikipedia is too large a brush... and slightly anachronistic.
Stahel concludes by saying: "the best advice for students is not to risk Russian roulette on the Internet and instead seek peer-viewed literature from the library"... and just the next line, the same students are advised that "David Stahel has written four books about the Wehrmacht’s operations on the eastern front with Cambridge University Press". What a marvelous coincidence ! Pldx1 (talk) 08:44, 6 September 2018 (UTC)[reply]



       

The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0