A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
In several Wikipedia-based systems and scientific analyses, researchers have assumed that no two Wikipedia articles represent the same concept, i.e. that each article is a semantically closed description of a specific item, for example "New York City". In a paper published at CSCW'17,[1] however, Lin et al. showed that this “article-as-concept” assumption does not in fact hold: the above-mentioned article about "New York City" has a separate sub-article, "History of New York City", which describes a topic very closely related to “New York City” and could just as easily be merged into the original article. This practice of splitting lengthy articles into several smaller ones (see "summary style" and, more specifically, "article size") may improve readability for human users, but seriously impairs many studies based on the “article-as-concept” assumption. Using a simple classification approach with features based on both the link structure and semantic aspects of the title and the context, the authors found that 70.8% of the 1,000 most-visited pages have been split into a main article and sub-articles, with an average of 7.5 sub-articles per article. They conclude that the existence of sub-articles is not the exception, but the rule.
A drawback of the proposed sub-article relationship detection method, as stated in the paper, is that it is trained only on explicitly encoded sub-article relationships; it remains unclear how to detect implicit relationships, i.e. those where no editor has linked the sub-article with the main article. Still, this is a first step toward a deeper analysis of the Wikipedia page network that could make it both more readable for humans and more easily exploitable by algorithms.
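To make the idea of combining link-based and title-based features more concrete, here is a minimal Python sketch. It is not the feature set or classifier from the paper: the function names, the three features, and the hand-written decision rule with its 0.5 threshold are all assumptions chosen for illustration, standing in for a classifier trained on labelled article pairs.

```python
import re


def tokens(title):
    """Lowercased word tokens of an article title, as a set."""
    return set(re.findall(r"\w+", title.lower()))


def pair_features(parent, candidate, links_from_parent):
    """Illustrative features for a (parent, candidate) article pair.

    These are hypothetical stand-ins for the kinds of link-structure and
    title-based signals described in the text, not the paper's features.
    """
    p, c = tokens(parent), tokens(candidate)
    union = p | c
    return {
        # Title signal: the parent title appears inside the candidate title.
        "contains_parent_title": parent.lower() in candidate.lower(),
        # Title signal: Jaccard overlap of the title tokens.
        "token_overlap": len(p & c) / len(union) if union else 0.0,
        # Link signal: the parent article links to the candidate.
        "linked_from_parent": candidate in links_from_parent,
    }


def looks_like_subarticle(features, min_overlap=0.5):
    """Hand-written rule (an assumption) in place of a trained classifier."""
    title_match = (
        features["contains_parent_title"]
        or features["token_overlap"] >= min_overlap
    )
    return features["linked_from_parent"] and title_match


# Toy link data for the example pair; real input would come from a dump or API.
links = {"History of New York City", "Boston"}
history = pair_features("New York City", "History of New York City", links)
boston = pair_features("New York City", "Boston", links)
```

In this toy example the "History of New York City" pair passes the rule, while "Boston" is linked from the parent but fails the title check. Because the rule requires an explicit link, the sketch shares the limitation noted above: implicit relationships, where no editor has linked the two articles, go undetected.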
A survey among 1,354 German academic researchers about their professional use of social media found Wikipedia to be the most widely used site as of 2015, with 84.7% using it.[2] Among German internet users in general, 79% use Wikipedia. Only 2% of these Wikipedia readers think it is "never reliable", and 80% hold that it is "mostly" ("größtenteils") reliable.[3] A report by the German Monopolkommission (which advises the government on antitrust matters) on potential monopoly problems in the Internet search engine market highlighted Wikipedia as by far the most Google-dependent of the top 10 websites in Germany, with around 80% of its traffic coming from Google (according to third-party data from SimilarWeb that is not quite consistent with the Wikimedia Foundation's own data).[4]
In France, surveys by the Institut national de la statistique et des études économiques (INSEE) found that from 2011 to 2013, the share of people who use the internet to consult Wikipedia ("or any other collaborative online encyclopedia") rose from 39% to 51%. Wikipedia usage was higher among younger internet users and among those with degrees: 82% among 16-24 year olds, 54% among 25-54 year olds, and only 31% among 55-74 year olds.[5] The corresponding Eurostat data gave 45% for the entire European Union as of 2015.[6]
In contrast, Ofcom found that only 2-4% of UK 12-15 year olds used Wikipedia as their first stop for information as of 2015.[7]
In the meantime, a 2016 Knight Foundation report, based on a study by Nielsen, found that "Among mobile sites [in the US], Wikipedia reigns in terms of popularity (the app does well too) and amount of time users spend on the entity. Wikipedia’s site reaches almost one-third of the total mobile population each month".[8]
See the research events page on Meta-wiki for upcoming conferences and events, including submission deadlines.
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions are always welcome for reviewing or summarizing newly published research.