The Signpost

Special report

Who reads which Wikipedia? The WMF's surprising stats

By Tony1
Fig. 1: Monthly views of en.WP per internet user. Majority native English-speaking countries are red, set against more than 40 other countries. Percentages of global en.WP page views are in parentheses.
Fig. 2: The Arab world shows astonishing variety in the use of the Arabic WP (blue) and those of the two colonial languages, English (red) and French (green). Other WPs and the "portal" category are grey. *Appears to be unreliable data. **The grey area is mostly to the Spanish WP.
Fig. 3: Taiwan since 2009. The Chinese WP (green) is aligned with the left-side y-axis; en.WP (red) with the right side. The two y-axes are differently scaled.
Fig. 4: Brazil has seen fluctuations between English (red) and Portuguese (purple).
Fig. 5: Switzerland has gone against the trend by moving away from German (blue), towards en.WP (red) and to a lesser extent French (green).

The Wikimedia Foundation has released its latest report card for the movement's hundreds of sites. The WMF has published statistics since 2009, but only recently have they been expanded in scope and depth to provide a rich source of data for investigating the movement and the world it serves. Erik Zachte, who is from the Netherlands, is the driver of the WMF's statistical output—assisted, he told the Signpost, by "a bunch of colleagues". He has been a Wikipedian since 2002 and the Foundation's data analyst since 2008. Erik writes in his understated way that the report card and accompanying traffic statistics comprise "enough tables, bar charts and plots to keep you busy for a while".

The news is good in terms of the Wikipedias' popularity: monthly page views for the 285 sites rose by a healthy 25% from March 2012 to March 2013, including a 74% rise in views from mobile devices. The Wikipedias are viewed nearly 22 billion times a month—more than 8000 hits a second—or an average of 36 hits a year for every single human, all the more extraordinary for the fact that only about one in four of us uses the internet.
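The per-second and per-person figures above are simple conversions of the monthly total, and readers may wish to check them. The following is a minimal sketch, not the WMF's own calculation: the monthly total is the rounded "nearly 22 billion" from the text, and the world population of about 7.1 billion is an assumption of this sketch that does not appear in the report.

```python
# Back-of-envelope check of the page-view figures quoted in the article.
MONTHLY_VIEWS = 22e9       # "nearly 22 billion" views a month (rounded, from the article)
WORLD_POPULATION = 7.1e9   # ASSUMPTION: rough 2013 world population, not from the report
SECONDS_PER_MONTH = 365.25 * 24 * 3600 / 12  # average length of a month in seconds


def views_per_second(monthly_views: float) -> float:
    """Convert a monthly page-view total into an average per-second rate."""
    return monthly_views / SECONDS_PER_MONTH


def yearly_views_per_person(monthly_views: float, population: float) -> float:
    """Spread a year's worth of page views evenly over a population."""
    return monthly_views * 12 / population


if __name__ == "__main__":
    print(f"views per second: {views_per_second(MONTHLY_VIEWS):,.0f}")
    print(
        "views per person per year: "
        f"{yearly_views_per_person(MONTHLY_VIEWS, WORLD_POPULATION):.1f}"
    )
```

With these rounded inputs the results land in the same ballpark as the article's "more than 8000 hits a second" and "36 hits a year"; the exact values depend on the precise monthly total and the population estimate used.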

This week, the Signpost gives a thumbnail sketch of some of the statistics concerning page views among the Wikipedias, with a focus on the relationship between the world's major languages—particularly the global role of the English Wikipedia. What we found raises far more questions than it answers, and indicates the extent of the opportunities for using the statistics to analyse both internal and external phenomena.

The English Wikipedia (en.WP) receives 47% of the page views (down from 53% in 2009), and remains dominant among WMF sites. The next most popular WPs are the Spanish and Japanese (at just over 7%), the Russian (nearly 6%), the German (5.4%), and the French (4.2%).

English Wikipedia more popular among many non-native speakers

Surprisingly, the average rate at which internet users view en.WP pages is higher in many countries than in the six major countries with a native English-speaking majority (the US, the UK, Canada, Australia, Ireland, and New Zealand—all red in Fig. 1). Among those six, en.WP is by far the most popular in Canada, with 16 views per month, and would be higher still if adjusted for the fact that more than one in five Canadians is a native speaker of French. The UK and Ireland came in next, with 13 views per month, followed by the US, Australia, and NZ on 11 per month.

Internet users in the global north also view en.WP an average of 11 times per month (roughly three-quarters of all views); Europe and North America each generate the same average; Oceania (Australia, NZ, and surrounding Pacific nations) generates 10; and the global south views en.WP six times a month (a quarter of all views).

The Arab world

The tangled consequences of European colonisation are evident in profound differences in WP usage among the two dozen modern nation states that have significant ties to Arabic (Fig. 2). At 79%, the Arabic WP page-view rate is highest in the small state of Comoros off the Tanzanian coast, against 11% for en.WP and 2.4% for the French WP. This turns out to be on the extreme end of Arab usage, with a steady fall to less than a quarter in some countries, in favour of the colonial languages: overall, the Arabic WP is still the minority choice, against the English WP and, in places that were French colonies, the French WP.

These inconsistencies suggest that WP choice is complex and multifactorial: the Signpost has been told that nothing is certain, but factors could include a combination of (i) the proportion of internet users who read English (or French); (ii) the perceived quality and/or scope of the Arabic WP versus that of the English (or French) WPs; and (iii) political, educational, or social pressure to use or avoid a certain WP. Each of these factors, if they did play a part, would probably be the result of a number of component factors. While countries that share other languages—such as in the Spanish-speaking world—also show internal differences in their rate of en.WP views, they are not nearly as pronounced as in the Arab world.

The dynamics over time—country by country since 2009

Aside from the six major English-speaking countries, the WP viewing patterns of almost every country focus almost entirely on two WPs (in a few cases three); English is usually the second most popular, with tiny percentages going to other WPs. Over the past four years, the Arab world has seen particularly sharp movements away from the colonial languages towards the Arabic WP. Egypt, for example, has reversed from a 62/30 English/Arabic split to 40/53; this has been repeated almost exactly in Saudi Arabia, and to a lesser extent in some other Arab nations. Where French is a major choice, it too has tended to recede along with en.WP. To what extent is this related to the Arab Spring, and a sense of increasing pride and independence in Arab culture and language? And to what extent is it a product of any greater scope and depth on the Arabic WP?

Since 2009, this significant move away from en.WP to the WPs of local languages has been repeated around the world, although not usually as dramatically as in Arabic-speaking countries. There are many distinctive and unexplained patterns. A common scenario is a vacillation between the English and local-language WPs, quarter by quarter, with an unexplained shift to and from English in 2010. Taiwan (Fig. 3) shows the swing from Chinese to English in 2010, and another such swing more recently, in a mirror image characteristic of many countries. (Figs. 3–5 have two y-axes, which are scaled differently, and not from zero, to illustrate this mirrored relationship and to save space.)

Brazil shows a similar relation between English and Portuguese, although there has been a slight move towards en.WP over the past six months. Every Portuguese-speaking country had a precipitous drop in the use of the Portuguese WP in 2010, including Angola, Mozambique, Namibia, East Timor, and Portugal itself. The Signpost has yet to ascertain whether this, and indeed the peak in en.WP traffic around the same time, were artefacts of the data-gathering system.

Against the grain, the three German-speaking countries—Germany, Austria, and Switzerland—have all seen a move away from German and towards English. It has been suggested that this may be connected with a resistance by editors to the coverage of popular culture on the German WP. In Switzerland (Fig. 5), where French is also a major language, the popularity of German is more recently eroding in favour of English, and to a lesser extent French. Luxembourg has seen German usage fall significantly in favour of French and English. However, in neighbouring Belgium, both official languages—Dutch and French—have been gaining the edge on English.

Yet more is inexplicable. There has been movement from English to French in Senegal, Côte d'Ivoire, Niger, Guadeloupe, and Haiti; but from French to English in Réunion, Madagascar, and Rwanda, with gyrations between French and English in Zambia, the Democratic Republic of the Congo, and the Republic of the Congo, among other African countries. Panama is one of the few Spanish-speaking countries to be moving towards en.WP.

Expatriate choices—just one fascinating area for investigation

Interestingly, some major expatriate groups do not appear to align strongly with the WP of their native tongue: only 0.6% of American page views went to the Spanish WP, yet more than 12% of the US population speaks Spanish at home. Similarly, only 2% of views from Finland are to the Swedish WP, although nearly 6% of Finns are native Swedish-speakers and the language has equal status with Finnish as an official language. The WP preferences of minority language groups appear to be a complex issue. By comparison, large native Russian-speaker groups in countries such as the Baltic states that were assimilated into the Soviet Union for most of the 20th century appear to be using both the Russian and the local-language WPs in greater proportions at the expense of English.

Further information: Wikipedia Report Card: summaries for 50 most visited languages.

Discuss this story

Note: For stats, charts, and graphs see: commons:Category:Wikimedia statistics.

"countries such as the Balkan states that were assimilated into the Soviet Union for most of the 20th century" - I'm sorry to disappoint you, but there is only one country that formerly belonged to the USSR and is close enough to the Balkans to be mixed up. That's Moldova. And in its article in en.wiki, Balkan is not even mentioned. Check your facts. --Oop (talk) 13:17, 5 April 2013 (UTC)

Thanks for being so polite. I erred, of course, in mixing up Balkan and Baltic—call it a typo. Now corrected. Tony (talk) 13:24, 5 April 2013 (UTC)

Is the 25% increase in the last year due to link prefetching? Regards, Sun Creator(talk) 14:05, 5 April 2013 (UTC)

English is one of two official languages in Malta, and 88% of its citizens speak English. Does this qualify it as a "[m]ajority native English-speaking country" (per the first image)? Wrad (talk) 15:57, 5 April 2013 (UTC)

I had in mind that 2% of Hong Kong people are native speakers of English. I suspect that few Maltese are native speakers. Tony (talk) 23:55, 5 April 2013 (UTC)
Ok, makes sense. Incidentally, I'm red/green colorblind, and I have trouble distinguishing between lines in some of the lower images. For future Signpost articles, could we use a different color scheme? Wrad (talk) 00:29, 6 April 2013 (UTC)

How are the monthly views exactly measured? Some very interesting results there. FoCuSandLeArN (talk) 16:25, 5 April 2013 (UTC)

Thank you for the interesting investigation. One of the surprising points for me was that Comoros actually has the best viewing ratio in favor of ar WP, yet in over three years of working there I haven't seen a single Comorian editor; because of that, I had thought French was the dominant language for Comoros. In general, it seems that the more developed and educated Arab countries tend to have lower or average views of ar WP, which is interesting. Another point worth mentioning (inspired by the "Chequers" comment below) is that, from my own experience, the size and growth of a native-language WP is essential for page views. For example, some articles that were developed from a few poor sections to FA level in ar WP (Example) had their page views rise as much as 20 times in one year. Better content, many more visitors. --aad_Dira (talk) 08:10, 6 April 2013 (UTC)

Namibia is definitely not a Portuguese-speaking country. I assume the author got confused with the neighbouring country of Angola. LouriePieterse 22:57, 6 April 2013 (UTC)

Lourie, you're right to question this. Languages of Namibia says "English, the official language, is spoken by less than 1% of people as their native language. Among the white population, 60% speak Afrikaans, 32% German, 7% English, and 1% Portuguese (current figures show that they are in fact 4–5% of the total population of the country nowadays, i.e. 100,000 people)" (my italics). Nevertheless, the WMF's stats show that the share of views from that country to the Portuguese WP has declined from nearly 10% in 2009 to less than 1% now (put "Namibia" into your finder once you've arrived on that big stats page). Can you shed any light on this? And Erik, I wonder whether it's possible to track down whether the sharp dip in Portuguese WP views throughout the world in 2010 is an artefact of collecting the stats? Tony (talk) 05:59, 7 April 2013 (UTC)
I have a plausible solution to this apparent anomaly. Firstly, I have only quickly looked at the statistics page now, but it would have been nice to see how the total number of views increased for the country. It is my belief that the number of Portuguese views stayed relatively the same, but that the other languages increased. I say this because access to internet connectivity increased dramatically in the last few years. It might be possible that the Portuguese views are attributed to higher-class individuals from Angola that only visit Namibia, or even higher-class nationals. Having geolocation information would help with understanding this, as I believe most of this traffic will originate from Windhoek. Secondly, I know some people from Namibia, including a friend who owns an aviation business in the northern areas, but have never heard of Portuguese nationals. I honestly do not know, as I have not even researched my above theory; I only speak out of experience. It is just most plausible in my mind that higher-class Portuguese visitors and nationals are responsible for that traffic and that as more people got access to the internet, that ratio declined. With more statistics, it will be easier to make solid conclusions. :) LouriePieterse 18:58, 7 April 2013 (UTC)
Namibia as a country had 0.002% of pageviews in 2009/10 and a tenth of that would be 0.0002% of global page views. I'm not sure how many hundreds of millions of readers we had in that era - but I think it has risen sharply to the 500 million a month of today. So it wouldn't be impossible for one in half a million of our page views in that era to be from one Portuguese-speaking person in Namibia if by chance we had an active editor who lived there. Alternatively it could have been a border anomaly with a Namibian company using one Internet portal for all its offices including one in Angola. I've known companies that ran their IT so that everything ran through head office, and as a result the subsidiary offices would appear to be geolocating to the wrong country. ϢereSpielChequers 19:45, 7 April 2013 (UTC)

Search engines?

I suspect that some of these transitions are down to search engines. Search engines will vary their results by geography of searcher if they don't have more to go on, and if they shift from defaulting a country to En Wiki to AR wiki then the result could be significant.

Size of native language wiki will also be an issue: for people who speak languages such as Maltese and Slovenian, a very high proportion of subjects will not be covered in their native language.

Another thing that will vary over time is the taught foreign language in a country. In much of the former Soviet bloc anyone born before 1975 will have been educated in a society where Russian was at least the first foreign language one learned, but the younger generation will have probably been taught English. Perversely, as the Internet becomes more common in such countries it will spread from the young who usually have some English to the old whose non-native language is Russian. ϢereSpielChequers 08:10, 6 April 2013 (UTC)

Readers choose Wikipedia over or by Google?
Search engine changes are a good guess. I wonder how the Google Knowledge Graph (deployed 2012) influences stats: does it link to English WP articles if you search in another language without a WP article?
My first thought about these stats was: Wikistats artefact? I read discussions about excluding bot pageviews in stats (which is done with a broad brush so far?) and about the strange mobile view stats. There are many questions. BTW, I have to say that the speculation about a sudden German popular-culture-interest spike is a really silly explanation! --Atlasowa (talk) 19:47, 7 April 2013 (UTC)
One problem with excluding bots is in identifying the bots. The easy way is to look at accounts with the bot flag currently set on; that has the not inconsiderable problem, in exercises like this when you look at old edits, that some very active bots have been retired and deflagged. Hence the importance of also looking at lists of formerly flagged bots. I think the German popular culture theory is that the DE wiki has become less focussed on that area and thereby not covered that part of the market so well, rather than a change of interests amongst German internet readers. However I don't speak German so can't check the theory. ϢereSpielChequers 08:49, 8 April 2013 (UTC)
Atlas, I'm sure a lot of people are interested in why German-speakers appear to be visiting the German WP less and other WPs more. I'd be very pleased if you could provide some hypotheses here. Tony (talk) 13:10, 8 April 2013 (UTC)

Comoros

Comoros had bugs in its stats; I think they were corrected: Wiki stats gives 80% French, 10% English, and 2% Arabic (http://stats.wikimedia.org/archive/squid_reports/2012-12/SquidReportPageViewsPerCountryBreakdownHuge.htm). — Preceding unsigned comment added by Loup Solitaire 81 (talkcontribs) 01:51, 31 January 2015 (UTC)
