The Signpost

News and notes

Bangla-language survey suggests the challenges for small Wikipedias

By Jan eissfeldt and Tony1

Bangla Wikipedia survey

Bangla belongs to the eastern group of the Indo-Aryan languages, here marked in yellow.

The Bangla language, also known as Bengali, is spoken by some 200 million people in Bangladesh and India. The Bangla Wikipedia has a very small community of just 10–15 very active editors and another 35–40 less active editors. The project faces particular challenges as a small Wikipedia, and Dhaka-based WMF community fellow Tanvir Rahman is working to understand these challenges and develop strategies to improve small wikis that have strong potential to expand their editing communities (Signpost coverage).

During July 2012, Tanvir conducted an online survey of more than 1800 Bangla Wikipedia readers, a response over just two weeks that was beyond expectations; of these participants, 1107 answered all 29 questions. As with all online surveys, the advantage is the relatively large sample size, which increases statistical strength, although some self-selection bias is likely. Of the 1107 participants who completed the survey, 25% count themselves as editors of the project, and 75% as readers who had never edited the site; 81.2% were from Bangladesh, with 16.3% from India and 2.5% from other countries, including the US and the UK. The issues surveyed concerned readability, editing, help, and community support of the Bangla Wikipedia. The survey also provided the first-ever demographic information on volunteers editing this language project.

62.2% of participants are students, and this matches the largest age-range in the survey, of 16–26 years. The results have established that college and school students make up the largest group of readers and editors of Bangla Wikipedia, and that this group feels the project has been very useful for their studies. In Bangladesh, students usually have better internet access than the overall population, which may be a factor in this result. Ironically, it seems that the English Wikipedia plays an important role in promoting the Bangla Wikipedia, since most participants learned of the existence of the Bangla Wikipedia from the English Wikipedia's other languages links. Other sources include search engines, social networks, and newspaper reports on the project and outreach events (Signpost coverage).

In a finding that will have a familiar ring to English Wikipedians, new users feel inhibited by the current lack of proper help pages and other technical issues. However, there the similarities end: while two-thirds of participants find Bangla Wikipedia useful, readers pointed out several limitations that need to be tackled. As on other small Wikipedias, a lack of information in articles is the most commonly raised issue for the Bangla Wikipedia. More than 700 responses included open suggestions for how to improve the project's readability.

Most know they can edit, but ...

The majority of participants say they know they can edit the Bangla Wikipedia, but only a quarter actually do edit. Most readers feel the need for a guideline on where to start editing; it appears that the current help-page system – where known at all to a reader – fails to provide convenient and useful help to newbies.

Bangla help and policy pages are mostly translated versions of English Wikipedia guidelines, although the actual editing environment in Bangla is much simpler than on the more developed English-language project. On the other hand, the English Wikipedia community and the WMF have recognised that the help pages on the English Wikipedia are deficient, and the system is currently subject to review and redesign (Signpost coverage).

The editing interface is the second major factor that Bangla participants feel holds them back. Many are unfamiliar with wikicode, and those who have learned how to use it are on their own thanks to the lack of useful documentation. Several participants expressed the hope that a Bangla version of Visual Editor might eventually solve this problem.

An example of Bengali script: the word Wikipedia.

As Bangla is written in its own script, 17.5% of the participants mentioned that they don't know how to type Bangla on computers. Bangla script has 49 characters (not including hundreds of consonant conjuncts), which makes it more difficult to type than English. It's now possible to write Bangla using English letters with the phonetic keyboard layout, and this typing tool is embedded in Wikipedia.

Bangladesh has more than 92 million mobile phone users, with an unknown number in the Indian state of West Bengal; 90% of the total internet users in Bangladesh gain access to the internet through mobile services. However, few people said they browse Wikipedia from mobile phones (typing Bangla into a mobile phone is a complicated affair).

New way to help newbies

63.4% of participants stated they don't know where to find help on Wikipedia.

56% of all participants said they would like to get help in editing, but don't know where they can ask for support from Bangla Wikipedia. A majority would prefer a step-by-step help environment, rather than traditional help pages. Newbies, the survey finds, shouldn't be expected to know much before starting to edit. People would also like help from Wikipedians online, so a mentorship program like the English Wikipedia's adopt-a-user could be of value.

Aside from basic help issues, 57% of the survey participants would like to help translate content from the English Wikipedia to Bangla, and 60% would like help for new content creation. The English Wikipedia is the biggest information source for other Wikipedias' translation activities, and it is often easier and less time-consuming to develop a Wikipedia by translating articles from English.

Demographic findings

Almost all participants had completed a college or university bachelor's degree; 11.8% had completed high school only. (In Bangladesh, the home country of most participants, high school runs from grades 6–10, and college covers grades 11 and 12.)

The gender gap is an issue on the Bangla Wikipedia: only 5.6% of participants are female, and just 21% of female respondents have edited the site. Of those who did not, after learning from the survey instructions that they can edit, most expressed interest in contributing. Women were particularly keen to have a step-by-step help system.

The way forward

The survey findings will be a starting point for developing incremental help pages in Bangla, which will be intended to tackle the problems revealed by this survey. The starting point will be the translation of the help space from the English Wikipedia, with an eye to developments on that project towards simplification and greater effectiveness. It will aim to provide newbies and interested editors with an adequate help structure in line with the expressed preferences of the respondents. Discussion on the new help system is underway with the local community, and the help experiment is expected to go live next month.

Brief notes

Discuss this story

  • Yesterday I was helping a new editor on the English Wikipedia, and their first language was Bangla. They were clearly struggling with English, so I gave them a link to the Bangla Wikipedia, and suggested it might be easier to contribute there. They said they were already aware of the existence of the Bangla Wikipedia, but were only interested in contributing on the English one, despite the language problems. Being utterly uncurious about such things, I didn't ask why. But one can speculate that search engine results might have a lot to do with it. --Demiurge1000 (talk) 11:01, 14 August 2012 (UTC)[reply]
  • Demiurge1000, the editor you came across had better skills in Bangla, but wanted to contribute in English. It might be due to the reason that IT education in India is given only in English. Students are not taught to "type" in regional or national languages and if they wish to, they have to go for special classes which are not in the syllabus (which anyone will avoid as studies are not easy in India). This might be one of the reasons why they wished to stay on enwiki. TheSpecialUser TSU 14:22, 14 August 2012 (UTC)[reply]
  • Typing in vernacular languages is what has kept me away from Bengali and Hindi Wikipedia. Learning to use the free software is a daunting task. Wonder if the Wikimedia India Chapter can arrange for typing classes for those interested in editing in vernacular languages in India.--  Forty two  17:57, 18 August 2012 (UTC)[reply]
  • The language typing help is part of the standard coverage of Wiki Academy conducted for Non English audiences by Wikimedia India. As phonetic input methods are supported natively in Wikipedia, it is even more easier for people to overcome this hurdle. --Arjunaraoc (talk) 05:10, 21 August 2012 (UTC)[reply]
  • This is a fascinating look into the community base of a small Wikipedia project associated with a major world language in a multinational population which is increasingly using Internet communication. I encourage anyone who is interested in the topic of expansion of Wikipedia into new languages to consider the extent to which this study accurately depicts the user base which establishes new projects. In every way that I have considered this data I found it very encouraging. I hope that this study has a follow up because I think this report is meaningful and that Bangla language would be an excellent target for Wikimedia development because of the great need and great potential. Blue Rasberry (talk) 14:07, 14 August 2012 (UTC)[reply]
  • Style note: It might be helpful to spell out acronyms such as "FDC" at the point where they are first used. ~ Ningauble (talk) 14:59, 14 August 2012 (UTC)[reply]
I've done a lot of clean-up work on South Asian topics. I've also interacted with IPs from South Asia. Finally, I used to be an admin on Meta (our cross-wiki coordination site) and I did clean-up work across many small language Wikipedias - mostly spam removal. My observations:
  • I've been amazed by how much Wikipedias for some major language groups are languishing. Bangla is a prime example. Yes, most of those 200 million speakers are poor and many may be poorly educated -- yet 100,000s have doctorates and millions more are very well-educated.
  • We have 1000s of very active South Asian editors -- both established accounts and IPs. Probably most of those people aren't editing South Asian language Wikipedias.
  • I've only run cross-wiki contribution checks on anonymous vandals and spammers; I've noticed that the South Asian IPs seldom have done any editing on the South Asian Wikipedias. On the one hand, that's good -- we're drawing some undesirable edits away from these smaller projects. But it's also bad, since most of those are shared IPs (colleges, Internet cafes, dial-up modems) used by multiple people to edit Wikipedia, so I see so many good edits from those IPs -- also not being made on South Asian wikis.
  • I've sometimes wondered if perhaps the South Asian intelligentsia look to English for educational and technical topics at work or school, then use their native languages out in town or at home. Could this be why our English Wikipedia has such a vibrant South Asian readership and editing community?
  • I've noticed this English vs. local language gap most with South Asian contributors. Tens of millions speak Yoruba and our Yoruba Wikipedia is tiny and not very active, yet Nigerians, Togolese and Beninois seem proportionally less active on the English Wikipedia than South Asians.
  • These are just observations and speculation. I've never even been to South Asia although I am appreciative of how much my own country and even my own street have been enriched by immigration from India, Pakistan, Bangladesh, and Sri Lanka (still no Bhutanese in my town).
Some suggestions:
  • For our South Asian topics (geography, culture, Bollywood, etc.) -- ditch the fund-raising banner ads. Instead, serve up banner ads cross-promoting the associated Wikipedia. In some cases, the language for promotion will be obvious -- Tamil culture article, Tamil Wikipedia. In other cases, there may be multiple languages -- our India article covers a country with dozens of languages; in such cases, maybe have the banner ad lead to a landing page with links to all the candidate projects.
    • A variation on this idea requiring a little work on the server side: tailor the banner ad to the originating IP address -- if it's a Bangladesh IP, serve up a banner ad promoting the Bangla language. If it's a generic Indian IP, serve up a banner ad leading to a multi-language landing page.
  • Conduct a study of edits per language area in each language by Wikipedia to determine which projects have the biggest gap between English language and native language participation. Compare that with the languages and areas that have the least gap. Use that as a starting point to better understand what's going on.
  • The English Wikipedia was initially seeded with a lot of public domain content such as the 1911 Encyclopaedia Britannica. There may be similar resources in some of these other languages that are in the local public domain (government documents, etc.)
  • Similar things can be done with other languages (such as Yoruba) as well.
--A. B. (talkcontribs) 10:59, 15 August 2012 (UTC)[reply]
Here's an example of what I pointed out above --, located in Mumbai, is the source of 829 English Wikipedia edits and just 5 combined edits to our Hindi and Telugu Wikipedias. --A. B. (talkcontribs) 11:40, 15 August 2012 (UTC)[reply]
  • re: Questia accounts Kudos to Ocaasi for his work on getting the community much needed sources for referencing our articles. Referencing is difficult work, but our reliability depends on it. His proposed WP:The Wikipedia Library will be a great resource for improving our quality and I thank him for his dedication in this area. (talk) 19:22, 15 August 2012 (UTC)[reply]
  • Comments and questions.

    AB is right, I suspect. Ironically, those with university degrees and in professional and management positions are more likely to have a knowledge of English, which is seen as prestigious because it opens up the outside world. They are among those who you'd want to attract into the Bangla-WP editing community. BUT, alongside those people there is surely a demographic of educated, motivated, internet-connected people who can be motivated to contribute to Bangla. How can their motivation be reinforced? And what is the state of internet connections in the language area—possibly an important part of the jigsaw puzzle for us to know.

    Perhaps it should be a multi-pronged approach, both through real-life activities by the Bangladesh and Indian chapters, and through explicit invitations at relevant en.WP article talk pages to translate and/or improve equivalent articles in the Bangla WP. What kind of chapter activities would be the most effective? Is there scope for collaboration between the two chapters, and if so, are key personal relationships being built as a platform for serving the Bangla-language community? What inhibiting factors work against the participation of women? And more: who's got ideas for building the narrative of the inside world of Bangla-speakers? Tony (talk) 02:00, 16 August 2012 (UTC)[reply]

  • Not too many people search in world wide web in Bengali language too! --Tito Dutta 18:06, 16 August 2012 (UTC)[reply]


The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0