A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
Analyzing edits to the then 46 largest Wikipedias between July 9 and August 8, 2013, a study[1] identified a set of about 8,000 contributors with a global user account (labeled multilingual) who edited more than one of these language versions (excluding Simple English, which was treated separately) in that time frame. It tested five hypotheses about cross-language editing and editors and looked, for instance, at the proportion of contributions that each of these Wikipedias receives from multilingual editors versus contributions from those editing only one language version. The research found that Esperanto and Malay stand out with a high proportion of contributions from multilinguals and, at the other end, that Japanese has few contributions from multilinguals. Overall, in terms of edits per user, multilingual users made more than twice as many contributions to the study corpus as monolinguals did; they often work on the same topics across languages; and in any given language, they frequently edit articles not edited by monolinguals during the one-month period analyzed. They thus serve a bridging function between languages.
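To make the central measure concrete, here is a minimal sketch of how the share of a language edition's edits coming from multilingual editors could be computed. It is not the authors' code: the input format (a list of `(user, language)` pairs, one per edit) and the function name `multilingual_share` are assumptions made for illustration, and the sample data is invented.

```python
from collections import defaultdict


def multilingual_share(edits):
    """Return {language: fraction of its edits made by multilingual users}.

    `edits` is a sequence of (user, language) pairs, one pair per edit.
    A user counts as multilingual if they edited more than one language
    edition within the sample (an assumption mirroring the study's label).
    """
    edits = list(edits)  # the data is traversed twice below

    # First pass: collect the set of languages each user edited.
    langs_by_user = defaultdict(set)
    for user, lang in edits:
        langs_by_user[user].add(lang)
    multilingual = {u for u, langs in langs_by_user.items() if len(langs) > 1}

    # Second pass: count all edits and multilingual edits per language.
    total = defaultdict(int)
    from_multilingual = defaultdict(int)
    for user, lang in edits:
        total[lang] += 1
        if user in multilingual:
            from_multilingual[lang] += 1

    return {lang: from_multilingual[lang] / total[lang] for lang in total}


# Invented sample: user A edits two editions, B and C edit one each.
sample = [("A", "eo"), ("A", "en"), ("B", "en"), ("B", "en"), ("C", "ja")]
share = multilingual_share(sample)
```

In this toy sample, the only edit to "eo" comes from a multilingual user, while "ja" receives none, which is the kind of contrast the study reports for Esperanto and Japanese.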
Two existing write-ups are good starting points for putting the study in context.[supp 1][supp 2] In the long run, it would be interesting to extend the research to (a) cover a longer time span, (b) include contributions from non-registered users, despite technical difficulties, (c) include smaller Wikipedias, and (d) explore the effects of that bridging function in more detail, perhaps in search of ways to support its beneficial effects while minimizing the non-beneficial ones. It would also be interesting to focus on some aspects of those multilingual users (e.g. how the languages they edit in match the languages they display on their user pages) or their contributions (e.g. how their contributions to text, illustrations, references, links, templates, categories or talk page discussions differ across languages, or how contributions from multilinguals differ across topics or between pages with high and low traffic), or to entertain ideas for a multilingual version of editing tools like User:SuggestBot. The paper is one of the first to make use of Wikidata; comparing such cross-lingual Wikipedia contributions with contributions to multilingual projects like Wikidata and Commons may also be a fruitful avenue for further research. (See also earlier coverage of a CSCW paper about a similar topic: "Activity of content translators on Wikipedia examined")
A new paper on arXiv asks the question "Can electoral popularity be predicted using socially generated big data?"[2] Operating on the assumption that "sentiment data is implied in information seeking behaviour," the authors Yasseri and Bright compare Wikipedia page views and Google search trends to election outcomes in Iran, Germany and the UK. In Iran and the UK, where the researchers were able to use the articles of individual politicians, the page view and search trend data correctly pick the winners of the elections. In the UK, the data even correctly pick the order of the runners-up, but the same is not true for Iran. In the German case, no correlation is found between search data and election results. Yasseri and Bright defer to the argument from previous studies on Twitter prediction, which concluded that the sample data is too self-selecting. Overall, it is shown that "people do not simply search in the same proportions that they vote." Still, the researchers note that these techniques react "quickly to the emergence of new 'insurgent' candidates."
A book titled Confidentiality and Integrity in Crowdsourcing Systems contains a chapter on the integrity of the English Wikipedia as a case study of integrity management in crowdsourcing systems.[3] To test the integrity of Wikipedia, the authors first tried to start a new article with "invalid content" (it got deleted) and then turned to vandalizing pages systematically, both of which violate Wikipedia policies (cf. Wikipedia:Vandalism). They noted that simple cases were caught by automated counter-vandalism tools (ClueBot and XLinkBot, whose user pages – one of them with a typo – are the only references cited in the chapter), whereas more subtle cases ("incorrect information containing words related to the page’s topic" or adding external links present in related Wikipedia articles) were not. No indication was given as to whether these inappropriate edits had later been removed (by the authors themselves or by other users), what the affected pages were, or what IP address(es) the authors had used to make those edits.
In a next step, the authors went through dumps of the English Wikipedia from 2001 to 2011 and analyzed revision histories for "100 good and featured articles" (which refers to Wikipedia:Good articles and Wikipedia:Featured articles – later, they call this set "high-quality articles") and "100 non-featured articles" (by which they mean neither good nor featured – later, they refer to this set as "low-quality articles"). In this sample (of which no further details are given), they observed that the number of contributions to high-quality articles is about one order of magnitude higher than that of low-quality articles and "that there is a highly active group of contributors involved from the creation of high quality articles until present", while most editors to low-quality articles never contributed to those pages again. They then looked at revert rates, at the overlap between sets of top contributors to a given article across years, and at the range of topics edited by top contributors to an article, observing that "the top contributors have become the owners of high quality articles and their engagement has increased" (which runs contrary to WP:OWN), "[T]his results in higher quality for a small portion of articles in Wikipedia" and "[T]op contributors of high quality articles are more like-minded than the top contributors of low quality articles", concluding "that the main difference between low quality and featured articles is the number of contributions."
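The chapter does not say how the overlap between sets of top contributors across years was computed. One common choice, assumed here purely for illustration, is the Jaccard index of the two sets. The input format (a list of `(year, user)` pairs for one article) and the helper names `top_contributors` and `jaccard` are hypothetical, and the sample data is invented.

```python
from collections import Counter


def top_contributors(revisions, year, k=10):
    """Return the set of the k users with the most revisions in `year`.

    `revisions` is a sequence of (year, user) pairs for a single article.
    """
    counts = Counter(user for y, user in revisions if y == year)
    return {user for user, _ in counts.most_common(k)}


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b|; defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Invented revision history: A stays active, B is replaced by C.
revisions = [(2010, "A"), (2010, "A"), (2010, "B"), (2011, "A"), (2011, "C")]
overlap = jaccard(
    top_contributors(revisions, 2010, k=2),
    top_contributors(revisions, 2011, k=2),
)
```

A value near 1 would indicate a stable core of editors from one year to the next, which is the pattern the authors describe for high-quality articles; a value near 0 would indicate that the top editors change almost completely.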
From that, they venture into extrapolating to crowdsourcing systems more generally: "[w]e observe that to have higher integrity in crowdsourcing systems, we need to have a permanent set of contributors who are dedicated for maintaining the quality of the contributions to the articles. For systems with open access such as Wikipedia, this can be a huge burden for the permanent editors. Therefore, we need new mechanisms for coordinating the activities in a crowdsourcing information system." No discussion of these new mechanisms is offered.
The chapter has a few simple tables and plots but no link to the underlying data or the code used for the analysis, and no links to relevant literature or Wikipedia policies; it is also paywalled behind a price tag of $29.95 / €24.95 / £19.95. Given that the experimental edits to Wikipedia actually damaged the project, it is hard to imagine that an ethical review panel involving Wikipedians would have approved the study in that form. In fact, such a panel does exist in the form of the Research Committee, which had not been contacted about the project. Considering further that the conclusions of the study are not new, that their possibly interesting implications for crowdsourcing more generally are not discussed, and that neither the paper nor its materials are available to those concerned about the integrity of Wikipedia, it is hard to see any benefit of this study that would outweigh the damage it caused (cf. earlier coverage: "Link spam research with controversial genesis but useful results", "Traffic analysis report and research ethics").
This is mostly a list of non-article page requests for comment believed to be active on 22 December 2013 linked from subpages of Wikipedia:RfC, recent watchlist notices and SiteNotices. The last two are in bold. Items that are new to this report are in italics even if they are not new discussions. If an item can be listed under more than one category it is usually listed once only in this report. Clarifications and corrections are appreciated; please leave them in this article's comment box at the bottom of the page.
(This section will include active RfAs, RfBs, CU/OS appointment requests, and Arbcom elections)
We saved one last special report for 2013. After our well-received review of great WikiProject logos a couple years ago, it was only a matter of time before we collected a new batch of interesting iconography that showcases the creativity of the Wikipedia community. Hopefully, these logos will also inspire other projects to liven up their drab pages.
Before we begin, it is important to note that gilded pages do not guarantee a project's success. Slapping a new coat of paint on a flailing WikiProject won't eliminate the project's deeper flaws. We have special reports on reviving WikiProjects, learning from dead projects, and other kernels of knowledge that can help struggling projects.
The list below presents interesting designs that stood out among the many projects surveyed by one Report writer. This list is in no particular order and is by no means exhaustive. For great logos we may have overlooked, we invite our readers to post their favorite WikiProject's logo in the comments section of this report.
WikiProject Wikipedia Awards, home to barnstars and other bits of WikiLove, has a logo purpose-built by Antonu, the editor who remastered most of Wikipedia's barnstars and creates new ones for WikiProjects upon request. The logo in its entirety has been translated into Korean and Urdu. The golden barnstar with a laurel wreath, created specifically for the WikiProject Wikipedia Awards logo, was refashioned for the "2.0" version of the WikiProject Barnstar, which is awarded to "someone who makes great strides in improving WikiProjects."
The logo for WikiProject Palaeontology is simple yet distinctive, with a Plesiosaurus macrocephalus fossil wrapped around the project's name. The fossil image looks pretty good for a hundred-year-old sketch.
Like the project's dual pursuits, WikiProject Heraldry and Vexillology has dual identifiers. On the left, the "Wikipedia coat of arms" which is described thusly: Or, on a puzzle piece gules a flag waving Or on a flagstaff bendwise argent; for the crest, a flag waving Or on a flagstaff palewise argent issuing from an escutcheon Or issuing from a wreath Or and gules. To the right, the WikiProject Heraldry and Vexillology seal which would look mighty fine on a flag or letterhead.
WikiProject Magic has a logo calling to mind the carnivals, circuses, and other traveling shows where magicians and soothsayers once made a living. While today's illusionists and escape artists demand flashier events, this call-back to a bygone era fits the project's extraordinary members.
What better way is there to identify WikiProject Fashion than with an outfit that's hopelessly out of fashion? This 1920s postcard should remind everyone that some day your children and grandchildren will be laughing at how ridiculous you looked back in 2013.
WikiProject Sharks doesn't need anything flashy. While many sea creatures have dorsal fins, a simple fin-shaped object poking out of the water immediately brings sharks to mind (often to humorous effect). The moral of this story is that you don't need a DFA to design a decent logo. All it takes is an idea and a little motivation.
The WikiCops are coming for you. WikiProject Law Enforcement has an intricately crafted badge mixing a sheriff star, cop shield, and bobby crown with a laurel wreath and some rays of chivalric order starshine.
The folks at WikiProject Editor Retention mean business. Attracting and keeping editors is a huge challenge for Wikipedia, so it's reassuring to know that the professionals at WikiProject Editor Retention are on it.
The vast array of roadway projects and task forces has something in common that ties them all together: vectorized road signs unique to their corner of the world. If you don't have the time or know-how to create something from scratch, just use the resources that are already at your project's disposal. Logos going left to right are from: WikiProject U.S. Roads, Auto Trails Task Force, U.S. Territories Task Force, WikiProject Canada Roads, and WikiProject Australian Roads.
We ended the first Great WikiProject Logos with a simple yet effective logo from the folks at WikiProject Zoo. Since then, they've updated their look with a wild image clearly inspired by edgy advertising buffers for some televised wildlife programs. The Wikipedians of WikiProject Zoo are clearly excited about their subject.
Did any of these stand out to you? Did we forget your favorite project's icon? Do you have something new you've been working on that you'd like to share? Post it to the comments section below!
Next week, we'll ring in the New Year with our annual retrospective. Until then, revisit years past in the archive.
A significant move by the Wikimedia Foundation has been the broadening of the types of activities it funds. To this end, the Foundation has developed several quite different forums for allocating that funding, setting up volunteer committees that conduct initial assessments of competitive applications. The most recent of these programs was the individual engagement grants (IEG) scheme, launched last January. The scheme awards funds to individuals or teams of up to four people to produce high-impact outcomes for the WMF's online projects. The IEG scheme favours innovative approaches to solving critical issues in the movement. This arm of WMF grantmaking is different from the Funds Dissemination Committee, which started more than a year ago and judges applications for annual operating grants by eligible affiliated organisations.
The IEG committee has just announced the results of its second twice-yearly round. There are seven successful applications for projects that are striking for their reach and diversity, underlining the complex and multidimensional nature of the Wikimedia movement. The allocations—some of them based on applications of impressive quality—involve on-the-ground social, cultural, and technical innovations. Individuals from Cameroon, Uganda, India, Israel, France, Italy, Germany, and the US will begin their projects in the new year, most of which will run from January to June.
A surprisingly large proportion of our editors are under 18, according to Temple-Wood. She and Jake Orlowitz (Ocaasi) have also been provisionally funded to pilot a week-long summer conference, Generation Wikipedia, for young Wikipedians and Wikimedians from around the globe to connect, share skills and build leadership and community capacity among the youngest generation of editors. The conference, for which $20K may be allocated, would stress the particular needs of minors for safety, privacy, and liability protection in such an environment.
The Mbazzi Village writes Wikipedia has its origins in the meeting of two people from very different countries who crossed paths in an exchange program for their students: Paul Kiguba, deputy head at a primary school on a small peninsula that juts into the massive Lake Victoria in Uganda; and Dan Frendin, a teacher in Sweden. Together, they founded the Luganda Wikipedia to serve the local language. An empty house owned by Paul Kiguba in the village of Mbazzi will become a new Wikipedia centre, emphasising the writing of articles on health and agriculture. The project, funded at nearly $3K, will be assisted by Sophie Österberg, who was the WMF's Global Education Manager.
A pilot project in the west African nation of Cameroon will be conducted to develop novel communication tools to promote an international conversation on WMF projects and the sharing of free knowledge. This will follow on from WikiAfrica Cameroon, supported by several institutions, and the French chapter's dynamic Afripédia program, which promotes French-language initiatives in the African WMF world. The centrepieces of the pilot project will be the production of a video and a series of comics, by video-makers, designers, writers, and artists in Cameroon. It will be led by Marilyn Douala Bell and Iolanda Pensa with collaboration from Michael Epacka, with funding of €15K.
A further three projects involve technical innovations. Wikimaps Atlas is designed to address a problem many editors are aware of: creating maps for WMF online projects is a labour-intensive process that fails to meet the demand for accuracy and updating. The current system has left us with a large messy pool of locator and other base maps with varying styles, accuracy, and formats. The project will automate the creation of SVG base maps in a well-researched cartographic style using the latest and most accurate open geographic data. Put simply, it will systematically generate a free atlas of the world with well coded SVG files. Arun Ganesh, Hugo Lopez and collaborators will receive $12.5K to achieve this.
VisualEditor, the system of WYSIWYG editing in display mode, has had a controversial start, but will be an inevitable feature of editing on WMF projects. A key challenge is to create a centralised register of all gadgets, with a programmatic understanding of how each relates to editing and an assessment of its popularity across projects. Grantees Eran Roz and Ravid Ziv have received $4.5K to accomplish this preparatory task and, on that basis, to integrate high-priority gadgets into VisEd.
Wikidata Toolkit, proposed by Markus Krötzsch, a researcher at the University of Oxford and data architect for the Wikidata project, has been awarded $30K, the highest amount in this round. He will lead a small team of researchers and students at Dresden University of Technology to address a key problem surrounding Wikidata. Wikidata, a relatively new project largely supported by Wikimedia Germany, aims to create a free knowledge base about the world—names, dates, coordinates, relationships, URLs, and references—that can be read and edited by humans and machines alike. However, in Krötzsch's words, "understanding this data requires technical means for querying and analysis that are not currently available. Even skilled developers have hardly any basis for working with Wikidata." The goal is to develop technical components to simplify "query answering" of Wikidata data; in technical terms, a robust and flexible query backend will be created to provide an API for running a variety of queries. The two main outcomes will be a Wikidata toolkit and a query web service.
The next round of IEG proposals will open on 1 March 2014.
Last month, the OAuth extension was deployed to all Wikimedia wikis. OAuth is a standard used for allowing users to authenticate third-party applications, also known as consumers, to take actions on their behalf.
In the past, tools were forced to use systems like TUSC to authenticate users, or to store a separate authentication database, as UTRS does. Now, these applications can take actions using your account without you having to give them your password. For example, you can use CropTool to crop an image on Commons, and the cropped image will be uploaded using your own account with a tag showing that CropTool was used.
Instructions for getting your application set up to use OAuth can be found on mediawiki.org. Currently Dan Garry, the product manager for OAuth, is approving each application before it can be used. That role will transition over to the Stewards after the guidelines for OAuth consumers, which are currently being drafted, are finalised.
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.