The title of last week's piece, "The Tragedy of Wikipedia's commons", was perhaps rather more ironic than its author intended. One of the truly great tragedies of medieval England was not so much the tragedy of the commons in its original sense as the forcible enclosure, by powerful outside interests, of the historic common land that had for centuries been available as a free resource for all. If there is any tragedy here, it is in the author's wish to use Wikipedia to take over Wikimedia Commons and to do very much the same thing online.
Commons always has had, and always will have, a far broader free-content remit than that of supporting the narrow focus of an encyclopaedia. Commons provides media files in support of not just the English Wikipedia but all of the WMF projects, including Wikisource, Wikibooks, Wikivoyage and many more. These sister projects of Wikipedia often need to use media on Commons that could never be used on the Wikipedias, as they are not - in Wikipedia's narrow sense - "encyclopaedic". Some of Commons' detractors like to give the impression that its collections are nothing more than a dumping ground for random non-educational content. Nothing could be further from the truth, and the energy expended by those who would criticise from the outside (but who are strangely reluctant to engage on wiki) bears little relation to the extremely small proportion of images that could in any way be considered contentious.
Commons' policies are of necessity different from, and more wide-ranging than, those of any of the individual projects. We hold many images that will never be useful to the English Wikipedia, and that is not only OK but should be welcomed as Commons' contribution to the overall mission of the Wikimedia Foundation, "to empower and engage people around the world to collect and develop educational content under a free license or in the public domain, and to disseminate it effectively and globally". Note that the overall mission of the WMF is not "to write an encyclopaedia", but rather to develop and disseminate educational content. Supporting the English Wikipedia is one way, but by no means the only way, in which we do that, and the idea that Commons should be forcibly subjugated to the policies of a specialist encyclopaedia project would do immeasurable harm to the mission which I had hoped we were all working to support.
Contrary to the suggestion that the 2008 Commons policy on scope was an "unchallenged action by a tiny group of people", it was in fact largely an exercise in documenting, for the first time, the long-established unwritten practices of the community. The policy attracted very little controversy (despite being very widely advertised, on Wikipedia and elsewhere), largely because the vast majority of it was uncontentious. Indeed, the fact that it has retained very wide community support since then indicates that we didn't do too bad a job.
With its specialised emphasis on media curation and the niceties of copyright law, Commons will never be as popular a place for editors to hang out as some of the bigger encyclopaedias. It requires not only a particular set of interests, but also - at least for admins - some level of specialist knowledge which not everyone has or is interested in acquiring. Those outside the local community who see only the external carping may not realise that we have thousands of very committed editors who work tirelessly in the background curating and categorising content and bringing to the attention of the admins non-educational content that has no place in our collections.
Commons has never (as was claimed last week) been merely a repository that supports its sister WMF projects. Right from the start it had a remit to make content freely available to external re-users. As early as 2006 there was a formal proposal (since implemented as InstantCommons) to integrate into MediaWiki a mechanism specifically designed to support users on non-WMF projects. Perhaps the real worry of last week's author was that Commons currently holds too many non-encyclopaedic images of a sexual nature. But even assuming that is true, a proposal to revoke one of the fundamental free-content aims of Commons hardly seems proportionate. Instead, let's have a proper discussion on what Commons' scope should be. Times change, as do priorities, and what made sense five years ago may now need to be revisited.
Over the last few months especially there has been a lot of discussion within Commons as well as outside about issues concerning the small proportion of our holdings that relate to sexual imagery and to privacy/the rights of the subject. Both have complex moral and legal dimensions, and neither has yet been fully resolved. I've set out the main strands of argument below, as objectively as I can, for those who may not be familiar with them. Of course, these summaries are by no means the whole story, and many of the discussions are far more subtle than I have space for here, so please bear with me if you are familiar with this and feel I have mis-characterised or omitted any important point that may be close to your own heart. I deliberately make no comment on the validity of any of these arguments.
Some argue that pornographic images (as defined in some way) are never appropriate for any of the Wikimedia projects and are simply not educational.
Others argue that we should keep most images, almost whatever the subject matter, as we need to show the whole range of human experience if we are to call ourselves a comprehensive educational resource. Anything else would be censorship.
Yet others suggest that not all the sexual images held by Commons are "educational", properly defined. Some are photographs that have been taken for non-educational purposes, for example personal gratification/entertainment, and/or have been uploaded for the same purpose or by users who wish to push an extreme view that equates any limits at all with unacceptable "censorship".
Finally, some hold that Commons has too many images in certain marginally-educational areas that, taken overall, create an oppressive or threatening environment (e.g. for women) which may be harming the project as a whole.
One strand of argument is that we should do more to respect the rights of individuals who are identifiable in a photograph, and recognise that, even where the image may be legal, it can be highly damaging to the individual. Even when an outsider might naively think the image unremarkable, it may still be considered threatening, harassing or oppressive by its subject.
Another strand is that allowing the subject of a photograph a say on whether it should stay on Commons or not opens the door to all sorts of censorship. Proponents argue it's essential that we are able to collect all types of educational image, including those that may offend the subject.
If there is indeed a problem with the boundaries of Commons' scope - perceived or otherwise - we should tackle it head-on with open community discussion. Commons should be and I believe is receptive to the views of everyone within the Wikimedia community in reviewing its curatorial policies. But the way to get things changed is to engage rather than to criticise from afar.
A comprehensive review of Commons' scope is just starting now, and you need never say again that your voice cannot be heard. Please talk.
Please visit Commons' Review of Scope pages now, and make your views known for the sake of all the Wiki communities.
Commons has proved to be a phenomenal success in the years since its introduction, and we should be proud of what has been achieved. We should keep it, improve it, and celebrate it.
Last week, the Signpost published a rather scathing op-ed about Wikimedia Commons, the Wikimedia project which seeks to be a resource of free, educational media. Perhaps you feel it presented a valid argument, perhaps not; that is for you to make up your own mind on. I would like to take this chance to offer a defence of Commons.
As you probably know, Wikimedia Commons acts as a central repository for images. Once an image is on Commons, any project can use it, exactly the same way they can use their own images. It's an incredibly valuable tool for the Wikimedia project as a whole, as it prevents duplication and provides a central place to search. You want an image of something for your Wikipedia article? Commons probably has a category for it. And that is the same whether you're editing in English, German, Arabic or even Tagalog.
I first joined Commons back in October 2007, when I was working on an eclectic mix of the Ffestiniog Railway and McFly. About six months later I became a Flickrreviewr, checking uploads from Flickr that for some reason couldn't be checked by a bot, and a month or so after that I became an admin, primarily so I could deal with all the copyright violations I came across in the Flickr work. In the five years since, my interest in admin duties has waxed and waned and I have had little side projects, but Commons swiftly became my home wiki. My watchlist has some 60,000 pages on it, of which 10,000 are my own photos.
Commons has its problems, I cannot deny that. The number of people who believe that because they found a photo on Google it can be uploaded to Commons is simply staggering. The search engine is designed for pages, not images (a limitation of the software). The community can be a bit fractured; it can be very difficult to get people blocked for being terminally incapable of working with others (even when their name comes back to the admin noticeboards week after week after week); and we have remarkably little in the way of actual policy. Indeed, our main guiding principles boil down to two pages: Commons:Licensing and Commons:Project Scope. The former tells us which files we are allowed to host; the latter, which ones we want. Scope is the real issue of the moment, and in a nutshell it says that Commons collects educational media. Which raises the question: what is educational?
A similar problem has existed on Wikipedia for years - what is notable? There are even factions - deletionists, who think articles must prove their notability, and inclusionists, who think that there's no harm in letting potentially non-notable articles stay. And so it is on Commons - those who adhere to a strict definition of educational, and those who accept a somewhat looser guide.
And this dispute would be fine, if it were argued on Commons and in the abstract. But that is not what happens. The major rift happened a few years ago when, apparently due to a disparaging Fox News article about the amount of "porn" on Wikipedia, Jimbo Wales, co-founder of Wikipedia, came onto Commons and started deleting sexuality images. That didn't really go over well with the Commons community, of which Jimbo has never been a part, especially when it was found he was deleting images which were in use on multiple projects. To cut a long story short, the deleted images were restored and Jimbo lost admin rights at Commons, as did several admins who had joined him in his purge. Many of the images Jimbo deleted were in fact subsequently deleted again, following deletion requests to allow for community input. But the deed had been done, and for a large proportion of the Commons community, it appeared that Jimbo was not to be trusted to have the best interests of the project at heart.
The issue stewed for a few years, and reemerged with a vengeance last year. Again, it has been fought almost entirely over what some describe, disparagingly, as "porn". As I mentioned earlier, the Commons search engine is not really designed for images, and so it tends to give unexpected results. One of those was the search "toothbrush" returning, among its top results, a picture of a woman using an electric toothbrush for self-pleasure. This was an entirely legitimate result - it was a picture of a toothbrush, and it was titled as such. And while the so-called "principle of least astonishment" can easily be applied to categories - Commons has a whole proliferation of "nude or semi-nude people with X" categories on the grounds that nudity should not appear in the parent category "X" - it doesn't really work for a search algorithm, not if you want to continue with correct categorisation. Until the Wikimedia Foundation develops some form of search content filter (which itself brings up issues of what exactly should be filtered - should images of Muhammad be filtered out? What about Nazi imagery, given German law?), all that can really be done is either to delete the image or to rename it to reduce the chances of an innocuous search returning it. I personally favour keeping the images, and this has led to my being named as part of a "porn cabal" by people - most of whom rarely if ever edit on Commons - who favour deleting the images.
But the problem, for me, is that these issues so rarely get brought up on Commons. Instead of using the deletion request system to highlight potentially problematic images (which is, after all, what the process is for), the detractors would rather just soapbox on Wikipedia - usually on Jimbo's talk page - about how awful Commons is, and how this latest penis photo proves once and for all that I (or some other member of the "porn cabal") am the worst admin in the history of forever and deserve to be shot out of a cannon into a pit of ravenous crocodiles. What people don't seem to understand is that in large part, I agree. Commons has problems. We do have too many low-quality penis pictures - so many that we even have a policy on them - and so I run a bot which searches new uploads for nudity categories and creates a gallery, so that I can spot any problematic ones and nominate them for deletion. This somehow seems to make me an even worse admin in many people's eyes. We should indeed have better checks to ensure that people in sexual pictures consented to having their pictures uploaded, and I would like to see a proper policy on this. I'd like to see the community as a whole have a reasoned discussion on the matter, for a policy to be drafted, amended, voted on and finally adopted. But that is very difficult when you feel you are under attack all the time, and when your attackers are not willing to actually work with you to create a better project.
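For readers curious what that kind of patrol work actually involves, here is a minimal sketch of the general workflow using the standard MediaWiki web API: list recent uploads, check their categories, and build a review gallery. It is a simplified illustration under stated assumptions, not the author's actual bot; the category keyword and output format are assumptions.

```python
# Sketch of a patrol workflow: list recent uploads on Commons, flag any
# that carry nudity-related categories, and emit a review gallery in
# wikitext. Illustrative only -- not the author's actual bot. The
# category keyword is an assumption.
import requests

API = "https://commons.wikimedia.org/w/api.php"

def recent_uploads(limit=50):
    """Return titles of recently uploaded files via the upload log."""
    params = {
        "action": "query", "list": "logevents", "letype": "upload",
        "lelimit": limit, "format": "json",
    }
    data = requests.get(API, params=params).json()
    return [e["title"] for e in data["query"]["logevents"] if "title" in e]

def categories_of(title):
    """Return the category names attached to one file page."""
    params = {
        "action": "query", "prop": "categories", "titles": title,
        "cllimit": "max", "format": "json",
    }
    data = requests.get(API, params=params).json()
    page = next(iter(data["query"]["pages"].values()))
    return [c["title"] for c in page.get("categories", [])]

def build_review_gallery(keyword="nude"):
    """Wikitext gallery of new uploads whose categories match the keyword."""
    flagged = [t for t in recent_uploads()
               if any(keyword in c.lower() for c in categories_of(t))]
    return "<gallery>\n%s\n</gallery>" % "\n".join(flagged)

if __name__ == "__main__":
    print(build_review_gallery())
```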
Wikimedia projects are based around collaboration and discussion within the community. I would urge those of you who feel that Commons is "broken" to come to Commons and offer constructive advice. Attacking long-term Commons users will get you nowhere, nor will pasting links on other projects, or on Jimbo's talk page. If you truly want to make Commons a better place, and are not in fact just looking for any reason to tear it down, then come to Commons. Come to the village pump - tell us what is wrong, and how you feel we could do better. Use the systems we have in place for project discussions to discuss the project. Sitting back and sniping from afar does nothing for your cause, and it only embitters the Commons community.
Come and talk to us.
The season finale of Game of Thrones ensured that the epic high fantasy series would dominate the top 10 again last week; however, it was joined by the perennially popular children's author Maurice Sendak, whose would-have-been 85th birthday was celebrated with a Google Doodle, and by the number one movie of the week, Man of Steel. Politics rarely impacts the top 10, but the controversy over the PRISM surveillance program proved too potent to miss.
Please see here for the top 25 articles of the week, plus analysis.
For the week of 8 to 15 June, the ten most popular articles on Wikipedia, as determined from the report of the 5,000 most trafficked pages* were:
Rank | Article | Views | Notes |
---|---|---|---|
1 | Maurice Sendak | 1,717,368 | A Google Doodle to celebrate the children's author's would-have-been 85th birthday sent almost 2 million people to his Wikipedia page. |
2 | Man of Steel (film) | 1,117,658 | The second attempt to rework the Superman mythos for modern cinema (after Bryan Singer's Superman Returns), this film earned $125.1 million over its first weekend, setting a record for the month of June. |
3 | Game of Thrones | 1,000,649 | The season finale of this popular TV show drew 5.39 million viewers, its highest rating ever. |
4 | State of Decay (video game) | 715,148 | Much-anticipated zombie apocalypse video game. |
5 | Game of Thrones (season 3) | 600,721 | See #3 above |
6 | List of Game of Thrones episodes | 590,697 | See #3 and #5 above |
7 | | 580,390 | A perennially popular article. |
8 | Edward Snowden | 576,664 | The PRISM program whistleblower became the major discussion point in the news this week. |
9 | PlayStation 4 | 519,716 | Sony unveiled their addition to the already controversial eighth generation of video game consoles, to positive reception. |
10 | The Last of Us | 500,214 | Another much-anticipated post-apocalypse video game which was released on June 13. |
Memeburn.com published an article on the yearning of students in South Africa for free knowledge through Wikipedia Zero. Students from Sinenjogo High School have written letters to four major mobile phone companies requesting access to Wikipedia Zero, but the responses showed "little enthusiasm". According to the article, only 21% of South African schools have libraries and access to computers is very limited:
“ | The group voices its concerns in the letter stating that the South African "education system needs help and having access to Wikipedia would make a very positive difference." The group also says that we should "just think of the boost that it will give us as students and to the whole education system of South Africa." As many as 8 million South Africans have access to the internet on their cellphones. | ”
Arthur Goldstuck, managing editor of WorldWideWorx.com, agrees. He said that giving kids free access to Wikipedia would go a long way toward solving some of South Africa's education problems.
When asked about the specific request of the students, as well as the future of open educational resources on Wikipedia, Kul Wadhwa, Head of Mobile and Business Development for the Wikimedia Foundation (a role which encompasses Wikipedia Zero), called the students inspirational, saying "We were truly inspired by this grass roots movement, and we hope that this will open up a larger dialogue about the need to make open educational resources available to everyone in a way that can be delivered to them. This is really what Wikipedia Zero is about."
In an article by IOL SciTech, the author discussed the visit by WMF storyteller Victor Grigas to the high school where he filmed a documentary about their efforts, which will be available later this year. Grigas was quoted in the article as saying "the learners are so sharp and determined to better themselves. The teachers were amazing too. You can’t spend a day there and not feel inspired." Grigas also posted to the Wikimedia-l mailing list on June 19 asking for collaborators on this project.
Israeli newspaper Haaretz reported on the recent indefinite block of Soosim (talk · contribs), described as "Arnie Draiman, a social-media employee of NGO Monitor". The story, also carried by France 24, says Draiman edited English Wikipedia articles on the Israeli–Palestinian conflict "in an allegedly biased manner".
“ | Draiman concealed the facts that he was an employee of NGO Monitor, often described as a right-wing group, and that he was using a second username, which is forbidden under Wikipedia’s rules. [...] A discussion of the complaint against NGO Monitor’s employee on Wikipedia shows that he promoted his company’s agenda as much as the organizations he worked against promoted theirs, as journalist and blogger Yossi Gurvitz also wrote. | ” |
Draiman had been active in Wikipedia for several years, but had increased his participation in 2010 after taking a position at NGO Monitor, on whose website he is listed as the member of the Communications Department responsible for online communications. At 91 edits, he was the most frequent editor of the Wikipedia article on NGO Monitor, which he began editing in May 2010.
“ | The writer who complained about Soosim (user name Nomoskedasticity) also wrote that NGO Monitor has the custom of issuing a press release, waiting until it is quoted in a newspaper, and then quoting the news item in the relevant articles as fact. During the conversation, it turned out that Draiman even explained this during a workshop he gave on Israel advocacy in which he called on pro-Israel advocates to join the "wiki war." | ” |
Wikipedia administrator Jan Nasonov told Haaretz that biased editing involving organisations like NGO Monitor is "unfortunately not all that uncommon on Wikipedia", pointing out that it is difficult to prove. Neither NGO Monitor nor Draiman provided a comment to Haaretz, though Draiman, who had revealed his name to another user on Wikipedia five years ago, before his employment with NGO Monitor, disputed the sockpuppet and meatpuppet allegations against him on Wikipedia and stated that his edits were in compliance with Wikipedia rules.
This week, we visited WikiProject Tennessee, a project dedicated to the state at the geographic and cultural crossroads of the United States. Started in December 2006, WikiProject Tennessee has grown to include 21 pieces of Featured content and 29 Good articles. The project has a lengthy to-do list, taskforces dedicated to Chattanooga and state routes, a listing of the project's most-viewed articles, and Article Alert notifications. We interviewed Doncram, Orlady, Bms4880, and Theopolisme.
Next week, we'll catch up with the latest trends. Until then, strut down to the archive.
With Erysichton elaborata, the Swedish Wikipedia crossed the one-million-article Rubicon this week, following closely on the heels of the Spanish Wikipedia last month. While this is a mostly symbolic achievement, serving as a convenient benchmark with which to gain publicity and attention in an increasingly statistical world, the particular method by which the Swedish site has passed the mark has garnered significant attention—and controversy.
The Swedish Wikipedia, alongside the Dutch and much smaller Wikipedias, is one of the few to allow bots—semi-automated or automated programs—to mass-create articles. Using this method has allowed it to leap from about 968,000 articles in May to about 1,044,000 now, with about 454,000 of them bot-created. This makes it the fifth-largest Wikipedia, up from ninth just one month ago, and the same method has pushed the Dutch past the Germans, who had long held the title of second-largest Wikipedia. By comparison, the Polish Wikipedia, which had a similar total to the Swedish in May, is now at 973,000 articles.
The Dutch and Swedish totals come despite their far smaller userbases—for example, the Germans have an active userbase that is five times the size of the Dutch and eight times the size of the Swedish. By the same metric, the Polish are twice the size of the Swedish.
The bot-created articles themselves are basic: about four sentences long, with an infobox and sources drawn from a common database. Each article is tagged with {{Robotskapad}}, a template that notes its automated origins. Erysichton elaborata itself, before it attracted attention as the milestone article, was a perfectly typical example.
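To make the mechanics concrete, here is a minimal sketch of the general pattern such a bot follows: take one record from a taxonomic database, pour it into a fixed wikitext skeleton, and tag the result with {{Robotskapad}}. The infobox name, field names, phrasing and example values are invented for illustration; this is not Lsjbot's actual code or data.

```python
# Minimal sketch of the stub-generation pattern described above: one
# database record in, one four-sentence wikitext stub out, tagged with
# {{Robotskapad}}. Field names, phrasing and values are placeholders;
# this is not Lsjbot's actual code or data.

STUB_SKELETON = """{{{{Taxobox
| name = {name}
| regnum = {kingdom}
| familia = {family}
}}}}
'''{name}''' is a species of {group} in the family {family}.
It was first described by {authority} in {year}.
It is found in {range}.
No subspecies are listed in the source database.

== Sources ==
<references />

{{{{Robotskapad}}}}
"""

def make_stub(record):
    """Fill the fixed skeleton from one taxonomic database record."""
    return STUB_SKELETON.format(**record)

# Placeholder record, not real data.
example = {
    "name": "Examplia illustrata",
    "kingdom": "Animalia",
    "family": "Exemplidae",
    "group": "butterfly",
    "authority": "A. Author",
    "year": 1900,
    "range": "a placeholder range",
}
print(make_stub(example))
```

The appeal and the criticism both follow from this shape: every generated page is correctly formatted and sourced, and every generated page says essentially the same four things.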
The Signpost contacted the bot operator, Lsj, for his thoughts. He told us that the idea for bot-created articles came from the Dutch Wikipedia and from a suggestion made on the Swedish equivalent of the Village Pump in early 2012. While a "handful" of editors were "adamantly opposed", the great majority were in favor. Several smaller trials were conducted before the large-scale project that led to the millionth article, including ones on birds and sponges.
He told us that bot-created articles can offer significant benefits to Wikimedia communities: "human minds should not be wasted on mind-numbing tasks that a machine can do equally well. Let the machines do the grunt work, and let humans do what requires real intelligence." Bots are also better and far faster than humans at repetitive tasks, where people can inadvertently introduce errors. Any bot errors, which in an ironic twist typically stem from human mistakes, can usually be fixed by a second bot run, similar to what Lsjbot will be doing to add images to the biological articles it has created.
The very concept of bot-created articles, though, has garnered significant opposition in the Wikimedia community as a whole, particularly from German Wikipedians. The prominent editor Achim Raschka authored a piece in the German-language news outlet Kurier. He lamented the Swedish Wikipedia's "bitter" milestone, which puts a spotlight on articles that record little more than "their existence and taxonomic pigeonholing" and omit key information like where the species lives or what it does. Raschka told the Signpost that these stub articles impart little useful information to readers—he asks, "who could be helped with [these] fragment[s] of data?" He also pointed to an entry Denis Diderot wrote for the Encyclopédie, titled "Aguaxima":
“ | Aguaxima, a plant growing in Brazil and on the islands of South America. This is all that we are told about it; and I would like to know for whom such descriptions are made. It cannot be for the natives of the countries concerned, who are likely to know more about the aguaxima than is contained in this description, and who do not need to learn that the aguaxima grows in their country. It is as if you said to a Frenchman that the pear tree is a tree that grows in France, in Germany, etc. It is not meant for us either, for what do we care that there is a tree in Brazil named aguaxima, if all we know about it is its name? What is the point of giving the name? It leaves the ignorant just as they were and teaches the rest of us nothing. If all the same I mention this plant here, along with several others that are described just as poorly, then it is out of consideration for certain readers who prefer to find nothing in a dictionary article or even to find something stupid than to find no article at all. | ”
Hubertl, writing sarcastically on the Wikimedia-l mailing list, went further:
“ | ... the bot is always right, uses a neutral language, forms complete sentences, provides verifiable facts and makes no trouble, unlike us human authors. It knows ... correct formatting, rarely [vandalizes], addresses no other authors offensively, seeks no barrier tests, never complains and is easily turned off without resistance. There are no bots with gender bias and of course no problems with the author leaving the site. If in any topic people are missing, there is no problem, as the programming of a few new bots by specially trained bots, perhaps with steward rights, proceeds rapidly. They are absolutely reliable even with a vote. ... We simply need to take note: Bots are better Wikipedians; our days are gone. We have only consumption, sex and drugs. But this does not have to be bad, right? | ”
A separate Kurier article by Schlesinger hyperbolically compared the bot-created articles to the famous novel Brave New World and claimed that bots can and will replace human editors—a non sequitur. While bots can create article shells and—as can be seen on the Swedish Wikipedia—even short stubs, they can never be programmed to mass-create detailed articles capable of becoming featured or even good articles.
There was also extensive discussion on the Wikimedia-l mailing list and a Wikipedia blog post.
Lsj was unaware of the wider German-language attacks on bot-created articles, but after examining them, found that they were principally grounded in deeply held principles, making it difficult or impossible to offer an effective counter-argument.
In reply to Hubertl's sarcastic mailing list post, Lsj commented that the statistics, including view counts, editor numbers, and participation, contradict Hubertl's argument.
Still, a major problem could come from human error. Lsj acknowledges that errors in the source materials could creep into articles, but says that a second bot run would fix the problem. The obvious rhetorical reply is simple: what if an error only creeps up every so often and is not fixable by bots? What if such errors are not caught until a significant number of articles have been created? A small base of active users may not be able to deal with the required cleanup.
Despite the risks, carefully planned bot-created articles could hold significant benefits for the Wikimedia movement. As Lsj told the Signpost:
“ | Bots are much faster than people at those tasks that bots can do. It is not realistic to expect articles about 50,000 fungi or 100,000 flies to be hand-written within the foreseeable future. [If our slogan is] "imagine a world in which every single human being can freely share in the sum of all knowledge", bots are the only serious option for approaching that vision in the case of thousands and thousands of obscure organisms. [They provide] proper formatting, sources, infobox, categories, etc., right from the start, unlike many hand-written stubs. | ” |
While German-language Wikipedians lament the loss in quality in these programmatic articles, especially when compared to their stringent biology project guidelines, a short article may be better than none at all. This advantage is particularly apparent in smaller languages, whose Foundation projects have few editors and limited sources of information on the Internet, but far less so for wikis with larger userbases and article counts. It remains to be seen if more wikis will choose to bolster their content in this way.
With little more than a day before voting closes for the WMF elections for three community seats on the ten-member Board of Trustees, fewer than 1700 Wikimedians out of a purported 90,000 active editors have turned out to vote—about one in every 50. This compares with a vote of almost 3500 in the last elections for these two-year seats, in June 2011.
The disappointing rate of participation comes despite a lengthy pre-election period and almost two weeks of voting, with banners on all WMF sites and reminder emails sent out. The graph shows the day-by-day vote until the time of publication. The typical spurt of interest followed by a rapid fall-off in numbers occurred twice: once at the open of voting on 8 June, and once a week later on 15 June, corresponding to the distribution of email notifications.
Risker, a member of the volunteer election committee, commented: "It is lower than I would have expected ... It may be that the active community of 2013 is not as interested in the 'meta' aspects of the Wikimedia movement as in the past, as we have mostly followed the same processes as existed over the past several elections. Or it could be something entirely different. It's generally much harder to figure out why people don't do things than why they do them."
Of the 1659 votes cast at the time of writing, 592 (35.7%) are from English-language sites, 221 (13.3%) German, 157 (9.5%) Italian, 153 (9.2%) French, 82 (4.9%) Spanish, 55 (3.3%) Commons, 48 (2.9%) Polish, 41 (2.5%) Chinese, and 310 (18.7%) from all other languages.
Other languages on the radar are Japanese (27 voters) and Indonesian (12)—both welcome signs of the beginnings of a closer engagement with the worldwide movement—and Hebrew (10), Finnish (9), Danish (7), and Norwegian (7).
A notable disappointment is Hindi, with one voter out of some 200 million native speakers and a significant number of second-language speakers—the fourth-most-spoken language in the world—and an active and growing offline movement in the subcontinent.
Arabic, counting all dialects, has well over 400 million speakers, including 300 million native speakers, but managed to garner only four voters; this is despite a marked shift from the English and French Wikipedias to the Arabic Wikipedia in Arabic-speaking countries, and a successful start to a WMF education program in Egyptian universities.
Editors can vote until 23:59 UTC on Saturday 22 June, by clicking on this link to the SecurePoll interface. Instructions on voting and information about the candidates are at Meta. The close of voting corresponds to Saturday afternoon to evening in the Americas, before sunrise on Sunday morning in the Subcontinent, and early to late Sunday morning in East Asia and Australia/New Zealand.
“ | Our school does not have a library at all so when we need to do research we have to walk a long way to the local library. When we get there we have to wait in a queue to use the one or two computers which have the internet. At school we do have 25 computers but we struggle to get to use them because they are mainly for the learners who do CAT (Computer Application Technology) as a subject. Going to an internet cafe is also not an easy option because you have to pay per half hour. 90% of us have cell phones but it is expensive for us to buy airtime so if we could get free access to Wikipedia it would make a huge difference to us. | ” |
Eleven articles gained featured status this week:
Twelve lists gained featured status this week:
Eleven pictures gained featured status this week:
This is mostly a list of Non-article page requests for comment believed to be active on 19 June 2013 linked from subpages of Wikipedia:RfC, and recent watchlist notices and SiteNotices. The latter two are in bold. Items that are new to this report are in italics even if they are not new discussions. If an item can be listed under more than one category it is usually listed once only in this report. Clarifications and corrections are appreciated; please leave them in this article's comment box at the bottom of the page.
(This section will include active RfAs, RfBs, CU/OS appointment requests, and Arbcom elections)
“ | In May: […] | ”
—Adapted from Engineering metrics, Wikimedia blog
The WMF's engineering report for May was published recently on the Wikimedia blog and on the MediaWiki wiki ("friendly" summary version), giving an overview of all Foundation-sponsored technical operations in that month (as well as brief coverage of progress on Wikimedia Deutschland's Wikidata project and Wikimedia CH's Kiwix offline reader project, which, the report noted, recently released its first version for Android). Although the ten headline items will be the major focus of this "Technology report", the WMF-led publication also contains a myriad of updates about smaller initiatives which interested users should peruse at their leisure.
As has been the trend in recent months, the choice of headlines mirrors the use of blogposts on the Wikimedia Techblog. Among the teams to blog the most, the Foundation's Language Engineering team wrote of their efforts to attract an intern, deploy the UniversalLanguageSelector, and make it easier to internationalise an external MediaWiki installation. Another busy team was that focussed on the Foundation's "Wikipedia Zero" project, aimed at giving free access to Wikipedia in developing nations via portable devices. The team reported that during May they had "[worked to launch] Wikipedia Zero in Pakistan, refactored its legacy codebase, migrated configuration from monolithic wiki articles to per-carrier JSON configuration blobs, generated utility scripts, patched legacy hyperlink redirect and content rendering bugs, and supported partner on-boarding" against the backdrop of widening adoption. Finally, the Foundation's soon-to-be-flagship project to improve talk pages, Flow, entered its community consultation phase during April.
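The report does not publish the schema of those per-carrier JSON configuration blobs, but the design idea is easy to illustrate: one small document per carrier, read by code instead of scraped from a monolithic wiki page. The sketch below invents plausible field names purely to show the shape of such a configuration and a lookup against it; none of the names reflect the real Wikipedia Zero format.

```python
# Illustration of the per-carrier configuration idea described above.
# The actual Wikipedia Zero schema is not given in the report; every
# field name here is a hypothetical stand-in.
import json

CARRIER_CONFIG = json.loads("""
{
  "example-carrier-pk": {
    "name": "Example Carrier (Pakistan)",
    "languages": ["en", "ur"],
    "zero_rated_hosts": ["en.zero.wikipedia.org", "ur.zero.wikipedia.org"],
    "banner_text": "Free Wikipedia from Example Carrier",
    "show_images": false
  }
}
""")

def config_for(carrier_id):
    """Look up one carrier's settings; None means 'not zero-rated'."""
    return CARRIER_CONFIG.get(carrier_id)

cfg = config_for("example-carrier-pk")
if cfg:
    print("Zero-rated hosts:", ", ".join(cfg["zero_rated_hosts"]))
```

The advantage over wiki-page configuration is that each carrier's blob can be validated, versioned and deployed independently.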
Highlights from last month's report, which the Signpost did not cover extensively at the time, included details on an area in which the Foundation has recently begun to hire – multimedia engineering – with the commitment to ensure that "contributing an image or video to an article while you're editing does not require leaving the 'edit mode'"; as this month's report notes, however, the Foundation is still having to fix bugs in its media handling backend, as well as its core TimedMediaHandler video player, which appear to be more likely targets for development in the interim. A second highlight featured another cornerstone project, Wikidata, in the wake of news that Russian technology firm Yandex is to donate €150,000 to support its development. Entitled "The Wikidata Revolution", the blog post details the march of Wikidata's second (infobox) phase, while the Wikidata team has more recently announced progress integrating new datatypes, including date-time and geocoordinate displays.
Though neither monthly report commented greatly on any disappointments the Foundation has had over the past two months, it is clear that many of the perennial concerns – project delays, variable community resistance, and code review – remain ever-present worries. Commenting on the last of these, the report noted that WMF Technical Contributor Coordinator Quim Gil has been "preparing a proposal to get automated community metrics" with the potential to help the Foundation better understand the health of the volunteer community, given the spiraling number of unreviewed (but still open) commits.
Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.
Richard Farmbrough was set to have his day in court, but as events transpired, this was not to be so. On 25 March 2013, an accusation was made against Farmbrough at Arbitration Enforcement (AE), claiming that he violated the terms of an automated edit restriction. Within hours, Farmbrough had filed his own request with the arbitration committee, citing the newly filed AE request and claiming that the motion was being used "in an absurd way" in the filing of enforcement requests: "I have not made any edits that a sane person would consider automation."
The AE arm of the arbitration committee blocked Farmbrough for one year, after receiving a go-ahead from arbitrator T. Canens and without waiting for input from either Farmbrough or the community. The committee, noting that Farmbrough was blocked, then declined to consider Farmbrough's request.
Richard Farmbrough is something of an icon in the Wikipedia saga. In 2007, Smith Magazine interviewed him as one of the most prolific editors on Wikipedia. In 2011, he was cited by R. Stuart Geiger in "The Lives of Bots" as the creator of the {{nobots}} opt-out template and an advocate of the "bots are better behaved than people" philosophy of bot development. Farmbrough is also credited with coining the word "botophobia", to make the point that bot policy needs to be as responsive to public perceptions as to technical considerations. Farmbrough described himself to the Signpost as "a reader and sometime editor and administrator of the English Wikipedia ... [I've] contributed to and started many articles, worked on policy, edited templates, created and organised categories, participated in discussions, helped new users, run database extraction, created file lists and reports for Wikipedians, done anti-vandal work, and was a host at Tea-house. I also wrote and ran bots."
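The {{nobots}} convention Farmbrough created is simple enough to sketch: a compliant bot inspects a page's wikitext before editing and backs off if the page opts out. The following is a simplified illustration of the widely used {{bots}}/{{nobots}} exclusion markup, not Farmbrough's own implementation, and it ignores some of the convention's edge cases.

```python
# Simplified exclusion-compliance check for the {{bots}}/{{nobots}}
# opt-out convention: a compliant bot skips any page that opts out of
# bot edits. A sketch of the convention, not Farmbrough's own code.
import re

def bot_may_edit(page_text, bot_name):
    """Return False if the page's wikitext opts out of edits by bot_name."""
    if re.search(r"\{\{nobots\}\}", page_text, re.IGNORECASE):
        return False
    m = re.search(r"\{\{bots\s*\|\s*deny\s*=\s*([^}]*)\}\}", page_text, re.I)
    if m:
        denied = [n.strip().lower() for n in m.group(1).split(",")]
        if "all" in denied or bot_name.lower() in denied:
            return False
    m = re.search(r"\{\{bots\s*\|\s*allow\s*=\s*([^}]*)\}\}", page_text, re.I)
    if m:
        allowed = [n.strip().lower() for n in m.group(1).split(",")]
        return "all" in allowed or bot_name.lower() in allowed
    return True  # no exclusion markup: bots are welcome by default

assert bot_may_edit("Some article text.", "HelpfulBot")
assert not bot_may_edit("{{nobots}} Fragile page.", "HelpfulBot")
assert not bot_may_edit("{{bots|deny=HelpfulBot}}", "HelpfulBot")
assert bot_may_edit("{{bots|allow=HelpfulBot}}", "HelpfulBot")
```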
All of the bots' tasks were approved by BAG, the Bot Approvals Group, "although in the less restrictive environment of 2007 a more liberal approach was taken to 'obviously' good extensions of existing tasks than was later the case." Before being submitted to BAG's testing regime, bot tasks underwent a significant amount of manual testing. In one typical case, Farmbrough manually checked and saved more than 3000 edits over the course of six or seven weeks.
None of Farmbrough's bots are currently running. Some of the code and data from his bots is used in other bots, such as AnomieBot and AWB-based bots. AnomieBot has taken over some of Helpful Pixie Bot's dating tasks, but the other general fixes are not being performed.
So what went wrong? "In September 2010 I made some changes to the general clean-up, there was some opposition and I agreed to revert the changes ... However, an avalanche had been unleashed, and the matter was escalated to ANI. Subsequently I removed all custom general fixes, and rewrote the entire bot in perl, since AWB at that time could not meet the exacting standards that were being demanded. ... One would think that having agreed to do everything asked, and even gone beyond it, the matter would have rested there; but a series of ANI and ARB filings ensued, some rejected out of hand, others gaining traction until by mid-2012 it had become impossible to edit."
As one observer put it, "What we are seeing here is 'The War of the Dwarves and the Gnomes'. Dwarves are editors who work mainly on content, and typically put a lot of thought into each edit; gnomes are editors who work mainly on form, and tend to make large numbers of edits doing things like changing a - to a –. Richard is a Supergnome, and the comparatively small fraction of errors generated by his huge volume of automated edits ended up costing the dwarves who maintain articles an enormous amount of time. Eventually, after repeated failed attempts to rein him in, the outraged dwarves banded together to ban him."
The outcome of the 2012 Rich Farmbrough arbitration case, along with its subsequent motions, was not at all in his favor. It contained the wording of the automation restriction that has become so controversial: "Rich Farmbrough is indefinitely prohibited from using any automation whatsoever on Wikipedia. For the purposes of this remedy, any edits that reasonably appear to be automated shall be assumed to be so." A later "amendment by motion" stated "Rich Farmbrough is directed ... to make only completely manual edits (i.e. by selecting the [EDIT] button and typing changes into the editing window)".
The Arbitration Enforcement administrator, however, stated that "it appears very improbable that this sort of repetitive change was made without some sort of automation, if only the copy/paste or search/replace functions (which are forbidden under the terms of the decision, which prohibits 'any automation whatsoever')", and defined "find and replace" as automation because "it produces the effect of many keystrokes with one or few keystrokes". If "search and replace" is automation, replied the commenters, then so is "copy and paste" or signing posts with four tildes. Farmbrough pointed out that caps-lock also fits the definition of producing the effect of many keystrokes with one keystroke.
What interpretation of "automated edits" is reasonable? We asked Farmbrough if some automated edits are potentially damaging and others not:
“ | In order to establish a useful definition of "automated" we should establish why we want it. The putative reason is that automation "gone wrong" allows creation of many many errors that us poor humans cannot deal with. This is false, though, as I have inspected about 176,000 edits of HPB looking for a particular error; it took maybe a few hours—try inspecting that many human edits. Moreover, automated edits can be rolled back and reapplied in an emergency. Nonetheless, granting that people have reservations, [the definition of automation] should clearly be looking at edits that are repetitive, high speed and affect many pages. | ”
It has been suggested that this will have a chilling effect on other bot operators, who will be afraid of making mistakes and getting banned. Says one talk page commenter, "A lot of bot ops and potential botops think twice before starting a bot. I have talked with several editors who want to but are afraid that if they make mistakes the zero defect mentality will get them banned."
Arbitrator T. Canens responded:
“ | Obviously bots cannot run if the botop is blocked/banned. However, nothing prevents other botops from taking over Rich's bots, provided that they comply with all relevant policies and guidelines, including promptly addressing any concerns about the bot raised by other members of the community. Bot defects are unavoidable (though I'm not sure if there's any statistics documenting exactly how frequent it is). The point is that botops need to be responsive to community concerns and promptly fix any reported defects. We have many bot operators, but to the best of my knowledge only a very small number were ever blocked/banned over bot-related issues (Rich and Betacommand are the only two that come to mind). | ” |
We did not think to ask whether sub-optimal edits are beneficial, as long as they move the project forward, but both Farmbrough and T. Canens identified this as an issue.
Said T. Canens, "It is very clear to me that the committee in both the initial sanction and the subsequent motion intended to ban all forms of automated editing whatsoever from Rich, regardless of whether any particular automated edit is beneficial. In general, this happens when the Committee determines that 1) the disruption caused by the totality of the automated editing outweighs the benefits of said editing and 2) there is no less restrictive sanction that is both workable and capable of preventing further disruption. In this case, for instance, given the high volume of Rich's automated edits, a remedy that only prohibits him from making problematic edits would be impractical."
Farmbrough stated, "What we should be concerned about is the encyclopedic project: is something someone is doing damaging or benefiting the project? If it is damaging we should look at steps to address that; if it is benefiting we should look at ways to improve it further."
The Arbitration Enforcement request against Farmbrough was initiated at 10:29, 25 March 2013, and closed less than 13 hours later, at 23:04, with only the accuser and the AE administrator participating. After a request to leave the case open a little bit longer for discussion was declined, discussion continued on Sandstein's and Rich Farmbrough's talk pages.
T. Canens' statement at Farmbrough's Arbcom request that "I think the AE request can proceed as usual", and Richard's subsequent block, received comments at various talk pages ranging from "[it is] somewhat strange that T. Canens should encourage blocking of an editor who has made an appeal to ArbCom" to "the comments from arbitrators seem to say 'block him, we're not going to change the sanction' (T. Canens) and 'we're not going to change the sanction because he's blocked' (Carcharoth and Risker)."
"I was amazed that one arb suggesting Sandstein go ahead was considered authority to do so," Farmbrough told the Signpost. "Even more at the circular argument 'Rich is blocked so the request to remove the provision he was blocked under is moot'".
We asked arbitrator T. Canens why he had Farmbrough blocked while his Arbcom request was still open.
“ | The filing of an amendment request to lift a sanction is by itself insufficient to delay an AE request seeking the enforcement of said sanction; were it otherwise, people could file tons of meritless amendment requests in the hopes that they'll delay the AE request long enough to get it closed as stale. There needs to be at least a reasonable probability that a majority of arbitrators would in fact grant the appeal to justify delaying action on the AE thread, but Rich's appeal is very unlikely to be granted, as the committee views with disfavor 1) multiple appeals in a short time period, and 2) appeals to lift a sanction that has been recently violated, as Rich's appeal fits both. Note that any AE block would not prevent Rich from emailing the committee with any additional comments on his request. | ” |
There was also some disagreement over the intentions of the arbitration committee with regard to automation and role of AE.
According to one interpretation of the Farmbrough arbitration case, "it isn't the automated editing itself that is harmful/disruptive, and if there is no harm being done here then the 1 year block does not prevent any problems. So in that sense it is neither punitive nor preventative!" and "the Enforcement By block section says 'may be blocked...' which I can't read any other way than to imply that some discretion is given to administrators to not block or to block for a shorter period when, for example, the infraction was so exceedingly minor or when there is no or very little disruption."
According to another view, "the underlying decision of the Arbitration Committee [was] to consider all automated editing of whatever nature by Rich Farmbrough to be harmful, and to ban all such editing. ... Because Arbitration Committee decisions are binding, AE admins in particular have no authority to question the Committee's decisions; they must limit themselves to executing the decisions."
We asked T. Canens if, under these circumstances, "the arbitration committee needs to clarify their intentions about automation and mass editing". Canens replied:
“ | Admins are volunteers and are free to refrain from taking action on any AE request if they do not want to, though they are not free to overturn another admin's AE action except under certain limited circumstances. For example, AE has, on occasion, declined to block for isolated 1RR violations when the edit being reverted was unquestionably problematic, even if it doesn't fall in one of the xRR exemptions. If Rich's edit were indeed completely correct, this fact might justify letting this particular violation go. On the other hand, the wilful nature of Rich's violation and his history of violations of the restriction would counsel against overlooking the violation. It is up to the AE admins to balance these competing considerations; arbcom generally does not interfere with AE in the absence of clear error or new developments. | ” |
"I just want to get back to editing" says Farmbrough. "Wikipedians do not edit for thanks and barnstars, though they are both nice to receive. It is however a big disincentive to edit, and part of the hostile environment, when there's a constant (and I do mean constant) threat hanging over every editor's head that they're going to have to spend days and weeks fighting off ANI threads and Arbcom cases every time they do something that someone doesn't like."
Given the absence of any other formal mechanism for dealing with automation disputes, that may be exactly what will happen once the block is over.