Note: This is a condensed version of a report that will be included in the upcoming edition of the Wikimedia Quarto.
Wikipedia is now the most popular reference website on the internet in terms of market share, according to data disclosed last week. Confirmation of this milestone came thanks to a report donated to the Wikimedia Foundation by Hitwise, a company that tracks internet and search engine usage.
According to data provided by Hitwise, the Wikipedia.org domain surpassed Dictionary.com as the most popular reference site for US internet surfers in late May. In the two months following this achievement, Wikipedia has strengthened its lead by nearly two percentage points. Wikipedia's market share in the "Education - Reference" category now stands at 5.86%, while Dictionary.com has dropped to 4.01%.
Also of significance is that Answers.com, which includes content from Wikipedia as well as other sites, has gone from not registering at the beginning of 2005 to a market share of 2.33%. Among other encyclopedias, Encarta comes in at 1.32%, while Britannica does not appear in the results, presumably because its content requires payment to access.
Alexa traffic rankings also track individual categories, including one for "Reference". There Wikipedia ranks second, but the logic of the category is difficult to understand, since the first spot is held by My Yahoo!, which is really a customizable portal page.
The Hitwise report indicates that of all traffic coming in to Wikipedia from outside the site, 66% originates with search engines. Breaking this down by source shows that out of the search engine traffic, 50% comes from Google and 43% from Yahoo. MSN Search only brings in 3%, an unexpectedly low figure given recent reports on the market share of the search engines themselves. Yahoo performs better than its market share would suggest, which is not surprising given that Wikipedia was included in Yahoo's content partnership program over a year ago. MSN, which is closely tied to Encarta, obviously has no such relationship, although Wikipedia articles do seem to show up reasonably normally in its search results.
Wikipedia's visibility on Google has increased significantly in recent months, as the Google web crawler went from having 3 million Wikipedia pages indexed in March to more than 25 million in July. Pages from the Wikipedia.org domain currently occupy 0.32% of all pages in Google’s index, a number that matches up closely with the proportion of search engine traffic that Wikipedia attracts, which is two-tenths of one percent. Wikipedia is now the 22nd most recommended site among search engines.
Among the thousands of keywords for which Wikipedia ranks high in search engine results, several in particular consistently draw considerable traffic. These include Bobby Fischer, Jacques Chirac, Vladimir Putin and, most notably, Pope John Paul II. With Wikipedia having been established as a popular source of information about the papacy, the election of Pope Benedict XVI caused a large influx of traffic, peaking at 2,100 requests per second. The number was spectacular at the time, but as hardware capacity has expanded, only two months later it is common for the servers to receive over 2,500 requests per second.
The Encyclopædia Britannica responded last week to the challenge it faces from Wikipedia, gaining some publicity with its announcement of the formation of an editorial advisory board. This group of luminaries is supposed to help maintain Britannica's standards "while making sure it remains relevant to the way people use information today."
The announcement was the focus of a detailed article by Eric Ferkenhoff in the Boston Globe, "Venerable encyclopedia seeks just the facts". Ferkenhoff painted the move as an effort to reassert authoritative sources of information "in an age when the Internet has loosened the definition of what is factual." The story was also covered more briefly in the Washington Times [1].
As reported by the Globe, the board "will meet twice a year to plot the direction for Britannica and fine-tune its editorial content". The idea of an advisory board is not new, but it had not been in operation at Britannica for over a decade. Wendy Doniger, the only holdover from the previous board, indicated that it last met in 1995, falling out of use shortly after the debut of Britannica's online edition.
Although the board was announced with a press release on Thursday, its composition had already been available on Britannica's website for some time. In fact, the Wikipedia biographies of a number of the board members were updated in June to reflect their participation.
Britannica billed the advisory board as comprising "fifteen of the world's leading scholars and intellectuals". Besides Doniger, the members of the new board include Rosalía Arteaga, David Baltimore, Benjamin M. Friedman, Leslie Gelb, Murray Gell-Mann, Vartan Gregorian, Zaha Hadid, James M. McPherson, Thomas Nagel, Donald Norman, Don Michael Randel, Amartya Sen, Wole Soyinka, and Lord Sutherland.
Wikipedia should perhaps be embarrassed that it does not have an article on Gelb at this writing (Update: a new article now exists), and that a number of the articles are nothing more than stubs. Then again, a search of Britannica's online edition revealed that it only contained articles for five of its own board members — Baltimore, Gell-Mann, Hadid, Sen, and Soyinka. They do, however, all have brief biographies available on the corporate information page about the editorial board.
Some observers wondered how much substantive improvement to Britannica's product would result from this new announcement. Jeremy Wagstaff noted that the Globe article quoted Jimmy Wales in addition to Britannica's chief editor and some of the board members, but suggested that Britannica was allowed to "put a bit too much of their spin on the story." David Weinberger said, "It seems to be primarily a PR effort".
Writers in the media were largely impressed by Wikipedia, the exception being coverage of a town concerned about its image. The favourable reviews included one from a site sometimes identified as a competitor.
Detroit Free Press writer Mike Wendland wrote about Wikipedia this week and formed a very favourable opinion, describing the encyclopaedia as 'unlike any other one-stop reference you've ever consulted' [2].
Wikipedia dominates news coverage of Wikimedia Foundation projects, but Wendland mentioned Wiktionary, Wikiquote and Wikispecies in his article, and said he had found 'no greater net resource for wasting time' than Wikipedia and related sites. Wendland quoted facts from Medieval hunting and Ninety Mile Beach as examples of the kind of 'obscure trivia' he enjoyed finding in Wikipedia, and said that the abundance of links in each article meant that 'before you know it, the birds are singing outside your window and you stayed up all night'.
Wendland considered the common criticism that because anyone can edit Wikipedia it cannot be entirely reliable, and quoted from the project's own consideration of this criticism which acknowledges that the encyclopaedia contains much 'well-meaning, but ill-informed and amateurish work' but notes that the project is very much a work in progress rather than a finished project.
The article noted the similarities between Wikipedia and h2g2, a collectively written 'unconventional guide to Life, the Universe and Everything' and brainchild of Douglas Adams, and looked at the growing influence of wiki-based sites on the Internet, but said that Wikipedia is 'clearly the standard for open-source information'.
One of Wikipedia's main competitors in the internet knowledge resource market, About.com, has looked at the remarkable growth of Wikipedia since its launch (http://websearch.about.com/b/a/188129.htm). About.com blogger Wendy Boswell declared herself 'not a good graph reader' but still able to see that Wikipedia's growth charts were 'pretty dang cool'. However, Boswell perhaps misunderstood the aim of the project when she said that 'So if you look up something about cats and find that you are not agreeing with the person who stated that cats can't use a vacuum cleaner, you can add your opinion' - Wikipedia being a neutral encyclopaedia rather than a place for people to propose ludicrous ideas.
The Belfast Telegraph reported this week on the plight of County Antrim town Ballymena, which it claimed had been 'ridiculed in the biggest open access encyclopaedia on the internet' [3]. The Telegraph underestimated Wikipedia's size by a factor of ten when it stated that the project 'currently carries about 50,000 articles', but found the Ballymena article somewhat deficient. Some content appeared to be sarcastic, claiming that the town contained two of Europe's best-kept housing estates, and urging visitors to visit "the green and gentle pastures of the Doury Road and Ballykeel".
More controversial was a claim that the town is the heroin capital of Europe. The mayor of the town, Tommy Nicholl, disputed this, calling the claim "ludicrous and not true" and asking, "Why doesn't whoever wrote this highlight good things about Ballymena?", although he acknowledged that the town did face a drugs problem. The Telegraph quoted Encyclopædia Britannica executive editor Ted Pappas as saying "The premise of Wikipedia is that continuous improvement will lead to perfection. That premise is completely unproven".
Since the publication of the Telegraph article on 22 July, the Ballymena article has been edited several times. The edits removed the sarcasm and dubious claims that the Telegraph highlighted, and added sources and external links, which were previously absent.
The editor of the BBC News website Pete Clifton writes a weekly column for the site, but recently handed over for two weeks to a replacement chosen from the general public. Researcher Ed Moran from Oxford took over, and in his first column remarked on his surprise and awe at finding that Pete Clifton had an entry [4], although he suggested that Clifton had probably written the article himself.
In his second column [5] Moran reported one reader as writing in to say "Dude. It's Wikipedia. Anyone can be in it", and an article on him duly sprang up. Moran acknowledged, though, that while the article on Pete Clifton had survived a vote for deletion, his own was unlikely to do so. His prediction was borne out: Neutrality put the article forward for deletion, and the vote looks set to remove Ed from Wikipedia.
Among the citations of Wikipedia articles in the press this week: University of Illinois newspaper The Daily Illini quoted from confirmation bias in an article on the flaws inherent in scientific studies [6]; the Cincinnati Enquirer cited the existence of an article on Coingate as evidence that a worker's compensation fund scandal in Ohio would shape the state's political future [7]; The Times discovered that the Irish equivalent of the British chav is a scanger [8]; and an article published in several Canadian newspapers looked to Wikipedia to expose the history of the word metrosexual [9].
Scholars and writers last week pursued an active dialogue that explored several theoretical approaches to understanding how Wikipedia works. Lawrence Lessig's blog served as a focal point for these discussions, but the theme was taken up elsewhere as well.
Although the most active conversation came on Lessig's blog, the impetus for it actually derived from his absence. Lessig takes a month off from internet activity each year to spend time with his family, and he started his vacation this past week. Among the guest-bloggers who will be filling in for him is Jimmy Wales, who is scheduled for two weeks later in August.
The discussion was precipitated by Lessig's first substitute, law professor Cass Sunstein, who spent quite a bit of time discussing Wikipedia as a mechanism for aggregating information. His emphasis was on comparing Wikipedia and similar phenomena with the way prices in a market economy also have a function of collecting information. Sunstein asked, "When, in particular, will wikis and the blogosphere fail as mechanisms for aggregating dispersed information?"
Wales revealed that he considers Friedrich Hayek's work on price theory "central to my own thinking about how to manage the Wikipedia project." The analogy to prices prompted consideration of one particular economics-based tool for aggregating information, prediction markets. This turned attention toward the speculation then raging about whom George W. Bush would be nominating for the US Supreme Court vacancy, an example of the type of news event for which Wikipedia is a popular source (see related story).
In response to Sunstein's challenge, a few people questioned whether wikis even succeed at aggregating information in the first place. Sunstein posited that it depended on the Condorcet jury theorem: In a group whose members have more than a 50% chance of being right, the likelihood of their average answer being correct approaches 100% as the number of members increases; meanwhile, if their chance of being right is below 50%, increases in group size cause the likelihood of the average being right to approach zero instead. He noted, however, that market systems would probably work better because they provide additional incentive to get the information right.
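The behaviour Sunstein describes can be checked numerically. The following is only a minimal sketch, not anything taken from his posts: it uses the majority-vote form of the jury theorem, assumes that every member is independently right with the same probability, and the function name and the 55% and 45% figures are chosen purely for illustration.

```python
from math import comb


def majority_correct_probability(p: float, n: int) -> float:
    """Probability that a strict majority of n voters is right.

    Assumptions (ours, for illustration): each voter is right with the same
    probability p, voters are independent, and n is odd so ties cannot occur.
    """
    if n % 2 == 0:
        raise ValueError("use an odd group size so that ties cannot occur")
    needed = n // 2 + 1  # smallest number of correct votes that forms a majority
    # Sum the binomial probabilities of every outcome with a correct majority.
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1)
    )


if __name__ == "__main__":
    # Compare a slightly-better-than-chance group with a slightly-worse one.
    for size in (1, 11, 101, 501):
        above = majority_correct_probability(0.55, size)
        below = majority_correct_probability(0.45, size)
        print(f"n={size:4d}  p=0.55 -> {above:.4f}   p=0.45 -> {below:.4f}")
```

Running the loop shows the first column climbing toward 1 and the second falling toward 0 as the group grows, which is the pattern the theorem predicts under these idealized assumptions. Real contributors are neither independent nor equally reliable, so the sketch says nothing about how well wikis do in practice.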
Another approach to understanding Wikipedia came last Friday from Rohit Gupta in the USC Online Journalism Review. He describes Wikipedia as "a sprawling metropolis under construction by purpose-driven swarms", finding a particular parallel to the city of Mumbai (Bombay). Mixing in analogies to Hindu mythology, his analysis is called "The avatar versus the journalist: Making meaning, finding truth". Tlogmer called it "the best article about wikipedia I've read in ... well, ever."
Gupta paints a very colorful portrait, filled with constant activity. To highlight the creative process on Wikipedia, he uses the image of avatars extensively, noting that "it is the central idea of creation in Hindu mythology". The avatar in this case is the persona adopted by individual Wikipedia editors, and Gupta observes, "Each person can be many avatars".
Having brought in the Hindu imagery, Gupta points out that its popular mythology developed through methods similar to Wikipedia's, based on "authorless, collaborative texts". He goes on to compare online culture to the development of human civilization. In this comparison, he says, wikis mark the point at which society moves out from "cave-dwelling" into a more settled life that "resembles agriculture and farming more than anything else."
To deal with recent resignations, three new members were appointed to the Arbitration Committee last week. Even before the appointments, though, the committee did overcome its recent inactivity and close two existing cases. However, the mentorship system seemed to be struggling as one such arrangement had to be dissolved.
On Friday, 22 July, Jimmy Wales announced that he was appointing James Forrester, Fennec, and Jayjg to the Arbitration Committee. They take the place of Delirium, Ambi, and Grunt, who had previously stated that they would resign once replacements were found. One additional appointment is also expected, as Nohat last week joined the other three in indicating that he would resign.
The newly appointed arbitrators will serve until the end of the year, at which point their positions will be up for election along with the other arbitrators whose terms expire then. The procedure differed from the selection of interim arbitrators Raul654 and Jwrosenzweig to fill similar vacancies last year, as they were chosen in a special election. Wales said, "For an emergency, I don't think we need to go through that whole process." Instead, he chose the temporary arbitrators from a group of names suggested by the existing arbitrators.
Pcb21 criticized the process of selecting the new arbitrators, saying that other options should have been considered besides "hidden behind closed doors, cliquey appointments." However, Raul654 pointed out that the Board of Trustees election had only just concluded, the next arbitration election was only some five months away, and the previous one had been "a horribly nasty experience that no one wants to repeat more often than necessary". Wales acknowledged the concerns but indicated that he thought his selections would be "mostly uncontroversial". He added, "I think that me randomly appointing people midterm is not the proper sustainable way to do this in the long run."
A pair of cases were cleared out without needing the participation of the new appointees. On Thursday, the arbitrators closed their oldest case, a dispute between Tkorrovi and Paul Beardsell over the Artificial consciousness article. A temporary injunction had prohibited them from editing the article while the case was open, allowing other users to work on it.
In the ruling, this prohibition was extended for another three months for Tkorrovi and indefinitely for Beardsell. The arbitrators noted a number of personal attacks in the dispute and placed both on a six-month parole. They also noted that much of the dispute revolved around information that lacked references and was likely original research. Accordingly, an additional parole made Tkorrovi and Beardsell subject to short blocks if they "reinsert any unreferenced or poorly referenced material in the article".
Tkorrovi called the decision a "fair solution" and indicated his acceptance of the arbitration process. Beardsell, however, as the case was about to close, suggested that the arbitrators declare a "mistrial" instead and that everyone simply agree to put the matter behind them. Throughout the process, Beardsell fought the case purely by argument, and never presented any evidence.
In the second case, which was closed on Friday, the arbitrators banned Zivinbudas for one year. They described him as editing "from an immature Lithuanian nationalist perspective" (a frequently-used edit summary read "Polish stupidity"). Due to his use of IP addresses to edit, the ruling indicated that any edits involving his trademark behavior may be reverted.
Meanwhile, three new cases were opened last week. One deals with the conduct of debate over the Race and intelligence article, which recently has been nominated both for deletion and for featured article status (neither was successful). While this one included complaints of personal attacks, another case was brought against AI based on concerns that he was going too far in removing personal attacks and disrupting debate on talk pages.
The remaining dispute involves Coolcat, who complains of being "stalked" by Stereotek and Davenbelle. In this matter, Tony Sidaway suggested that mentorship for Coolcat might be appropriate. The fate of the mentorship idea is uncertain, however, as one of the existing mentorship arrangements collapsed last week.
Netoholic's last remaining mentor, Raul654, resigned on Tuesday saying that formal mentorship was not working. Arbitrator sannse indicated that Netoholic is now subject to a ban from the Wikipedia and template namespaces, and a restriction to one revert per day, as was ruled previously. Raul654 said the system failed because it left the mentors as the only ones able to intervene, and that problems went unresolved if the mentors were unaware of them. For his part, Netoholic said he thought the mentorship had been working, but it "failed because it meant different things to everyone involved."
Six months after the previous modification of the speedy deletion criteria (see archived story), a new expansion proposal led to the addition of four new criteria and the rewording of one. A vote on this proposal held over two weeks (see related story) ended in the passage of measures that partially address the problem of handling vanity pages, along with some other technical changes.
Specifically, the vote results allow the speedy deletion of articles that do not claim any notability of their subject, serve only to disparage their subject, or simply rephrase the title of the article. In addition, articles that are transwikied in accordance with Votes for deletion consensus may now be speedily deleted afterwards. Furthermore, additional clarification was added to the criterion that calls for the deletion of articles recreated after earlier deletion.
The motivation for this vote came from a sentiment that the votes for deletion process was growing out of hand, and its sheer size would make it prohibitively difficult for people to participate. The push to avoid this focused on restricting votes for deletion to those cases where some doubt remains, so that obvious "keeps" would not be nominated and obvious "deletes" could be expedited. Meanwhile, another emphasis was educating people on the deletion process generally, as well as alternatives to deletion. Efforts have included compiling deletion precedents and guidance on merging pages, as well as an attempt to reach a compromise for the hot-button issue of schools.
To lay the groundwork for the vote, the proposals were developed over a month of discussion and debate. After discarding unworkable proposals and trying to define cases of obviously deletable articles in simple wording, the proponents submitted twenty proposals. These were put up for voting over a two-week period, and considerable effort went into publicizing the vote.
The two most radical proposals were intended to make it easier to get rid of vanity articles - one by Doc Glasgow to allow for deletion of an article on a person that does not assert that person's notability, and another by Uncle G dealing with an article on a person less than twenty-five years old that does not cite a source. Articles about an unremarkable person written by the subject of the article, or his or her friends or classmates, are a regular occurrence and votes for deletion deals with more than twenty such articles per day.
These proposals illustrate two different approaches to the same issue. The notability proposal relies on a common-sense reading of the article to determine if it indicates anything noteworthy about the subject. An example indicated that an article saying "John Doe is good at chess" does not assert notability, while "John Doe won the 2003 UK Chess Trophy" does. This proposal passed with 74% support. On the other hand, the proposal using the person's age to justify deletion was based on an analysis of recent deletion votes. The analysis indicated that an age below 25 and the failure to cite a source provided a reasonable proxy for situations in which deletion was the most likely outcome. However, the measure was voted down with 49% support.
Some proposals related to the problem of vanity pages dealt with unremarkable bands, websites and clubs; none of these passed, as they received 69%, 58% and 37% support respectively. One argument for having a speedy deletion criterion for websites is that regular deletion of these articles often leads to an influx of sockpuppets and new users. At the same time, good criteria for identifying a website as unremarkable are difficult to establish, since objective measures such as Google and Alexa data are frowned upon.
The proposal on bands came closest to passing, as the threshold set for the vote was 70%, but a number of people expressed concern about the standard for deletion used. It would have relied on guidelines set by WikiProject Music, but this drew a number of objections indicating that the criterion for deletion should stand on its own. In an effort to overcome these objections, a discussion is ongoing to consider an alternative.
Other proposals that passed include one for articles moved to another Wikimedia project after an earlier deletion vote; one to clarify when it is and isn't appropriate to delete an article that was earlier deleted and then recreated; one to formalize the practice of deleting attack pages; and one to allow for deletion of articles that have no content other than a rephrasing of the title.
Other proposals that received majority support but fell short of 70% included those for articles duplicated in Wikibooks or Wikisource, and deletion of articles related to fan fiction. The proposal for deletion of characters from role-playing games failed, partly because it did not distinguish between publisher-created and player-created characters. Also failing were proposals for articles of one sentence or less, along with duplicates of Wiktionary articles.
A number of proposals did not relate directly to votes for deletion, but instead to other aspects of the deletion system that do not necessarily have the same load problems. These proposed criteria for speedy deletion, which related to issues such as images, categories, templates, and copyright violations, were all rejected.
With the large scale of the vote and preceding discussion, the proposal encountered criticism from a number of people. Complaints aired included that the six-week period of discussion preceding the proposal had been too short, and that too many proposals went to a vote simultaneously.
Some people expressed disagreement with the practice of adding new proposals while the vote was in progress, and there were objections to the suffrage requirements that would cause all votes from users with fewer than 250 edits to be discounted. However, none of the late additions received enough support to pass anyway, and the criteria for discounting votes did not appear to affect any of the outcomes. Finally, a counterproposal declaring the entire proposal an instance of instruction creep gained only 23% support.
The Wikipedia article on United States Supreme Court nominee John Roberts was the focus of considerable attention last week, and not just in terms of heavy editing. It also inspired a joke that transformed into a rumor circulating in the blogosphere that Roberts might be homosexual, or at any rate a rumor that people might be trying to spread such a rumor.
The incident began Wednesday with a relatively new blogger using the name Manhattan Offender, who highlighted some aspects of Roberts' Wikipedia biography in his post, "How gay is this guy?" Manhattan Offender says that it was entirely a joke, and that he doesn't think the post actually implies that Roberts is gay, which he does not believe anyway.
The idea was nevertheless picked up elsewhere, now amplified by the effect of a Thursday New York Times biographical piece, which both the gossipy Wonkette [10] and the more serious Ann Althouse [11] found laden with subtle clues that Roberts might be a closet homosexual. Both highlighted, among other things, the fact that Roberts had played the part of Peppermint Patty in a high school play, and Wonkette also linked back to Manhattan Offender's post.
Charmaine Yoest [12] called this part of a "whisper campaign" to discredit Roberts with his own supporters, on the premise that they would be disturbed to learn that he was homosexual. From here the story spread to other conservative blogs. Michelle Malkin [13] briefly noted the furor and added, "The 'closet gay' meme is spreading in the swamps", by which she meant some comments on Daily Kos (more speculation from the Kos site). The story also ended up on Power Line [14], which charged "some Democrats" with actively "hinting" that Roberts was homosexual. Jeff Jarvis [15] called this "an absolutely bizarre post".
Although the incident rapidly turned into a partisan debate over the resulting rumor, it had no such overtones when it started, and its originator was apparently innocent of what would transpire. Manhattan Offender says he knew nothing about other sources of the rumor, but simply read the Wikipedia article and found the similarities to his own background funny enough to write about. He adds that he's not even left-wing, but was registered as a Republican until Bush's endorsement of the Federal Marriage Amendment.
The sequence of events was finally analyzed by Marty Schwimmer to see how the joke turned into a rumor. He noted that as the story was retold some elements were omitted, such as the humorous intent, and emphasis on other details shifted so that "the story is changed to make sense to those spreading the rumor." Both Yoest [16] and Althouse [17] responded with objections about how their roles were portrayed by others. Yoest pointed out that there was no reason for her to think that the New York Times at least was trying to be humorous, while Althouse felt that Power Line in particular had twisted her observations in order to reach its result.
Wikipedia will be the beneficiary of a new revenue stream in the near future, as it was included among the recipients of donations announced last week by the open source nonprofit group LinuxFund.
As reported on Friday, LinuxFund will be supporting Wikipedia with a donation of $6,000 over the next year. The announcement marks a renewal of activity by the fund, which had trouble staying on track after the departure of its executive director, but was still bringing in revenue. New Executive Director David Mandel indicated that he was working to catch up on paperwork, and that the group planned to support additional projects in the future.
LinuxFund runs a program to market a Linux credit card and receives a percentage of purchases made by cardholders. The credit card is provided in conjunction with MBNA, a financial services company, in what is known as an "affinity card" program. The money received by LinuxFund is supposed to be redistributed to open source projects.
Wikimedia Foundation Press Officer Elisabeth Bauer told NewsForge, "Like most donations, the money will therefore be used to cover hardware and bandwidth costs, and possibly software development work necessary to provide Wikimedia's free services to the public."
The other projects for which LinuxFund announced donations are the Debian Project and the freenode IRC network. Freenode, which hosts a number of IRC channels related to Wikipedia, had recently been running its own fundraising drive. Some people, however, objected to the fact that a portion of those funds would go toward the salary of its director, Rob Levin.
The potential for this donation to be a part of the Wikimedia Foundation's regular funding will help with planning for the project, according to Bauer. The contributions will be paid out over a one-year time period, and Mandel indicated that the projects will be reviewed annually.
There were 7 new admins, 5 new featured articles, 3 new featured lists, and 6 new featured pictures this week. Other news included the reuse of an image from Wikipedia in the media, and some new international developments.
An image taken from Wikipedia was included as part of a slideshow accompanying a feature about the periodic table in Slate.
A new tool has been developed as an interwiki link checker, using a bot that looks for articles on other Wikipedias that have an identical title, but no interwiki link. Users can manually check to see whether the interwiki link is appropriate using this tool, and the bot will then go through the results and add the links.
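The matching step described above can be illustrated with a short sketch. It is not the tool's actual code: the data model (a dictionary per language mapping each title to the language codes it already links to), the function names, and the sample titles are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical data model: for each language code, a mapping from article
# title to the set of language codes that article already links to.
Wiki = Dict[str, Set[str]]


def candidate_links(wikis: Dict[str, Wiki]) -> List[Tuple[str, str, str]]:
    """Return (source_lang, target_lang, title) triples where both wikis have
    an article with an identical title but the source has no interwiki link."""
    candidates = []
    for src_lang, src_articles in wikis.items():
        for title, linked_langs in src_articles.items():
            for dst_lang, dst_articles in wikis.items():
                if dst_lang == src_lang:
                    continue
                if title in dst_articles and dst_lang not in linked_langs:
                    candidates.append((src_lang, dst_lang, title))
    return sorted(candidates)


def apply_approved(
    wikis: Dict[str, Wiki], approved: List[Tuple[str, str, str]]
) -> None:
    """Add only the links that a human reviewer has confirmed."""
    for src_lang, dst_lang, title in approved:
        wikis[src_lang][title].add(dst_lang)


if __name__ == "__main__":
    sample = {
        "en": {"Paris": {"fr"}, "Jaguar": set()},
        "fr": {"Paris": set(), "Jaguar": set()},
    }
    for triple in candidate_links(sample):
        print(triple)
```

Keeping the human check as a separate step matters because an identical title does not guarantee an identical subject: "Jaguar" could be the animal on one wiki and the car maker on another.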
The Scots Wikipedia reached 100 articles last week, just under a month after it was created.
New Wikipedia administrators last week included Redwolf24, Master Thief Garrett, GregRobson, Hashar, Essjay, Bluemoose, and Moriori.
Five articles were promoted to featured status: Helen Gandy, Sharon Tate, Blackface, Căile Ferate Române (which has already made an appearance on the Main Page), and Tooth enamel.
Three new featured lists were designated: List of UN peacekeeping missions, Zimbabwean national cricket captains, and List of space shuttle missions.
Meanwhile, on Featured picture candidates, six images were promoted.