An article is a construct – hoaxes and Wikipedia

Op-ed

It's frighteningly easy

Wikipedia's policy on vandalism is complex and extensive. Coming in at 41 KB, it is best remembered by the {{nutshell}} wrapper that adorns its introduction, stating that "Intentionally making abusive edits to Wikipedia will result in a block", a threat carried through more often than not. At just over 5k, the guideline on dealing with hoaxes is comparatively slim, and readily admits that "it has been tried, tested, and confirmed—it is indeed possible to insert hoaxes into Wikipedia". It is not hard to tell which is the more robust of the two policies.

First and foremost, this is a consequence of Wikipedia's transitional nature. The site has become mired somewhere between the free-for-all construction binge it once was and the authoritarian, accuracy-driven project it is quickly becoming. The days of rapidly developing horizontal sprawl are long gone, swallowed up by the project's own growth; increasingly narrow redlink gaps and ever deeper vertical coverage are the new vogue, spearheaded by the raising of standards and the creation of such initiatives as GLAM and the Education Initiative. Wikipedia gets better, but it also gets much more specialist in nature, and this has a major impact on its editing body. Explosive growth in both the number of articles and the number of editors, once the norm, has been superseded by a more than halved level of article creation and a declining number of active editors, both in spite of bullish, frankly unrealistic growth projections by the Wikimedia Foundation.^[4] The project has reached its saturation limit – put another way, there simply aren't enough new people out there with both the will and the smarts to sustain growth – and the result is that an increasingly small, specialized body of editors must curate an increasingly large, increasingly sophisticated project.^[5]

“Now, it's pretty much up to yours truly to fix most things that are wrong in any article in the topic area I inhabit, and I just don't have the time to do it all. There are other editors in the topic, of course, but they appear to be in the same predicament.”

— Cla68

A sparser, more specialized editing body dealing with highly developed articles and centered mainly on depth has a harder time vetting edits than a larger, less specialized one focused more on article creation. Take myself as an example: while I have the depth of field to make quality tweaks to Axial Seamount, I could never do as good a job fact-checking Battlecruiser as a Majestic Titan editor could, and I cannot even begin to comprehend what is going on at Infinite-dimensional holomorphy. This hasn't mattered much for pure vandalism: the specialization of tools has proved more than adequate to keep trollish edits at bay. But vetting tools have not improved to the same degree; the best solution available, pending changes, has received a considerable amount of flak for various reasons, and has so far only been rolled out in extremely limited form. On pages not actively monitored by experienced editors, falsified information can and indeed does slide right through; with an ever-shrinking pool of editors tending to an ever-growing pool of information, this problem will only get worse for the foreseeable future.

The relative decline in editor vetting capacity is paralleled by the ease with which falsehoods can be inserted into Wikipedia. Falsified encyclopedic content can exist in one of three states, classified by its potential to fool editors examining it: inserted without a reference, inserted under a legitimate (possibly offline) reference that doesn't actually support the content, and inserted under a spurious (generally offline) reference that doesn't actually exist. While unreferenced statements added to articles are often quickly removed or at least tagged with {{citation needed}} or {{needs references}}, editors passing over a page who aren't knowledgeable about the topic at hand are extremely unlikely to check newly added references, even online ones, to make sure the information is legitimate. This is doubly true for citations to offline sources that don't even exist. Taking citations at face value is standard operating procedure on Wikipedia: think of the number of times that you have followed a link through, looked up a paper, or fired off an ISBN search to ascertain the credibility of a source in an article you are reading; for most of us, the answer is probably "not many". After all, we're here to write content, not to pore over other articles' sourcing, a tedious operation that most of us would rather not perform.

This is why complex falsifications can be taken further than mere insertions: they can achieve the kinds of quality standards that ought to speedily expel any such inaccuracies with great prejudice. The good article nominations process is staffed in large part by two parties: dedicated reviewers who are veterans of the process, and experienced bystanders who want to do something relatively novel and assist with the project's perennial backlog. In neither case are the editors necessarily taking up topic matters they are familiar with (most of the time they are not), and in neither case are the editors obligated to vet the sourcing of the article in question (they rarely do; otherwise who would bother?^[6]), whatever the standards on verifiability may be. And when a featured article nomination is carried through without a contribution of content experts (entirely possible), or the falsification is something relatively innocent like a new quote, such articles may even scale the heights of the highest standard of all in Wikipedia, that much-worshiped bronze star! Nor are hoaxes necessarily limited to solitary pages; they can spread across Wikipedia, either through intentional insertions by the original vandal, or through the process of "organic synthesis"—the tendency of information to disseminate between pages on Wikipedia, either through copypaste or the addition of links.

Then why aren't we buried?

Readers of this op-ed may well take note of its alarmist tone, but they need not be worried: studies of Wikipedia have long shown that Wikipedia is very accurate, and, by derivation, that false information is statistically irrelevant. Well, if, as I have striven to show, manufacturing hoaxes on Wikipedia is so strikingly easy, why isn't it a major problem?

Answering this question requires asking another one: who are vandals, anyway? The creation of effective, long-lasting hoaxes isn't a matter of shifting a few numbers; it requires an understanding of citations and referencing and the manufacture of references to sources. That means investing real intellectual effort in an activity otherwise perpetrated only by unsophisticated trolls and bored schoolchildren, and as it turns out, the difficulty of making a believable case for their misinformation is a high wall for would-be vandals. And even when real hoaxes are made, studies have shown that Wikipedia is generally fairly effective (if not perfect) at keeping its information clean and rid of errors. Hoaxes have reached great prominence, true, but they are small in number, and they can be caught.

But there is nonetheless a lesson to be learned. Wikipedia is extremely vulnerable. If some sophisticated wash wants to launch a smear campaign on the site, falsification would be the way to do it; and that is something that should concern us. The continual unveiling and debunking of hoaxes long after they have been created is a drag on the project's credibility and on its welfare, and when news breaks out about hoaxes on the site in the media it takes a toll on our mainstream acceptance. This is not a problem that can be easily solved; but nor is it one that should be, as it is now, easily ignored.

Addendum: some highlights

“The Quazer Beast was a perfectly normal looking article. After its creation it was categorized, copy-edited, and linked to; it was even vandalized once. That's the standard life cycle for an article. Except that the Quazer Beast is a hoax, and it isn't the only one out there ... it is important that we recognize what is a Quazer Beast and what is not.”

— Society for the Preservation of the Quazer Beast

Sorted by date of discovery, here is a selection of what I consider to be fifteen of the most impactful and notable hoaxes known to have existed on Wikipedia.

November 6, 2003 – February 23, 2004: Uqbar. One of the earliest hoaxes to have been debunked, the kingdom of Uqbar is a historical hoax (a story within a story) that was passed off as real early in Wikipedia's history.
December 2004 – April 2005: Roylee. A referral for comment on four months of activity from a user who "has carried out a sustained introduction of fringe theories and original research into a large number of articles (145 listed at User:Mark Dingemanse/Roylee [defunct]) since December 2004."
May 26 – September 22, 2005: Wikipedia biography controversy. To quote from the article: "a series of events that began in May 2005 with the anonymous posting of a hoax article ... about John Seigenthaler, a well-known American journalist. The article falsely stated that Seigenthaler had been a suspect in the assassinations of U.S. President John F. Kennedy and Attorney General Robert F. Kennedy. Then 78-year-old Seigenthaler, who had been a friend and aide to Robert Kennedy, characterized the Wikipedia entry about him as "Internet character assassination". The hoax was not discovered and corrected until September 2005... after the incident, Wikipedia co-founder Jimmy Wales stated that the encyclopedia had barred unregistered users from creating new articles."
October 5 – 26, 2005: Alan Mcilwraith. A former call center worker who created a new identity for himself as a decorated military man on Wikipedia, complete with an in-uniform portrait (now known to have been bought on eBay). The story hit headlines in April 2006, and the article was recreated, now about the hoax he perpetrated (see Signpost coverage).
? – March 3, 2007: Essjay controversy. The only fabrication on Wikipedia major enough to have a 39k Good article to call all of its own, this was a hoax not in the classical sense—that is, not carried out across the mainspace—but in an extremely prominent editor's falsified credentials; when combined with a poorly timed promotion to ArbCom, the result was a spectacular fireworks display.
November 2005 – June 21, 2007: Baldock Beer Disaster. A disaster in more ways than one; the article appeared on the Main Page as a Did you know? entry on November 25, 2005, and was not rooted out until more than a year and a half later.
November 18 – December 18, 2008: Edward Owens hoax. A fisherman turned pirate who never really existed, created by students as part of a class exercise at George Mason University; now has its own article.
September 13–14, 2010: Roger Vinson. An addition was made claiming that the man in question, a federal judge in Florida, is an avid taxidermist who displays mounted bear heads in his courtroom. When Rush Limbaugh used this erroneous information on his talk show, it sparked a media reaction—a demonstration of how even relatively short-lived pieces of vandalism can be damaging.
Spring 2009 – October 2011: Cohen-Cruse Ruse. "A number of apparent sock puppets seem to be creating an elaborate set of fake pages around a few members of a "Cohen" and a "Cruse" family. It involved a number of completely (very carefully) faked biographies, other faked things (like synagogues) and a lot of associated edits to real pages that attempted to justify and contextualize those fake people." It lasted two years, and a major community clean-up followed.
? – February 15, 2012: Legolas2186. Allegations of impropriety were brought against Legolas2186, a prolific (and supposedly trustworthy) writer with a large number of Madonna-related article credits to his name. As was eventually discovered, Legolas had been manufacturing sources, inventing information, and generally doing as he damn well pleased with his sourcing. A permanent ban and months of clean-up by the community followed (see Signpost coverage).
March 8, 2006 – March 21, 2012: Brierfield, Lancashire. An addition was made claiming that the small town was the primary inspiration for Tolkien's Mordor. By the time it was removed in March 2012, it had been on the page for a good six years.
June 9, 2004 – July 13, 2012: Gaius Flavius Antoninus. Created on June 9, 2004 and lasting eight years and one month before discovery, this purported assassin of Julius Caesar has the honor of being the longest-lasting hoax ever created on Wikipedia. Given the level of dissemination that happened in that time and the prominence of Caesar's (historically classical) assassination, it's also probably one of the most illustrative of the failings of Wikipedian vetting.
September 25, 2005 – November 19, 2012: Chen Fang. Chen Fang was the mayor of a small town in China, but he was also a student at an American university who created a fictional article about himself to make a statement about Wikipedian inaccuracy, and his case was cited in a Harvard University writing guideline on the topic. It took seven years and two months for someone to notice.
July 4, 2007 – January 28, 2013: Bicholim conflict. The primary inspiration for this op-ed, the Bicholim conflict is (was) one of the most complex and well-crafted hoaxes to have existed on Wikipedia, and spent half a decade, most of its life, as a supposedly verified Good article. A complete fabrication, in 4,500 words it described a clash between colonial Portugal and the Indian Maratha Empire in an undeclared war that supposedly helped cement Goa's independence (see Signpost coverage).
? – February 1, 2013: Bonō Pusī Kalnapilis. A hoax created on our sister project, the German Wikipedia, that was not discovered to be a hoax until it was selected as a Did you know? entry, spending two hours on the main page before being caught.
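
Several entries above state how long a hoax survived. As a rough cross-check, those intervals can be recomputed from the listed dates. The following is a minimal Python sketch and is not part of the original op-ed; the helper name `elapsed` and the `hoaxes` dictionary are hypothetical, and only entries whose creation and discovery dates are both given in full are included.

```python
from datetime import date


def elapsed(start: date, end: date) -> tuple[int, int]:
    """Return whole (years, months) between two dates, ignoring leftover days."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return divmod(months, 12)


# Creation and discovery dates copied from the list above.
hoaxes = {
    "Brierfield, Lancashire": (date(2006, 3, 8), date(2012, 3, 21)),
    "Gaius Flavius Antoninus": (date(2004, 6, 9), date(2012, 7, 13)),
    "Bicholim conflict": (date(2007, 7, 4), date(2013, 1, 28)),
}

for name, (created, found) in hoaxes.items():
    years, months = elapsed(created, found)
    print(f"{name}: {years} years, {months} months")
```

Under these assumptions the sketch reports 6 years and 0 months for Brierfield, 8 years and 1 month for Gaius Flavius Antoninus, and 5 years and 6 months for the Bicholim conflict, which agrees with the "good six years", "eight years and one month", and "half a decade" given in the entries.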

Notes

^ See the rough guide to semi-protection.
^ Not to imply that it has been unilaterally successful, but rather that it is quite voluminous.
^ The difference between fabrication and hoaxes on Wikipedia is not strictly defined, as Wikipedia hoaxes are technically articles that are spurious. This op-ed will treat the matter in a wider sense and include smaller bits of misinformation.
^ Per the movement goals of the Strategic Planning Initiative.
^ For more information on the why of Wikipedian editing trends, refer to this op-ed: "Openness versus quality: why we're doing it wrong, and how to fix it". For more details on the Wikimedia Foundation's response, refer to this special report: "Fighting the decline by restricting article creation?".
^ Good article reviewers are as much regular editors as the next fellow, which means that they find vetting references about as fun as the next fellow—that is to say, not at all. But see revisions made to the reviewing guideline in light of recent discussion on the topic.

11 February 2013 (all comments)

Discuss this story

Nice article Resident Mario. It is a big problem. As Wikipedia is successful, we get more and more vandalism, lies, spam and business promotion. As maintenance instead of building up is unattractive, we lose more and more editors and donated editing time. Wikipedia's future is endangered. --Chris.urs-o (talk) 08:36, 13 February 2013 (UTC)

The prose in Bicholim conflict might have been "well crafted", but the hoax itself was transparent and easily detectable - had I reviewed the article for DYK, for example, using basic DYK checks I would almost certainly have identified it as a hoax immediately. But neither the GA review nor the (admittedly brief) FAC discussion picked up the problem.

The lesson is really a pretty simple one - be suspicious of any article none of whose major references can be verified online, and for whose content you cannot find any corroboration elsewhere. Gatoclass (talk) 09:18, 13 February 2013 (UTC)

ROFL! This article previously quoted from the Wikipedia biography controversy article, saying that the hoax had not been discovered and corrected for more than nine months, which is a clear mathematical error (May to September is four months). The "nine months" text was in the main article about the Wikipedia biography controversy article due to unreverted vandalism from November 2012. I've fixed both the mainspace and SP articles, but I guess this op-ed proved its own point. Graham87 11:42, 13 February 2013 (UTC)

Great, the Chen Fang incident was my fault... Back in 2008, I found out about the hoax from an acquaintance and immediately nominated it for deletion (because contemporary news sources had a different person as the mayor). The hoaxer one day randomly introduced himself to me at work, claimed credit for the page I'd just nominated, and presented me with "evidence" that I am User:Mxn – duh – intending to pressure me to delete the AfD template. He soon deleted the template himself and produced a source that lay behind a paywall (something like Newsbank or ProQuest). It sounded fishy, so on the talk page, I promised to check the source once I got back to campus after my internship, but I never got around to it. Moral of the story: don't procrastinate, or your error will be preserved in Harvard policy for posterity. – Minh Nguyễn (talk, contribs) 12:25, 13 February 2013 (UTC)

Wikipedia may be the largest and most expansive information compendium the world has ever seen as noted in the article above, but none of Wikipedia's four million articles is a standalone article on a cover song as far as I am aware. Not one in Wikipedia's 12 year existence. Is this too a hoax? A recent DRV request to obtain consensus to create Hound Dog (Elvis Presley cover song) as a standalone article was closed as against policy that first needed to be changed: "if you want to change the policy you need an RFC." In other words, Wikipedia content policy prohibits anyone from posting Hound Dog (Elvis Presley cover song) as a standalone article. Really? What about Johnny Cash's Hurt, Whitney Houston's I Will Always Love You, or Aretha Franklin's Respect? No standalone cover song articles on these allowed in the largest and most expansive information compendium the world has ever seen? What that is saying is that not one of those or thousands of other cover songs meets both WP:N and WP:NOT. Does that make any sense to anyone or is it meant to play a joke on (someone)? Perhaps there is no Wikipedia policy that prohibits anyone from posting Hound Dog (Elvis Presley cover song) or thousands of other cover songs as standalone articles and this instead is the sixteenth most impactful and notable hoax to have existed on Wikipedia. -- Uzma Gamal (talk) 12:39, 13 February 2013 (UTC)

And what the hell does this have to do with the article? Hot Stop (Talk) 13:45, 13 February 2013 (UTC)

"Hell"? Read the article and then read my post again. You'll see it. -- Uzma Gamal (talk) 13:48, 13 February 2013 (UTC)

You seem to be completely confused. That article has never existed, so it can't be undeleted, which is what DRV is for. If you think it should be created, create it, just make sure you source it. The wikiproject can't stop articles being created. All it could do is nominate it for deletion, then it's up to the community. And it has nothing to do with hoaxes. Ged UK 13:58, 13 February 2013 (UTC)

My above post has more in it than what you focus on in your reply. There are no standalone cover song articles in Wikipedia. Where did they all go? Further on the above, what is it that is discouraging Wikipedia writers from stepping forward to write standalone articles about some of the most popular songs of all time? Perhaps the elimination of all standalone cover song articles from Wikipedia is the sixteenth hoax to the above listed fifteen most impactful and notable hoaxes to have existed on Wikipedia. -- Uzma Gamal (talk) 14:14, 13 February 2013 (UTC)

Nice history overview. Hoaxes have always been around: "Some may be pleased to find that the national history begins with a hoax: the chronologically earliest ‘figure’ is Piltdown Man" [1]; but frighteningly easy? Rather than erecting higher hurdles to contribute, the system needs improvement to turn energy to productive product, such as reCAPTCHA. The wrong lesson was learned from Seigenthaler, it's not that an anonymous editor created the bio, but that anonymous attempts at corrections were reverted: the system lacks the discernment between error correction and vandalism. A hoax project is needed to flag and correct hoaxes before DYK. It shouldn't take press coverage to get factual errors corrected. Farmbrough's revenge⇔ †@1₭ 14:52, 13 February 2013 (UTC)
- The difference between a liar and a friend is that a friend stays with you forever. So a registered user with more than 10,000 edits is more likely a Wikipedia friend than an anonymous IP. We need seniors, people should stay in the club and not quit with 25 years. --Chris.urs-o (talk) 16:47, 13 February 2013 (UTC)

Very nice and well-written article! While you focus on the hoaxes, I am more thankful for your characterization of Wikipedia-editor-evolution: That as the encyclopedia has grown, much of the effort has shifted from "horizontal to vertical" editing, and requires editors with more specialized knowledge. That's a real good paragraph that should be shown to all those reporters who indicate the declining number of editors means Wikipedia is in "trouble." -- kosboot (talk) 17:30, 13 February 2013 (UTC)

The addendum which provides a "Top 15" list of hoaxes adds information and interest to the story, but it also adds notoriety and excitement to hoaxers. I think future articles or op-eds should consider not naming hoaxes. Linking to an existing list serves the same purpose and doesn’t provide as much of an incentive to future hoaxers. SchreiberBike (talk) 19:05, 13 February 2013 (UTC)

"This is not a problem that can be easily solved; but nor is it one that should be, as it is now, easily ignored." ResMar 23:30, 13 February 2013 (UTC)

I think there are a lot of good points here, and I understand the appeal of pending changes &c but I think that it (and other forms of page-protection, more generally) is not a helpful response to the hoax problem. This is because restricting who can edit an article is *very* hard to square with our principle that anybody can edit articles, so we only apply protection in cases where there's a known problem or there's a good reason to expect imminent abuse. These are the articles which already have many eyes on them; these are the articles where it's hardest to hoax.

Just as with other problems of accuracy and neutrality, a hoax's best chance of survival is in a quiet backwater where there are fewer other people looking; where nobody suspects a problem. I can't imagine the community agreeing to apply PC, or other protection tools, across millions of obscure, low-traffic, maybe-unwatched articles which haven't yet been flagged up on any noticeboard. bobrayner (talk) 11:55, 14 February 2013 (UTC)

Hoax is just a form of vandalism, vandalism in the broad sense is the problem. You lose motivation and enthusiasm. It wears you down. Schoolboy vandalism, hoaxes, spam, business promotion, "political correctness" (lies), harassment, edit warring and "paid" editing (in the broad sense; you do me a favour, then I do you a favour) are all a problem. --Chris.urs-o (talk) 12:10, 14 February 2013 (UTC)

Mind you that we are in a "mission impossible". All friends of Wikipedia want its quality improving. If we are losing editors, if the vandalism in the broad sense is getting worse, if there is no pending changes for main space edits from school IPs (at least), if there is an absolute prohibition for advertising, if you are able to access the toolserver between 24:00 and 12:00 Eastern Time Zone, only; then we are heading for disaster, probably. --Chris.urs-o (talk) 08:03, 15 February 2013 (UTC)

Nice to read, thanks. I highlight a somewhat lateral issue, but it is the first time I see it prominently written out (though I would not be surprised at all if it has been mentioned quite a few times before). I quote: «The project has reached its saturation limit – put another way, there simply aren't enough new people out there with both the will and the smarts to sustain growth». Seems many of the efforts for attracting and keeping editors simply forget that, expecting continuous indefinite exponential growth (as in the mid 2000's). Nabla (talk) 03:26, 17 February 2013 (UTC)
- The project isn't saturated. Fighting vandalism makes you ruder (Wikipedia:Civility) against all editors. And vandalism in the broad sense wears down all Wikipedians. --Chris.urs-o (talk) 08:16, 17 February 2013 (UTC)

I made the "saturation limit" argument in another column here a while ago, it's linked in one of the citations. ResMar 22:16, 17 February 2013 (UTC)

@Chris.urs-o, I *think* it is not saturated as in "it can't grow anymore", because it may expand, it may reach new 'markets' (geographically and in type-of-editor). But it is silly (though strangely common) to assume that it may grow exponentially, or even linearly, for ever. @Resident_Mario, thanks, I missed that (not so active then) I'll read it soon. - Nabla (talk) 00:39, 21 February 2013 (UTC)

I can't say as I understand the logic of applying the concept of hoax in relation to not having stand-alone articles on cover songs, but the premise is utterly incorrect. The following are all stand-alone articles on cover songs that are from the first 30 results of this Google search, that returned about 30,400 results:

--Fuhghettaboutit (talk) 02:15, 19 February 2013 (UTC)

I don't understand the relevancy of cover songs in an article about Wikipedia hoaxes. -- kosboot (talk) 13:05, 19 February 2013 (UTC)

Nor I.--Fuhghettaboutit (talk) 13:11, 19 February 2013 (UTC)

Fuhghettaboutit's comment is in response to Uzma Gamal's above. – Minh Nguyễn (talk, contribs) 06:36, 20 February 2013 (UTC)
