The Signpost

Book review

Knowledge or unreality?

By Ira Brad Matetsky

In Common Knowledge?: An Ethnography of Wikipedia, Dariusz Jemielniak (User:Pundit on the English and Polish Wikipedias, and a steward) discusses Wikipedia from the standpoint of an experienced editor and administrator who is also a university professor specializing in management and organizations. In Virtual Unreality: Just Because the Internet Told You, How Do You Know It's True?, journalism professor and author Charles Seife presents a more broadly themed work reminding us to question the reliability of information found throughout the Internet; he cites Wikipedia as a prime example of a website containing enough misinformation to warrant caution before relying on it.

Jemielniak's Common Knowledge?: An Ethnography of Wikipedia

Dariusz Jemielniak (Pundit) in 2010

Jemielniak's book is an academic discussion of Wikipedia; he does not aim to present either a "how-to" guide for editors and readers or a complete history of the project. He states that his "book is a result of long-term, reflexive participative ethnographic research" performed as a "native anthropologist." (p. 193) (The word "ethnographic" in this context refers not to ethnicity in the quasi-racial sense, but to the study of a subgroup of the population—here, the subgroup that actively edits Wikipedia.) By this, Jemielniak means that he has spent several years as a Wikipedian, has introspected about his experiences throughout that time through the lens of his academic background, and has now written up his findings and conclusions. I don't think he means that he became active in Wikipedia for the purpose of doing research about it, although it seems quite possible that he started thinking about combining his editing hobby and his professional interests fairly early in his wiki-career.

I cannot pretend to evaluate Common Knowledge as a work of anthropology or of organizational management science. As a general reader and a Wikipedian, I found the book interesting as a compilation of incidents in Wikipedia's history, some of which I was already familiar with and some of which were new to me, and as a reminder of some issues the project faces as it moves forward. Non-academic readers may find the book lacking in a unifying theme, beyond that Wikipedia plays an important role in the world today that warrants academic study of its culture and communities. Jemielniak recently stated (on a Wikipediocracy thread) that "I wrote this book for academic research purposes, I absolutely have no hope of high sales (and honestly, I'll be surprised if it goes beyond 500 copies)." The book has been praised by Jimmy Wales, Clay Shirky, Jonathan Zittrain, and Zygmunt Bauman, and it deserves to sell well over 500 copies, but it won't be making the wiki-best-seller list either.

The eight chapters of Common Knowledge discuss basic rules governing Wikipedia, different roles contributors take on within the project, dispute resolution processes, and the nature of project leadership. The topics are illustrated with examples of disputes or controversies drawn primarily from English Wikipedia history (though controversies about actions by Jimbo Wales on Wikimedia Commons and Wikiversity are also mentioned). The incidents Jemielniak discusses are presented in detail and accurately, but some of them are ten years old and don't necessarily reflect the project's practices or realities today. For example, Jemielniak reviews the bitter and protracted disagreement on En-WP regarding when the historical German-language name "Danzig" should be used for the city now located in Poland and known as Gdańsk. Perhaps aided by his own geographical and historical background, he does an excellent job of presenting the history of the dispute, surveying the arguments for the different points of view, and explaining why the dispute-resolution process ultimately reached the result it did. He does not, however, discuss whether the Wikipedia of 2014 would address the same issue, if it were arising anew, in the same fashion that the much younger Wikipedia of 2003-2004 did.

Jemielniak also doesn't spend much time discussing how lessons learned from Wikipedia dispute-resolution experiences can be used to minimize future disputes or to improve future decision-making. I find this unfortunate, but I can't call it a fault of the book, both because ethnography is descriptive rather than prescriptive, and more importantly because the failure to take stock of dispute-resolution successes and failures has struck me for years as a project-wide myopia. In the 13½ years of English Wikipedia there have been, in round numbers, a billion edit-wars, yet no one knows whether most edit-wars get resolved by civil discussion reaching a consensus on the optimal wording, or by one side's giving up and wandering away (or sometimes by everyone's ultimately losing interest and wandering away). Similarly, the English Wikipedia Arbitration Committee has decided several hundred cases since 2004, and community discussions on noticeboards have resolved thousands more content and conduct disputes, yet no one ever seems to have gone back and conducted any systematic review of which approaches to dispute-resolution worked better than others. That's a different book that ought to be written, although it too risks selling fewer than 500 copies.

Speaking of ArbCom (which I'm prone to do since I've served on ours since 2008), Jemielniak mentions the Arbitration Committees of both the Polish Wikipedia and the English Wikipedia. He opens the book with an account of a Polish Wikipedia arbitration case that resulted in his being blocked from Po-WP for one day. He claims that in retrospect he accepts the ruling against him, but his account of the dispute makes that ruling sound terribly unfair—a cynical gesture of evenhandedness, but meted out to editors who didn't deserve to be treated evenhandedly. (But of course those of us who can't read Polish will never hear the other side of the story.)

The book's mentions of En-WP ArbCom are sound, but dated. He discusses the historical origin of the Committee as an extension of the original authority of Jimmy Wales, and cites a handful of Committee decisions, the most recent of which is an unusual case-motion from 2009. He does not spend much time on the current role of the Committee. That's actually a very defensible omission, because at least on English Wikipedia (I can't speak for other projects), while ArbCom has other responsibilities (some of which most of us don't particularly want), the importance of the Arbitration Committee as an arbitration committee has radically declined in the past few years. (I've discussed this decline here.) So Jemielniak's not spending nearly as much space discussing arbitration as one might expect in a book about Wikipedia hierarchies, leadership, and dispute resolution turns out to be a reasonable decision, but one that is not explained.

Although the academic style of Common Knowledge (and the price of the book) will deter some readers, Wikipedians who want a taste of Jemielniak's thinking about the project can find it in a recent article he contributed to Slate, "The Unbearable Bureaucracy of Wikipedia". In this article, aimed at a general rather than an academic audience, Jemielniak posits that Wikipedia's "increasingly legalistic atmosphere is making it impossible to attract and keep the new editors the site needs." It's a thoughtful article that identifies a significant issue, and its more direct approach, accompanied by concrete suggestions, makes it more accessible than Common Knowledge for non-specialist readers. All of us who want Wikipedia to thrive, which requires that the project welcome newcomers and facilitate their becoming regular editors, can hope for more such wisdom from this Pundit.

Seife's Virtual Unreality: Just Because the Internet Told You, How Do You Know It's True?

By contrast to Jemielniak's academic treatment specific to Wikipedia, Charles Seife—the author of Zero, Alpha and Omega, and Proofiness—has written a more broadly themed book about the unreliability of information found throughout the Internet. "Just because the Internet told you," the subtitle asks, "how do you know it's true?" Now at one level, the fact that the Internet contains a fair amount of misinformation is not breaking news; "Someone is wrong on the internet" became a meme and then a cliché for a reason. Lots of us think we're sophisticated enough to avoid falling into the kinds of traps that Seife warns us about—but the warnings in Seife's book are important and timely nevertheless.

Wikipedia is just one of the many online sources of bad information that Seife discusses, but for obvious reasons it's the one I'll focus on here. Seife catalogs a dozen instances in which deliberate misinformation was introduced into Wikipedia. Such misinformation is inserted into Wikipedia, perhaps every day, by a miscellaneous array of pranksters, hoaxers, vandals, defamers, and in a few instances by Wikipedia critics conducting so-called "breaching experiments" to see how long a falsehood placed in Wikipedia stays in Wikipedia. (Such experiments are not permitted; see also Wikipedia:Do not create hoaxes.) Some of Seife's examples will be well-known to "Signpost" readers, such as the Colbert-inspired tripling of elephants and the Bicholim Conflict; others were new to me, such as AC Omonia Nicosia and the Edward Owens hoax.


Experienced Wikipedians are well aware of this problem, as are our critics. English Wikipedia, in what can equally be considered admirable self-criticism or self-absorbed navel-gazing, contains discussions of hoaxes on Wikipedia; we also have a lengthy List of hoaxes on Wikipedia; and another compilation recently appeared on a critics' site here. (Wikipediocracy link)

Misinformation in the media has always been with us (Tom Burnham's books were favorites of mine growing up, and I'm mildly dismayed that Burnham's name comes up as a redlink), but it certainly is possible to spread false information more rapidly online than it was in the analog era. Of course, it is possible to spread correct information more rapidly as well. A particular problem is misinformation posted on Wikipedia—and elsewhere all over the Internet—with the purpose of doing harm to someone. (A prime example of this sort of thing is the Qworty fiasco that unfolded last year.) Any falsehoods in article content damage the credibility and usefulness of the encyclopedia we are collaboratively writing, but intentional falsehoods posted by a subject's personal or political or ideological enemies with the malicious intent to defame or damage a living person do so tenfold. I am confident that well over 99% of Wikipedia pages are free of intentional falsehoods—yet no one can deny that Wikipedia articles must still contain far too many lies, damn lies, and sadistics.

Neither Seife nor Jemielniak says much about the biographies of living persons policy and its enforcement, although many Wikipedians, including myself, have long thought fair treatment of our article subjects to be the central ethical issue affecting the project. I know that when I've been defamed online I haven't enjoyed it, and that Wikipedia BLP subjects feel the same way when their number-one Google hit has been edited in nasty ways by their personal or political or ideological enemies. (The good news is that when I or others spot defamation on Wikipedia we are often able to do something about it; I've often wished that I had an "edit" and a "delete" button that I could use on the rest of the Internet.)

Seife's discussion of misinformation on Wikipedia focuses on intentionally false information, but a greater number of inaccuracies are introduced by editors who make honest mistakes than by hoaxers and vandals. Sometimes mistakes are made by good editors who inadvertently type the wrong word or misread a source. Other times, we encounter a good-faith editor who wants to help build Wikipedia but, at least in a given topic-area, simply doesn't know what he or she is talking about. Wikipedia has no systematic quality control beyond surmounting the bar for deletion, at least until one seeks to bring an article to the main page or have it rated (at which point various sorts of flyspecking take place—some of which can be overdone, but that's another discussion). On English Wikipedia today, there are dedicated noticeboards to address conflict-of-interest issues, evaluate the reliability of sources, solve copyright problems (some quite abstruse), keep fringe theories in check, and put a stop to edit-warring. I've never seen anyone wonder why there's no dedicated noticeboard where one goes for help in figuring out whether questionable information in an article is accurate or not.

Despite the falsehoods he identifies, all of which have now been removed, Seife acknowledges that "by some measures one can argue that Wikipedia is roughly as accurate as its paper-and-ink competitors." (p. 29) He cites the well-known 2005 Nature article comparing the accuracy of Wikipedia's scientific content to that of a canonical, traditional reference source, the Encyclopedia Britannica. One continues to read of comparisons of Wikipedia with traditional library reference books (see Reliability of Wikipedia). The Wikipedia community should certainly aspire for our encyclopedia to land on the favorable side of such comparisons. I think that on balance it does.

But "Wikipedia vs. Britannica" is no longer the right question, or at least not the only right question. At least equally relevant today is how Wikipedia's completeness and fairness and accuracy compare, not only to traditional media sources, but to the other information available on the Internet. Wikipedia has evolved as part of, not independent of, the Internet as a whole. And it is the Internet as a whole, not just Wikipedia, that has changed the population's information-searching habits, so that today when one needs or wants to look something up, one does so on the computer or a handheld device rather than in a book or a (hard-copy) journal or newspaper. In the unlikely event that Wikipedia (and all of its mirrors and derivatives) were to disappear tomorrow (and not be replaced by a similar site), our readers from schoolchildren to senior citizens would not revert to the habits of 25 years ago and start trooping to the library or even the reference shelves in their living rooms when they wanted to check a fact. (I am not saying this is a good thing or a bad thing, though it has elements of both; it is simply a truth.)

Instead, people in the wikiless world would still perform the same Google searches that today bring up their subject's Wikipedia article as a top-ranking hit. They would find the same results, minus Wikipedia, and they would look at the other top-ranking hits on their subject instead. Would those pages, on average, provide better-written, better-sourced, more accurate, and fairer coverage of their subject than the corresponding Wikipedia pages? And to the extent the answer is yes, how do we make the best of that content accessible from Wikipedia? A future Wikipedia scholar may wish to focus more on these questions (and produce another 495-copy-selling book).

Seife rather kindly refrains from discussing in the book, as an example of a questionable Wikipedia page, his own BLP. Predictably, that page is the first Google hit on Seife's name (his own webpage at NYU is second). Unfortunately, the article bears a prominent, disfiguring banner at the top of the page, proclaiming:

This article may require cleanup to meet Wikipedia's quality standards. The specific problem is: Article does not meet Wikipedia standards for quality. Please help improve this article if you can. (June 2013)

Now, no well-informed reader of Wikipedia would take this pronouncement alleging that Charles Seife is an ill-written article as a reflection against Charles Seife. (If anything, the obvious circular reasoning suggests sloppiness in the crafting of the tag.) After all, the reader would know that Charles Seife wouldn't have written the article and, as a matter of our conflict-of-interest guidelines, is discouraged from editing the article at all, much less improving its overall editorial quality. Nonetheless, it isn't exactly encouraging that in the 13 months since an anonymous IP editor added that tag, no one has improved the article enough to resolve the quality concern and remove the tag. If I were notable enough to warrant a Wikipedia BLP and this were the state of it for over a year, I think I'd have the right to be ticked off. (Cynical aside to editors interested in Wikipedia's public relations: improve the BLPs of journalists likely to cover us.)

Meanwhile, in a recent radio interview—which is well worth listening to—Seife claims that Wikipedia gets four or five facts of his life wrong (not controversial claims, he says, just basic facts, though he doesn't name them), which, knowing about the COI guideline, he hasn't fixed. (Aside to Charles Seife: let me know about the non-controversial fixes needed and I'll make them myself. You won't need to go to The New Yorker à la Philip Roth.)

The bottom line on these two books: Wikipedians should read (and think carefully about) Jemielniak's Slate article, but only the hardier ones among us will gain the full benefit of his book, although all of us should thank him for writing it. More Wikipedians will enjoy Seife's book, though only a sliver of it is about Wikipedia, and perhaps everyone should listen to his radio interview, although for many of us both the book and interview will reinforce, rather than challenge, our existing views about the reliability of the information that surrounds us.

Ira Brad Matetsky is a New York attorney. He edits as Newyorkbrad on the Wikimedia projects, having first registered on the English Wikipedia in 2006. He has been a member of the English Wikipedia Arbitration Committee since 2008.
The views expressed in this book review are those of the author alone; responses and critical commentary are invited in the comments section. A previous review of the Polish translation of Jemielniak's book is archived here.

Discuss this story

These comments are automatically transcluded from this article's talk page.
"I've never seen anyone wonder why there's no dedicated noticeboard where one goes for help in figuring out whether questionable information in an article is accurate or not." That's what the article and it's talk page are for. If you find a questionable claim, just WP:CHALLENGE it. You can {{cn}} it, delete it, or post on the talk page. Paradoctor (talk) 17:20, 2 August 2014 (UTC)[reply]
Of course that's the answer to the question no one's asked. And that works well when an article has a good number of watchers. But hoaxes and bad information are more prevalent on less-watched pages, and a c/n tag or a talkpage hoax there can last unaddressed for months. Newyorkbrad (talk) 17:48, 2 August 2014 (UTC)[reply]
He who repeats The Word without The Tag shall be Made Fun Of. The same goes for not following up on threads you start. (I presume you meant "post" rather than "hoax".) If there is any lesson to be gained, I'd say it is "Be bold in deleting stuff you find strange" as well as Wizard's First Rule.
Of course, Wikipedia being the pragmatical beast it is, we have to live with reality. IMO, in light of WP:WIP, the best we can do is enabling a souped-up version of the metadata gadget for all users. And maybe the WMF could fork over some grant money to homeless former Britannica employees for quality control-for-hire. Paradoctor (talk) 19:09, 2 August 2014 (UTC)[reply]
I thought the comment about the accuracy noticeboard (a Wikipedia 'fact check' perhaps) was a really insightful one. It seems like centralising this process could work well to increase visibility for low-visibility articles, for example, where the talk page might not always work? --hfordsa (talk) 15:39, 3 August 2014 (UTC)[reply]
  • Well, from my experience I'd say practically all reference works have hoaxes, or errors so blatant as to lead one to suspect they are hoaxes. Harvey Einbinder's The Myth of the Britannica opens with a chapter listing some of the more egregious errors known to have been found in that work that beg to be considered intentional hoaxes, then proceeds to point out the other flaws in the Encyclopedia Britannica. (Have a look at my review for the Signpost for more about Einbinder's book.) Another example might be the article on "Gremlins" in the Funk and Wagnalls Standard Dictionary of Folklore, Legend and Mythology -- although I'd be surprised if anyone mistook that as anything but a joke. And then there's the book I've been using to revise Eponymous archon & provide reliable sources for that article -- Alan E. Samuel's Greek and Roman Chronology, a carefully researched & written book by a tenured academic: over 2 or 3 consecutive pages of this book the word "calendar" is frequently misspelled. Or maybe it is just a sign that the reader has begun to master a subject when she/he starts to catch mistakes in the reliable sources used... -- llywrch (talk) 07:05, 3 August 2014 (UTC)[reply]
Taking a hint from software engineering, I'd say expecting stuff made by humans to be perfect is unrealistic. Transport the software figures to long mathematical proofs, and note that they, generally, are only checked by a handful of experts, rather than undergoing formal quality testing.
The question is not whether there are problems with our content, but whether the number of problems is acceptable. Paradoctor (talk) 10:53, 3 August 2014 (UTC)[reply]
This brings up the old question of how to measure the accuracy of Wikipedia. Of course, it will always be a comparison of the accuracy of two sources, always an A vs. B. Wikipedia vs. Britannica, or perhaps Wikipedia medical articles vs. medical textbooks. Brad's text "how Wikipedia's completeness and fairness and accuracy compare, not only to traditional media sources, but to the other information available on the Internet," suggests to me that the most relevant comparison is Wikipedia vs. the rest of the internet. So for example, we could find say 300 journalists and assign each an article. They would then read the article and compare that to what they learned in an hour on the rest of the internet (TRotI). My guess is that in many subject areas WP will come out on top. Smallbones(smalltalk) 23:25, 3 August 2014 (UTC)[reply]
That depends on what parts of the Internet these journalists are allowed to access. Based on my experience researching various topics, if they are limited to the parts where content is free (as in zero cost of access, & no registration needed) Wikipedia would clearly be the winner. If resources accessible thru the Internet -- such as LexisNexis & JSTOR -- are included, the comparison would be much, much closer; resources like those will always provide better quality coverage of specific topics, although those specific topics are slowly decreasing in number. -- llywrch (talk) 15:28, 4 August 2014 (UTC)[reply]
  • Thanks for the informative reviews and commentary. I have a nit to pick, however, with the following statement:
"Instead, people in the wikiless world would still perform the same Google searches that today bring up their subject's Wikipedia article as a top-ranking hit. They would find the same results, minus Wikipedia, ... "
There is a significant omission from the second sentence: they would find the same results minus Wikipedia and its clones and adaptations and web pages which have mindlessly regurgitated its content. Surprisingly—to me at least—this doesn't seem to have much effect on searches on terms referring to broad general subjects. For searches on the terms "Leibniz", "Vera Lynn" and "mind-body problem", to take three I just threw up off the top of my head, the only obviously Wikipedia-influenced results in the first two pages from Google are Wikipedia itself and Google's own knowledge graph.
The results are altogether different, however, if you do a search on terms designed to find sources for dubious factoids which Wikipedia has got wrong. A Google search on the expression "cadamekela | durkeamynarda", for instance, returns 18 pages of results which, apart from one or two now flagging these as a Wikipedia hoax, have simply reproduced Wikipedia text verbatim, or regurgitated it with some form of paraphrase.
This effect is not limited to hoax material, however. More concerning to me is Wikipedia's power to increase enormously the web impact of cranks. The first page returned by a Google search on the expression ""Jafar al-Sadiq" heliocentric" currently contains links to three web pages reproducing some version of the absurd fiction that an 8th-century Islamic scholar, Ja'far al-Sadiq, had proposed a heliocentric model of the solar system. Web pages peddling this nonsense had certainly already existed before the notorious Jagged 85 added it to Wikipedia's article Heliocentrism, but one result of that addition was a rapid massive increase in the number of such pages.
David Wilson (talk · cont) 05:20, 6 August 2014 (UTC)[reply]
(@User:David J Wilson) This point is quite correct and extremely important; it is a point I have made on-wiki several times before, and which I didn't stress in this book review only because the review had become too long already. Errors, questionable assertions, and unfair characterizations contained in Wikipedia articles almost immediately propagate all over the Internet, and may remain on Wikipedia-based mirror and derivative sites for years even after an error is fixed on Wikipedia itself. For the hypothetical "Wikipedia versus" search, you are right that all these sites would need to be assumed away as well. Conversely, in the real world, this adds to my view that in prioritizing our goals for Wikipedia, accuracy and BLP compliance need to be consistently emphasized. Thanks for your input (and thanks to everyone else who has posted here as well). Newyorkbrad (talk) 22:25, 7 August 2014 (UTC)[reply]
  • I'd clarify the introduction of the concept of ethnography a bit further to say that it's a qualitative research tradition of embedding oneself in a culture (say that of WP editors) so as to learn how their culture works and, ultimately, to write about it. That would explain his position or intent a little better for new readers. Usually it's interesting to hear to what extent he embedded himself (did he interview or mainly use historical talk page data? did he have any offline interaction? what kind of consent did he get for his data collection?), and though it was covered, a bit more about why he did it and what he found (ethnographers constantly need to "gain trust"—how did he view that as he took on more permissions within WP?) One of the other common issues in this type of research is relationship with the participants—did he run any of his conclusions past his participants so as to have a discussion about its accuracy? Anyway, some thoughts czar  14:36, 22 November 2014 (UTC)[reply]




       
