The Signpost
Single-page Edition
WP:POST/1
29 January 2014

Traffic report
Six strikes out
WikiProject report
Special report: Contesting contests
News and notes
Wiki-PR defends itself, condemns Wikipedia's actions
Arbitration report
Kafziel case closed; Kww admonished by motion
Recent research
Translation assignments, weasel words, and Wikipedia's content in its later years
 

2014-01-29

Six strikes out

By Serendipodous

Summary: There are times when this job is hard. As an analogy, imagine navigating in fog at night, except you don't know where you are, you don't know where you want to go, and your flashlight keeps dying on you. Wikipedia, in the understandable desire to protect users' privacy, has left me with precious few tools to find my way (bounce rates and HTTP referers would be nice), so there are times when it is impossible to determine why something is or is not on the list. The hour-by-hour viewing tool I made such a fuss about two weeks ago, which would at least have suggested which spikes were natural, is currently down, so I'm back to erring on the side of exclusion. Although only two articles were removed from the top 10, six articles (roughly a quarter) have been removed from the top 25.

I'm asking: does anyone know of a way to track down these occasional one-day spikes if they don't appear on Reddit or a Google Doodle? And why is important information like view counts outsourced to volunteer servers liable to crash or lose functionality?

For the full top 25 report, plus exclusions, see WP:TOP25

For the week of 19–25 January, the 10 most popular articles on Wikipedia, as determined from the report of the 5,000 most viewed pages, were:


Rank | Article | Class | Views | Notes
1 | Jordan Belfort | C-class | 799,325 | Onetime stockbroker who spent 22 months in prison for running a penny stock boiler room; he went on to write the books that the film The Wolf of Wall Street is based on.
2 | Juan Mata | C-class | 647,317 | Spanish footballer who was transferred this week from Chelsea F.C. to Manchester United for a club record sum of £37.1 million ($61.4 million).
3 | Richard Sherman (American football) | Start-class | 638,607 | This guy arguably came top of the list of articles related to Super Bowl XLVIII due to his combative talking style, which got him some bad press after taunting Colin Kaepernick (see below) after beating the San Francisco 49ers to reach the Super Bowl.
4 | Martin Luther King, Jr. | Good Article | 607,434 | With his birthday a federal holiday, it's not surprising that he makes an annual appearance on this list.
5 | The Wolf of Wall Street (2013 film) | C-class | 587,561 | Martin Scorsese's acclaimed account of one person's contribution to our general economic misery opened to a respectable $34 million on Christmas Day, and has now made over $220 million worldwide.
6 | Justin Bieber | B-class | 554,032 | Why is he on this list? Could it be his various indiscretions in Latin America? The lawsuit he was saddled with after egging a neighbour's house? Or, perhaps, his arrest after drag racing a Lamborghini drunk on a beach in Florida? Truth be told it's probably that.
7 | Facebook | B-class | 513,840 | A perennially popular article.
8 | Sherlock (TV series) | Good Article | 434,520 | The contemporary-set revamp of the Sherlock Holmes mythos has become a surprise global hit (and turned its star, Benedict Cumberbatch, into an international sex symbol) and is now watched in 200 countries and territories (out of 254), so it's not surprising that its much-ballyhooed return from a two-year hiatus was met with feverish anticipation.
9 | Frozen (2013 film) | C-class | 405,400 | Disney's de facto sequel to Tangled has become something of a sensation. It reclaimed the top spot in the US charts on its sixth weekend (a feat only matched by Avatar and Titanic) and has already outgrossed its predecessor both domestically and worldwide, with a total of nearly $820 million. It won a Golden Globe for Animated Feature and seems a shoo-in for the Oscar.
10 | Deaths in 2014 | List | 397,831 | The list of deaths in the current year is always quite a popular article.




Special report: Contesting contests

According to the Wikimedia Foundation's evaluation of on-wiki contests, "... contests are ways for experienced Wikipedians to come together to work together to improve the quality and quantity of Wikipedia articles." Contests have existed almost as long as the English Wikipedia. Contestants have expanded hundreds of articles and made tens of thousands of edits. Although it may seem as though there aren't any negatives to contests, they have occasionally become a divisive topic on the English Wikipedia.

So, what's not to like about contests? Opinions across Wikipedia and the wider Wikimedia movement differ sharply. Much of the discussion concerns the quality of the edits made during contests and whether receiving prizes for winning amounts to paid editing. It seems as though everyone has an opinion, and everyone's is different. Wizardman puts it nicely: "It's a double-edged sword. On the one hand it gets people editing more than they would otherwise, perhaps in areas they otherwise wouldn't touch, but they are very ripe for abuse even if said abuse is planned against."

Some say that contests are very helpful: they provide friendly competition to improve the encyclopedia. Others say that contests cause unneeded stress, and that Wikipedia should be a relaxing place. However, stress and competitiveness are only a small part of the discussion.

The number of users participating in the WikiCup, set against the year


Although paid editing has been widely discussed across Wikipedia and Wikimedia these past few weeks, little attention has been paid to whether contest prizes constitute paid editing. Discussion of this question is likely to become more heated in the future.

As one user put it, "Contests are a form of declared paid editing, if, indeed, there is a cash reward. However, I would think that the sort of edits contests inspire would've been done anyway, which [is] how it differs from the sort of paid editing that's controversial—the kind that Wiki-PR does/did."*

An argument for rewards in contests is that prizes are just an incentive to work harder: users help out the encyclopedia and are rewarded for it. Even if you participate in a contest, there is no guarantee that you'll win, so participating arguably isn't paid editing. You could edit and help out Wikipedia in an attempt to win the contest, but end up not winning.

On the other hand, one user put it bluntly: asked whether contests are paid editing, they said yes. Asked to elaborate, they responded: "Because it's a payment for editing?" This is a popular opinion; some of the people I asked say that receiving prizes for winning contests is definitely paid editing. On this view, even if the reward is declared and public, the contestant is still being paid to contribute.

This is a heavily debated topic, and further discussion is certain to take place. We may or may not see changes in how contests work.

Quality or quantity?

As Wizardman puts it, "the second some people see the word contest they push as much as they can as fast as they can, which can violate the spirit of a contest if they forget to make their edits actual improvements."

An example of this is the Stub Contest, whose goal was to reduce the number of stub-class articles on Wikipedia. The contest did not focus only on expanding stubs, however: it also involved re-rating articles listed as stubs that deserved to be rated start-class or higher. Overall, 48,830 articles were re-rated from stub-class to start-class or higher during the contest.[1]

One of the contestants in the Stub Contest, Sven Manguard, raised a concern about the contest. As mentioned above, one way to score points was to re-rate stub-class articles as start-class. Sven noticed that many people were re-rating stubs as starts even though the articles did not deserve it, thus hurting the project and scoring points unfairly.[2] Sven also touched on a very important point that is often forgotten during contests: "The point of this contest is to improve the quality of Wikipedia's weakest articles, and I fear that a dash for cash has obfuscated that goal for some people."

Conclusion

Like most debatable topics on Wikipedia, everyone has a different opinion on contests. Everyone goes their own way: some choose to participate, some don't. In the future we may see adjustments to contests with regard to prizes, given the argument that prizes are paid editing. In the end, almost everyone agrees on one thing. As one user puts it, "Prizes are just incentives, but every contest has a prize - improving Wikipedia".

You can participate in most contests by adding your username to the contest's 'Entry' page. If you're looking to join a contest that hasn't yet started, look at the Tyop Contest, which starts at the beginning of February, or the Core Contest, which begins on 10 February. The WikiCup is currently in progress, but signups close soon.

What do you think? Are contests paid editing? Do you participate in contests? Why or why not? Let your voice be heard in our comments section!

Next week, we'll head to Sochi. Until then, rediscover our previous adventures in the archive.


Notes

* denotes anonymous comments from Wikipedians obtained through an online survey form.

  1. ^ List of article class ratings
  2. ^ To quote exactly, Sven Manguard stated: "I'm seeing a lot of people upgrading a lot of articles based purely on being 1501 readable prose characters or higher, with no consideration to the quality of the prose or even the suitability of the article."




Wiki-PR defends itself, condemns Wikipedia's actions

Wiki-PR, a public relations agency whose employees used a sophisticated array of concealed user accounts to create, edit, and maintain several thousand Wikipedia articles for paying clients, has told Business Insider that it was demonized by the online encyclopedia.

In an interview with the prominent business and technology news website, Jordan French, Wiki-PR's CEO, said he believes the Wikimedia Foundation "painted" his company to look like an "evil entity" that is "scrubbing truths from Wikipedia".


In Wiki-PR's view, it is the victim of an egregious mistake: it did not break the WMF's terms of use, and Wikipedia "made a bunch of errors and confused us with someone else, largely", French told Business Insider. While French does not name who or what Wiki-PR was confused with, he was presumably referring to Mike Wood, the owner of professional writing service LegalMorning.com and User:Morning277, the first Wikipedia account implicated in the Wiki-PR scandal. Instead, French maintains that Wiki-PR provides a valuable service by protecting the Foundation from "legally actionable libel".

Yet many of French's new claims appear to be in conflict with the evidence. At least three questions are raised:

Were the allegations and community investigation all a mistake? The long-term abuse file shows that Wiki-PR used remote employees, IP address-hopping, and technical loopholes to maintain up to 12,000 English Wikipedia articles. The aftermath included a community ban for being "repeatedly unable or unwilling to adhere to [Wikipedia's] basic community standards." The Wikimedia Foundation's legal assessment of the allegations was strong enough to elicit a cease-and-desist order in November 2013.

Did Wiki-PR break the terms of use? The Signpost has gained access to an online document containing a list of steps for reforming the company's behavior, prepared privately by a Wikipedian and edited by French. French agreed to it on 18 November, just one day before the Foundation sent its cease-and-desist letter to Wiki-PR, by writing at the top of the document, from which The Signpost has redacted all but French's name: "Wiki-PR agrees to all of the terms laid out in this roadmap. We're working on implementing them. 11/18". The Signpost understands that this was an attempt by the Wikipedian to "provide suggestions for reform in line with community expectations", though the document includes a statement that "their completion does not ensure Wikipedia’s community acceptance", and that "nothing in this roadmap constitutes a binding agreement, contract, or guarantee."

Critically, the introduction that French had agreed to states: "Wiki-PR has seriously abused the Terms of Use (TOS) and community policies. In an attempt to redeem their conduct, Wiki-PR agrees to a comprehensive review of their practices and a detailed program of reform, in collaboration with members of the Wikipedia community." One item states: "Wiki-PR will prepare a detailed proposal for how it will manage and maintain a high standard expected from all employees. Employees will declare to Wiki-PR all of their Wikipedia accounts for monitoring. Employees will not be paid if a review of their conduct does not meet a high standard." To this, French added on 13 November: "Defining high standard: Contractors will be removed if conduct seriously breaches Wikipedia’s TOS or community policies."

The full text agreed to by French is reproduced here.

Does Wiki-PR protect the Foundation from being sued for libel? In general, because the Foundation only provides an interactive computer service, Section 230 of the US federal Communications Decency Act means that it cannot be held legally responsible in the US for defamatory content published on its sites; the responsibility lies with the individual who added the material. A recent German court ruling on the matter was called a "legal victory" by the Foundation, though this has been disputed.

Furthermore, the number of articles Wiki-PR created from scratch belies the assertion that it was primarily combating libel. Seven examples of their article creations have been uploaded and are open for viewing. Sources in these new Wiki-PR articles typically include Yahoo! Voices and CNN iReport, where, despite the well-known brand names, anyone can publish with little to no moderation, and the US website Vatalyst, which appears to have been offline for six months but was operated by Wiki-PR and similarly lacked editorial oversight. In many articles in which Wiki-PR was involved, these and similar sites gave the articles "references sections [that] always have a surfeit of citations, with the clients' press releases and web sites balanced by passing mentions in seemingly independent publications." French's claim in the interview that Wiki-PR has about 45 people directly conflicts with his earlier assertion to the Wall Street Journal that they have "hundreds" of editors on staff. Wiki-PR's site even includes solicitations that attempt to interest companies in Wiki-PR's article-creating experience. Such pages were lampooned in a 31 January Wikipediocracy blog post ("Extra Creamy Wikipedia – adventures in advertising").

Continuing suspicion

Wiki-PR's actions were sufficiently extensive that their online identities are still being discovered more than three months after the original revelations. Eleven additional accounts are now suspected to be editing on behalf of Wiki-PR; one, CitizenNeutral, was blocked as recently as 27 January. Before CitizenNeutral suddenly stopped editing at the end of September 2013—barely a week before the Daily Dot named Wiki-PR in an article titled "The battle to destroy Wikipedia's largest sockpuppet army"—the account had a contribution history that was characteristic of Wiki-PR employees.

Much of CitizenNeutral's early editing consisted of tagging articles for conflict of interest and puffery, something Wiki-PR commonly did before contacting an article's subject. A later focus was on recreating deleted articles, nearly all of which had been deleted for being authored by Wiki-PR. These 33 new articles were one-line stubs with no relation to the previous versions, which fits Wiki-PR's typical practice. Vice's Martin Robbins profiled one Wiki-PR client in October 2013, detailing the experiences of academic Emad Rahim. His article was deleted over notability concerns. When a Wiki-PR employee recreated the page, "it contained only one sentence. Rather than apologizing, French told [the subject] he should raise his media profile, and connected [him] to Scarsdale Media, who offered 30 days of 'media relations efforts' for another $800." Rahim had already paid Wiki-PR $1500.

Nothing in this article should be construed as implying that Wiki-PR is continuing to break the Wikimedia Foundation's terms of use.
Tony1 and Kevin Gorman contributed writing and research for this story.

In brief

A meeting as part of the project "The boundaries of editing" on the German Wikipedia, supported by Wikimedia Germany. This was a precursor to the OBS study that was published a short time ago.
  • Paid editing study: The German Otto Brenner Foundation (OBS) has published a study by freelance journalist Marvin Oppong into covert operations of public relations agencies on Wikipedia. He is quoted in an OBS press release as saying "The longer I dealt with the subject of Wikipedia, the more I got the impression that PR is widely used in Wikipedia. There is a real market in it." Oppong finds that Wikipedia's internal structures have been unable to prevent the manipulation of the site by public relations agencies. However, his conclusions, and essentially the entire book, have been skewered by German-language Wikipedians. A PDF of the publication is available free of charge, in German. The Signpost has written to OBS asking whether, and if so when, an English-language version will be published.
  • Core Contest: The English Wikipedia's Core Contest, which aims to kindle development of vital articles, is preparing to launch its fifth event. The competition will run from 10 February to 9 March; the winners will receive prizes courtesy of a Wikimedia UK microgrant.
  • GLAM interviews: Dorothy Howard's (OR drohowa) interview series with librarians in the New York area continued this week with Bob Kosovsky (kosboot).
  • New user groups: The Foundation's Affiliations Committee has recognized two new user groups: Wikimedia Brasil and the New England Wikimedians. Both groups hope to eventually attain chapter status.
  • Board minutes: The minutes for the Foundation's November 2013 Board of Trustees meeting have been published, nine weeks after the meeting. The agenda for the current Board meeting has been posted on Meta, and comments are being made on the Foundation Board's Meta noticeboard.
  • WikiConference USA: The first national Wikimedia conference in the United States will be held at the New York Law School from 30 May to 1 June. Jointly hosted by the New York and DC chapters, the official press release states that the conference will "concentrate on the future of Wikimedia and will include workshops, panels, and presentations on Wikimedia’s outreach to cultural institutions, community building, technology development, and Wikimedia's role in education."
  • Program evaluations: The Foundation has published the latest of its program evaluations, focusing on the worldwide photographic initiative Wiki Loves Monuments. The key points concerning WLM were: more money spent implementing a WLM event doesn't equate with higher participant counts or uploads; only eight in a thousand uploads from WLM in 2012 were rated as quality, valued and/or featured pictures, and about 17% of uploads are currently used on WMF sites; the survival rate of new users brought in by WLM is about 1.7%; it is unclear whether WLM successfully educates new participants about open knowledge and free licensing.
    The temporary logo of the Georgian Wikipedia, with the puzzle colors changed to blue and yellow to resemble the flag of Ukraine
  • Ukrainian developments: The Ukrainian Wikipedia has decided to block access to the site for a half hour each day to protest new laws being passed by the country's government, which the community says could cause users to "avoid editing and writing articles about living people for fear of criminal liability through slander, or would copy only official information that does not comply with a neutral point of view." Such a scenario might "create a situation where there will be thousands of articles about people with irrelevant or biased information". A screenshot of the accompanying protest banner can be seen on Kyivpost. Meanwhile, the Georgian Wikipedia community has changed its logo for one week to show support for the ongoing protests in Ukraine. The new logo, seen at right, has had its puzzle colors changed to blue and yellow, the colors of the flag of Ukraine. Wikimedia Ukraine has published a blog post in support of the change.
  • Signpost changes: The Signpost welcomes Kirill Lokshin, who will be the new editor of the arbitration report, and Gamaliel, who is joining our "In the media" team.



Kafziel case closed; Kww admonished by motion

Kafziel case closed

The Kafziel case has been closed, with Kafziel losing his administrator status as a result. The case originated from a request for arbitration filed in December 2013, in which Hasteur alleged that Kafziel had inappropriately deleted pending entries in the Articles for Creation backlog. Unusually, Kafziel chose not to contest the allegations during the arbitration proceeding and instead announced his retirement, writing that "there's nothing anyone here can say or do to make me apologize for anything I did, or agree to do anything differently, and there's nothing short of that that will please people like [Hasteur]". In a split vote, the Arbitration Committee found this action to be in violation of the administrator accountability policy:

While addressing concerns regarding his edits at Articles for Creation Kafziel acted in a hostile and indifferent manner. When concerns were brought before ArbCom, he declined to submit substantive evidence explaining his actions, a breach of administrator accountability.

and voted to strip Kafziel of his administrator status:

For conduct unbecoming an administrator by failing to respond appropriately, respectfully and civilly to good faith enquiries about his administrative actions, Kafziel is desysopped and may regain the tools via a request for adminship. The user may not seek advanced positions in an alternative account unless he links such account to his Kafziel account.

Hasteur, for his part, did not leave the proceeding unscathed, receiving an admonishment for his conduct:

For his battlefield mentality in areas relating to Articles for Creation, Hasteur is admonished.

Kww admonished

In a split ten-to-four vote, the Arbitration Committee adopted a summary motion admonishing administrator Kww for changing the protection level on the "Conventional PCI" article, which had been protected by the Wikimedia Foundation's Philippe Beaudette in July 2013 as an "office action":

Kww is admonished for knowingly modifying a clearly designated Wikimedia Foundation Office action, which he did in the absence of any emergency and without any form of consultation, and is warned that he is subject to summary desysopping if he does this again.

Kww had requested arbitration of the dispute between Philippe and himself on January 24, claiming that Philippe had "restored the protection level to an illegitimate level" and requesting that the Committee "[make] clear to Philippe that he must choose one of the permitted protection levels", but the Committee ultimately declined to do so:

Because the request for arbitration filed by Kww seeks review of Office actions, it is outside the purview of the Arbitration Committee and accordingly the request is declined.

New and ongoing cases

  • Austrian economics: The Austrian economics case was opened on January 25 after a unanimous vote to accept by nine arbitrators. The case originates from a request for arbitration filed by A Quest For Knowledge, who alleged that the existing community sanctions on "Austrian economics" and related articles had failed to curb the ongoing disputes there and asked that the Committee impose discretionary sanctions on the topic. Evidence in the case will be accepted through February 8.
  • Gun control: The Gun control case, which was opened on January 5, has entered the workshop phase. The case arose from a request for arbitration filed by Gaijin42, who alleged that disputes about including material about Nazi Germany in the "gun control" article had become intractable and required the Committee's intervention to resolve.

Other news

  • Rschen7754 promoted: The Arbitration Committee announced the promotion of Rschen7754 to full arbitration clerk. Rschen7754 had served as a trainee clerk since September 2013.



Translation assignments, weasel words, and Wikipedia's content in its later years

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

Translation students embrace Wikipedia assignments, but find user interface frustrating

An article, "Translating Wikipedia Articles: A Preliminary Report on Authentic Translation Projects in Formal Translator Training",[1] reports on the author's experiment with "a promising type of assignment in formal translator training which involves translating and publishing Wikipedia articles", in three courses with second- and third-year students at the Institute of English Studies, University of Warsaw.

It was "enthusiastically embraced by the trainees ... Practically all of the respondents [in a participant survey] concluded that the experience was either 'positive' (31 people, 56% of the respondents) or 'very positive' (23 people, 42% of the respondents)." And "more than 90% of the respondents (50 people) recommended that the exercise 'should definitely be kept [in future courses], maybe with some improvements,' and the remaining 5 people (9%) cautioned that improvements to the format were needed before it was used again. No-one recommended culling the exercise from the syllabus."

However, the author cautions that Polish–English translations required more instructor feedback and editing than translations from English into Polish (the students' native language). And "most people found the technological aspects of the assignment frustrating, with most students assessing them as either 'hard' (39%) or 'very hard' (16%) to complete. The technical skills involved not only coding and formatting using Wikipedia's idiosyncratic syntax, but the practical aspects of publication. [Asked] to identify areas requiring better assistance, the respondents predominantly focused on the need for better information on coding/formatting the article and on publishing the entry. Thirty-nine people (almost three-quarters of the respondents) found the publication criteria baffling enough to postulate that more assistance was needed. That is even more than the 36 people (68%) who had problems dealing with Wikipedia's admittedly idiosyncratic code."

In the researcher's observation, this contributed to the initially disappointing success rate: "Of the 59 respondents, only eight had their work accepted [after drafting it in a sandbox]. Seven people were asked to revise their entries to bring them into line with Wikipedia's publication guidelines but neglected to do so, and 36 did not even try to publish. Some of those people were still waiting for their feedback to get a green light, but this result can only be described as a big disappointment. ... After a resource pack on how to translate and publish a Wikipedia entry was distributed to a fresh batch of students in the following semester, the successful publication rate proved significantly higher." These English-language instructions are humorously written in the form of a game manual ("Your mission is to create a Polish translation of an English-language article and deliver it safely to the Free Encyclopaedia HQ officially known as 'Wikipedia'. Sounds easy? Think again. Wikipedia is defended by an army of Editors who guard its gates night and day to stop Lord Factoid and his minions from corrupting it with bad articles."). They are available on the author's website, together with a small list of the resulting articles (which is absent from the actual research paper).

The project was inspired by author Cory Doctorow's use of Wikipedia in a 2009 course – most likely the one listed here, although the paper fails to specify it. The absence of any discussion of Wikipedia policies, combined with the absence of any references to prior research on Wikipedia in education, makes it almost certain that the author was unaware of Wikipedia policies and of the available support (Wikipedia Education Program, etc.).


Briefly

  • Why bots should be regarded as an integral part of Wikipedia's software platform: In a new paper titled "Bots, bespoke code, and the materiality of software platforms"[2] published in Information, Communication & Society, Stuart Geiger (User:Staeiou) presents a critical reflection on the common view of online communities as sovereign platforms governed by code, using Wikipedia as an example. He borrows the term "bespoke" to refer to code that affects the social dynamics of a community, but is designed and owned separately from the software platform (e.g. Wikipedia bots). Geiger mixes vignettes describing his personal experience running en:User:AfDStatBot with discussions of the related literature (including Lessig's famous "code is law") to advocate "examining online communities as both governed by stock and bespoke code, or else we will miss important characteristics of mediated interaction."
  • "Precise and efficient attribution of authorship of revisioned content": Using a graph-theoretic approach, Flöck and Acosta investigate[3] a new algorithm that can detect the author of a part of a document that has been edited by many people. They use a units-of-discourse model to identify paragraphs, sentences and words, and their connections. The authors claim that this approach can identify an author with 95% precision, which is more than the current state of the art. Most intriguing is that, to make this comparison, they have created the first "gold standard", a hand-made benchmark of 240 Wikipedia pages and their complex authorship histories.
  • "Which news organizations influence Wikipedia?": This is the question asked in a blog post[4] by a post-doc researcher at Columbia University's Tow Center for digital journalism. Looking at the top 10 news stories of 2013 – an admittedly subjective set determined by the author – the organizations from which the citations come are analyzed. Leading the pack are the New York Times, Washington Post and CNN, but the author notes that the tail of the distribution is very long – 68% of citations are not produced by the top 10 organizations. Qualitative analysis discusses "the surprise for the news organizations that don’t make the top ten; CBS News, ABC News, FOX News [...] this top ten strikes as leaning left overall".
  • Weasels, hedges, and peacocks in Wikipedia articles: Some[who?] computational linguists find many[which?] Wikipedia articles to be a superlative[peacock prose] corpus for natural language processing applications. Weasel words, hedges, and peacock terms (like the ones in the previous sentence) are labelled by Wikipedia editors because they tend to make an article less objective. A recent study[5] leverages this work to understand general features of the way people use subjective language to increase uncertainty about the truth or authority of the statements they make. By examining a set of 200 Wikipedia articles that had been flagged for these terms, the researchers found 899 different keywords that were frequently used as peacock terms, weasel words, and hedges. A machine learning classifier that was trained on this set of keywords was able to identify other (unlabeled) articles that were written in a subjective manner, with high accuracy. In the future, approaches like these could lead to better automated detection of inappropriately subjective or unsourced statements, not only in Wikipedia articles, but also in news articles, scientific papers, product reviews, search results, and other scenarios where people need to be able to trust that the information they are reading is credible.
  • WikiSym/OpenSym call for submissions: The call for submissions (until April 20) to this year's WikiSym/OpenSym conference lists 15 research topics of interest in the Wikipedia research track. The conference has taken place annually since 2005; this year's edition will be held August 27–29, 2014, in Berlin, Germany. As in preceding years, the organizers intend to apply for financial support from the Wikimedia Foundation, addressing the open access concerns voiced in previous years with a reference to a new policy of ACM, the publisher of the proceedings.
  • Gender imbalance in Wikipedia coverage of academics to be studied with 2-year NSF grant: Sociologists Hannah Brückner (New York University Abu Dhabi) and Julia Adams (Yale University) have received a two-year grant of over US$132,000 from the National Science Foundation for a research project titled "Collaborative Research: Wikipedia and the Democratization of Academic Knowledge". As described in a press release this month, the project will study "the way gender bias affects the development of pages for American academics in the fields of computer science, history, and sociology, disciplines that vary in their gender composition. ... For instance, 80 percent of academics listed on the Wikipedia page American Sociologists are male, while in reality less than 60 percent of American sociologists are male." The researchers plan to create lists of academics in each field who satisfy the notability criteria for academics, and compare them with the actual coverage on Wikipedia.
  • Discussions about accessibility studied: A paper presented at last year's SIGCHI Conference on Human Factors in Computing Systems (CHI'13)[6] examines the English Wikipedia as one of two "case studies of two UGC communities with accessible content". Starting from uses of Template:AccessibilityDispute, and pages related to Wikipedia:WikiProject_Accessibility, the authors "identified 179 accessibility discussions involving 82 contributors" and coded them according to content and other aspects.
  • Wikipedia content "still growing substantially even in later years": A preprint[7] by two researchers from Stanford University and the London School of Economics analyzes the history of around 1500 pages in the English Wikipedia's Category:Roman Empire over eight years, providing descriptive statistics for 77,671 (non-bot) edits for articles in that category. The authors find that "content is still growing substantially even in later years. Less new pages are created over time, but at the page-level we see very little slow-down in activity." They identify a "key driver of content growth which is a spill-over effect of past edits on current editing activity" – that is, articles that have been edited more often in the past attract more editing activity in the future, even when controlling for factors such as the page's "inherent popularity", suggesting a causal relationship.
  • Discover "winning arguments" in article histories, and notify losing editors: Winning the best paper award at last year's European Semantic Web Conference (ESWC), three authors from the French research institute INRIA presented (video)[8] "a framework to support community managers in managing argumentative discussions on wiki-like platforms. In particular, our approach proposes to automatically detect the natural language arguments and the relations among them, i.e., support or challenges, and then to organize the detected arguments in bipolar argumentation frameworks." Specifically, they analyzed the revision history of the five most revised pages on the English Wikipedia at one point (e.g. George W. Bush), extracting sentences that were heavily edited over time while still describing the same event. To these "arguments" they apply an NLP technique known as textual entailment (basically, detecting whether the assertion of the new version of the sentence logically follows from the first version, or whether the first version was "attacked" by a subsequent editor by deleting or correcting some of the information). The paper focuses mostly on establishing and testing this methodology, without detailing the actual results derived from the five revision histories (i.e. which arguments actually won in those cases), but the authors promise that "this kind of representation helps community managers to understand the overall structure of the discussions and which are the winning arguments." Also, they point out that it should make it possible to "notify the users when their own arguments are attacked."
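
As a rough illustration of the keyword-based idea in the weasel-words item above, here is a minimal sketch of flagging text by cue phrases. It is not the study's method: the researchers trained a machine learning classifier on 899 keywords drawn from editor-flagged articles, whereas this example uses plain phrase matching. The cue lists, the function names find_cues and is_subjective, and the threshold are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical, hand-picked cue phrases for illustration only; the study
# derived its 899 keywords from 200 editor-flagged Wikipedia articles.
CUES = {
    "weasel": ["some people say", "it is believed", "critics claim"],
    "hedge": ["may", "might", "possibly", "perhaps"],
    "peacock": ["legendary", "world-class", "renowned", "superlative"],
}


def find_cues(text):
    """Return a sorted list of (category, phrase) pairs found in text."""
    lowered = text.lower()
    hits = []
    for category, phrases in CUES.items():
        for phrase in phrases:
            # Word boundaries keep "may" from matching inside "mayor".
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                hits.append((category, phrase))
    return sorted(hits)


def is_subjective(text, threshold=2):
    """Flag text that contains at least `threshold` distinct cue phrases."""
    return len(find_cues(text)) >= threshold


if __name__ == "__main__":
    sample = "Some people say the renowned mayor may resign."
    print(find_cues(sample))
    print(is_subjective(sample))
```

A fixed phrase list like this will miss context (for example, "may" as a month name), which is one reason a trained classifier is the more plausible route for real use.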

References

  1. ^ Piotr Szymczak: Translating Wikipedia Articles: A Preliminary Report on Authentic Translation Projects in Formal Translator Training. In: Acta Philologica 44 (Warszawa 2013) http://acta.neofilologia.uw.edu.pl/archiwum/acta44.pdf p.61ff
  2. ^ Geiger, R. Stuart. "Bots, bespoke code, and the materiality of software platforms". Information, Communication & Society: 1–15. doi:10.1080/1369118X.2013.873069. ISSN 1369-118X. Closed access; author's copy at http://stuartgeiger.com/bespoke-code-ics.pdf
  3. ^ Fabian Flöck, Maribel Acosta: WikiWho: Precise and Efficient Attribution of Authorship of Revisioned Content. http://www.aifb.kit.edu/web/Inproceedings3398
  4. ^ Fergus Pitt: Which News Organizations Influence Wikipedia? January 17, 2014, http://towcenter.org/blog/which-news-organizations-influence-wikipedia/
  5. ^ Vincze, Veronika: Weasels, Hedges and Peacocks: Discourse-level Uncertainty in Wikipedia Articles. http://www.aclweb.org/anthology/I/I13/I13-1044.pdf
  6. ^ Kuksenok, Katie; Brooks, Michael; Mankoff, Jennifer (2013). "Accessible Online Content Creation by End Users". Proceedings of the SIGCHI Conference on Human Factors in Computing Systems. CHI '13. New York City: ACM. pp. 59–68. doi:10.1145/2470654.2470664. ISBN 978-1-4503-1899-0.
  7. ^ Aleksi Aaltonen, Stephan Seiler: Cumulative Knowledge and Open Source Content Growth: The Case of Wikipedia http://faculty-gsb.stanford.edu/seiler/documents/wiki_dec2013_03.pdf
  8. ^ Elena Cabrio, Serena Villata, and Fabien Gandon: A Support Framework for Argumentative Discussions Management in the Web. http://eswc-conferences.org/sites/default/files/papers2013/cabrio.pdf



The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0