The Signpost
Single-page Edition
WP:POST/1
20 February 2023

News and notes
Terms of Use update, Steward elections, and Wikipedia back in Pakistan
In the media
Arbitrators open case after article alleges Wikipedia "intentionally distorts" Holocaust coverage
Disinformation report
The "largest con in corporate history"?
Essay
Machine-written articles: a new challenge for Wikipedia
Tips and tricks
All about writing at DYK
Featured content
Eden, lost.
Gallery
Love is in the air
Traffic report
Superbowl? Pfft. Give me some Bollywood! Yours sincerely, the world
From the archives
5, 10, and 15 years ago: Let's (not) delete the Main Page!
Cobwebs
Editorial: The loss of the moral high ground
Humour
The RfA Candidate's Song
 

2023-02-20

Terms of Use update, Steward elections, and Wikipedia back in Pakistan

By Adam Cuerden, Andreas Kolbe, Bri, EpicPupper, Red-tailed hawk and Smallbones

Terms of Use update

The logo of the Wikimedia Foundation

Consultations expected to lead to an update of the Wikimedia Foundation's Terms of Use will begin as early as this Tuesday, February 21, according to WMF Legal. The discussions will take place on Meta-Wiki.

The updates are necessary in order to:

Other possible changes include strengthening enforcement of the requirement that paid editors must declare their employers, clients, and affiliations. S

Results of the Universal Code of Conduct Enforcement Guidelines vote

Wikimedia Foundation Board Member Shani Evenstein announced the results of the vote on the Wikimedia-l mailing list on 13 February 2023:

Today the results of the Universal Code of Conduct Enforcement Guidelines vote were tallied. We are pleased to report the results show that the Enforcement Guidelines are strongly supported by the community, with 76% of participants voting in support of the Enforcement Guidelines.

3,097 voters from 146 Wikimedia communities participated. Statistics for the vote are available. A more detailed summary of comments submitted during the vote will be published soon.

Next, the results and comments collected during this vote will be submitted to the Board of Trustees for their review. It is expected that the Board of Trustees review process will complete in March 2023, at which time another update will be published. AK

Steward elections and confirmations underway

Stewards are a global group of users with complete access to the wiki interface on all public Wikimedia wikis

The Steward elections are underway. New candidates for Stewardship are:

The Steward confirmations – an annual performance evaluation by the community that determines whether or not existing Stewards will retain their status for another year – are also currently underway.

Voting for the election and confirmation processes will remain open until 26 February 2023, 21:00 (UTC). AK

Prime Minister of Pakistan intervenes to unblock Wikipedia

Man in his seventies wearing a great jacket, blue shirt and dark tie
Shehbaz Sharif, the Prime Minister of Pakistan, decided that Pakistan had thrown out the baby with the bathwater and ordered access to Wikipedia restored

Now you see it, now you don't – Wikipedia had hardly been blocked by Pakistan before it was unblocked.

Shehbaz Sharif, the Prime Minister of Pakistan, stepped in and convened a committee composed of the Minister for Law and Justice, the Minister for Economic Affairs and Political Affairs and the Minister for Information and Broadcasting to examine the matter.

The committee took the view that –

Wikipedia was a useful site/portal which supported dissemination of knowledge and information for the general public, students and the academia. Blocking the site in its entirety was not a suitable measure to restrict access to some objectionable contents/sacrilegious matter on it. The unintended consequences of this blanket ban, therefore, outweigh its benefits.

The government subsequently barred the Pakistan Telecommunication Authority (PTA) from blocking any website without first consulting the Ministry of Information Technology and Telecommunication.

Wikimedia Foundation General Counsel Stephen LaPorte said on the Wikimedia-l mailing list:

We received news yesterday that the Pakistan Telecommunications Agency was directed to restore access to Wikimedia projects, including Wikipedia, in Pakistan. Our internal traffic reports confirm that the block has been lifted and we've updated our public statement.

We're happy that the people of Pakistan can continue to use Wikipedia to access knowledge and share their experiences and culture with the world. These situations are a reminder of the importance of the work of our movement and the vital role our communities play in building the largest collection of open knowledge in history. It also underscores the need to protect people's right to access free knowledge and participate in contributing to its growth globally. Thank you for your support. We'll continue to keep you updated if there are any new developments.

AK, R

For media coverage of this event, see this issue's "In the media".

AI takeoff begins: the first wave of bullshit

A bull with broad horns standing in a field, regarding the photographer.
Another BS generator without a sense of what is real and what is fake

Last issue, we reported on AI-generated articles on Wikipedia. It was a bit of fun. Find the AI-generated article! The impostor is sus!

If only it had remained that way: English Wikipedia editor Rory Jaffe (rsjaffe) has found a number of them, including Draft:Cow Farts – not actual bull shit, but pretty close. Here's what he says about his discovery technique.

I've been looking for "tells" before testing [with a classifier tool that can detect AI generated text with some accuracy]. The text is more "lifeless" than human text typically is, tends to have uniform length sentences, and may have a paragraph at the end that is a summary paragraph. The text is also unlikely to have in-line references. This doesn't catch all of them but tends to be a high-likelihood way of finding them. Some of these AI-generated articles also have AI-generated references. The references are almost always fake: the AI confabulates reasonable-sounding references!

Articles or drafts tagged by Jaffe and reviewed by The Signpost were created as early as 17 December 2022 (Draft:Brisbane Lord Mayor YAC). The topics tagged by Jaffe and others range from the whimsical and neologism-ic (Draft:Twinkanic Awareness, Draft:Sharks should be saved from humans) to the plausibly- and obviously-promotional, and appear to be created by many editors, including single-purpose accounts that may be using them for promotion. Take Draft:Dachshund harness, "an essential accessory for owners of this unique breed of dog", which has a list of references at the end related to shopping for such a device. Given affiliate marketing exists, this could be promotional. Or it could be purely accidental. We don't know. Some of the rest may be deleted by the time you read this – a few examples are retained permanently at User:JPxG/LLM dungeon.

The Signpost asked Jaffe "Do you have a personal reaction to the quality of the AI generated articles? Briefly, are you for or against this method of creation?" He told us this:

Strongly against. These language generators are good at stringing together reasonable-sounding text, but:

1. Don't rely solely on what Wikipedia recognizes as reliable sources.

2. Don't have a sense of what is real and what is fake.

3. Will fabricate information to fill in gaps.

4. Cannot identify where any specific bit of "information" came from.

So the text can fail WP:RS and WP:V, yet may look very convincing.

However, they're good at helping to break "writer's block", by giving an example of how to write something. So I see having some text generated by ChatGPT would be helpful as a writing prompt, but without using any of the generated text directly—just looking at it and then writing a well-researched article similar to the prompt.

More on Jaffe's experience and conclusions on this topic can be found in the Essay in this issue.

We asked ChatGPT for a commentary as well. It said:

It's interesting to see that people are now using AI to generate articles on Wikipedia, but it's also concerning as this could lead to a proliferation of inaccurate or fake information on the platform. It's good to see that editors like Rory Jaffe are using techniques to identify AI-generated text, but it's clear that the AI-generated articles are becoming more sophisticated in their structure and references, making it harder to detect them.

It's important for Wikipedia to maintain its high standards of accuracy and reliability, so it's crucial that the platform continues to monitor and regulate the use of AI-generated content. While AI can be a valuable tool for assisting with research and writing, it should never be a substitute for human expertise and critical thinking.

B, AC

Wikimedia Enterprise financial report

Wikimedia Enterprise logo, a stylised blue/violet/red squirrel looking right, with the words WIKIMEDIA ENTERPRISE written below in black
The Wikimedia Enterprise project sells API services to Google and provides them free of charge to the Internet Archive. The team has not released information on any additional customers acquired since then.

Wikimedia Enterprise has published its first detailed financial report, complete with a summary of product updates.

Revenue in the 2022 calendar year (from subscription and professional services) totaled $3.12 million, vs. expenses of $3.18 million ($1.06 million in cost of services and $2.12 million in operating expenses), for a net loss of $60,000.

The Enterprise team points out that the present report for the 2022 calendar year should be considered a "beta" edition. The Wikimedia Foundation normally operates on a financial year starting in July and finishing the following June. But as January 1, 2022 marked the official start of commercial operation of the Enterprise project, this first report covers the 2022 calendar year. For future financial reports the Enterprise team intends to align with the normal financial reporting schedule of the rest of the Wikimedia Foundation. The next report will be published in late 2023.

The report adds that:

As the LLC is wholly owned by the Wikimedia Foundation, all of the financial information presented here will also be included within the Wikimedia Foundation's audited financial statements and will be in the next Wikimedia Foundation "Form 990" filing as it relates to fiscal year 2021–2022, and future years.

The Enterprise team says it is very happy with its first year in business and looks forward to growth and profitability this year:

In 2023, our second year of operations, with the addition of new customers and also new features, both revenue and expenses are expected to increase – but revenue growth is expected to outpace expenses. By comparison to other startup commercial API projects, to reach this stage within one year of operations is extremely rapid progress. The governance of these and all other commercial customer relationships is consistent with how the Wikimedia Foundation treats large corporate donations.

No information has been released on how many paying and non-paying customers Wikimedia Enterprise has at the moment. Corresponding inquiries on Meta-Wiki have been answered as follows:

As stated in the original press release, Google and the Internet Archive are indeed the first to receive paid and free access (respectively) but we have not publicized the subsequent customers (paid or free) who have signed-up to the service. ... Maintaining a public and comprehensive list of paying and free/trial customers would look like advertising/promotion of those customers, and also introduce a new privacy (and potentially security) problem: i.e. in the same way that it would be inappropriate to make a public list of "all people who have used the Wikidata Query Service this month" (for example) – it goes against our privacy culture. Nonetheless, we do intend to be making "use case" blog posts – which will describe how some users (either general categories or individual cases with their permission) are benefiting from the service in the real-world.

Video recording of the 10 February Zoom meeting

The Enterprise API features are designed with large commercial reusers in mind – primarily search engines and voice assistants, although education (question-and-answer modules), finance and the fast-growing AI sector represent additional market opportunities the team may explore.

But other kinds of reusers, including individual volunteer Wikimedians, can also benefit. There are several access methods to the datasets available at no cost/no registration, including a free Wikimedia Enterprise account via the project's homepage. Ongoing updates for the community can be found on the project's MediaWiki page.

The publication of the financial report was followed by a public Zoom meeting on 10 February, a recording of which is available. Click the picture on the right or go to the Wikimedia Enterprise talk page on Meta-Wiki, which also includes a list of questions discussed during the meeting and their approximate timecodes. The file description in Commons even includes an automated transcript. – AK

Tides Advocacy

In response to multiple inquiries, the Wikimedia Foundation has posted a breakdown of how and when Tides Advocacy spent the $4.223 million Annual Plan Grant portion of the $8.723 million the organisation received from the WMF in the 2019–2020 financial year. – AK

Brief notes

Episode 8 of Wikimove's podcast series covers movement strategy and features three guests from Africa.




2023-02-20

Arbitrators open case after article alleges Wikipedia "intentionally distorts" Holocaust coverage

By Andreas Kolbe, Adam Cuerden, and Bri

Wikipedians rebut paper alleging "intentional distortion" of Holocaust history

Jan Grabowski of the University of Ottawa, one of the two authors of the paper

An essay published on 9 February 2023 in The Journal of Holocaust Research and reported on by Haaretz and Polish daily Wyborcza, as well as San Diego Jewish World and Ynet, alleges that Wikipedia engages in "intentional distortion of the history of the Holocaust". The abstract of the essay, written by Jan Grabowski of the Department of History at the University of Ottawa and Shira Klein of the Department of History at Chapman University in Orange, California, says:

This essay uncovers the systematic, intentional distortion of Holocaust history on the English-language Wikipedia, the world’s largest encyclopedia. In the last decade, a group of committed Wikipedia editors have been promoting a skewed version of history on Wikipedia, one touted by right-wing Polish nationalists, which whitewashes the role of Polish society in the Holocaust and bolsters stereotypes about Jews. Due to this group's zealous handiwork, Wikipedia's articles on the Holocaust in Poland minimize Polish antisemitism, exaggerate the Poles' role in saving Jews, insinuate that most Jews supported Communism and conspired with Communists to betray Poles (Żydokomuna or Judeo–Bolshevism), blame Jews for their own persecution, and inflate Jewish collaboration with the Nazis. To explain how distortionist editors have succeeded in imposing this narrative, despite the efforts of opposing editors to correct it, we employ an innovative methodology. We examine 25 public-facing Wikipedia articles and nearly 300 of Wikipedia’s back pages, including talk pages, noticeboards, and arbitration cases. We complement these with interviews of editors in the field and statistical data gleaned through Wikipedia's tool suites. This essay contributes to the study of Holocaust memory, revealing the digital mechanisms by which ideological zeal, prejudice, and bias trump reason and historical accuracy. More broadly, we break new ground in the field of the digital humanities, modelling an in-depth examination of how Wikipedia editors negotiate and manufacture information for the rest of the world to consume.

On 13 February 2023, Wikipedia's primary disciplinary body, the Arbitration Committee, took the unusual step of initiating a case request sua sponte in response to the essay, "invoking its jurisdiction over all matters previously heard and exercising its authority to revisit any proceeding at any time at its sole discretion." The topic area – including many of the edits and behaviours discussed by Grabowski and Klein – has been the subject of multiple arbitration proceedings before, from the 2009 Eastern European mailing list case to the 2019 Antisemitism in Poland case.

On 15 February 2023, Wyborcza (the Polish newspaper that carried Grabowski and Klein's summary of their essay) published a rebuttal by Piotr Konieczny of the Department of Media and Social Informatics at Hanyang University, one of the Wikipedians (User:Piotrus) named by Grabowski and Klein. Piotr said the essay contained many assertions of fact that were not borne out by edit histories recorded on Wikipedia, as well as instances of selective quoting. For those who don't subscribe to Wyborcza – the paper is paywalled – the text of the rebuttal is available here. A longer, English-language response by Piotr raising some of the same issues is here. Volunteer Marek, another editor named in the essay, has also published a multi-part response in English on his Substack.

Watch out for an independent review of the paper in the upcoming issue of the Signpost's monthly "Recent research" section. In the meantime, see also previous Signpost coverage of similar complaints raised in 2019. – AK

"Why we should be wary of Wikipedia"

Investigative journalist Russ Baker

This is the premise of an article series investigative journalist Russ Baker kicked off on 6 February 2023 with a piece on his whowhatwhy.org website.

Baker is a veteran reporter who has written for top publications like The New Yorker and The Washington Post. He has tussled with the Church of Scientology. In 2005, he won the Deadline Club award for his exclusive reporting on George W. Bush's military record. Baker was among the first to cast doubt on Colin Powell's now-infamous presentation on Iraq at the United Nations – at the time a very unpopular stance – and among the first to make Americans aware of the impending genocide in Rwanda. But presently, he is concerned about Wikipedia's biographies.

Baker notes that discussions of bias on Wikipedia have generally focused on its alleged "white, American" bias as well as its alleged "leftist" bias:

But none of these critiques really get at what I'm talking about: how professional or amateur "hit men" can infiltrate Wikipedia and go after individuals and ruin them in the public eye.

Some years ago, when I began researching this, I found very little online about this phenomenon, despite the fact that I knew a fair number of individuals who had been victims of the practice. Now that I look again, I still see no sign that this problem is being addressed or even vestigially discussed.

What this means is, nobody is minding the store to make sure that we don’t end up in some type of artificial informational construct that edits the facts about powerful actors and institutions to conform to a subjective agenda instead of reality.

Baker feels that anti-establishment actors like himself are particularly likely to suffer, given that Wikipedia uses the mainstream media as its arbiter, and these media are in many ways an integral part of the establishment.

The fact is, anyone who is out there "making trouble" for the system doesn't stand a chance. Why? Because it would take a relentless, inhuman vigilance to battle those persistent and tidal forces bent on controlling the narrative … And most of us don't have the time, expertise, or energy to do that.

Also, because, for someone to make their case, they have to prove that good things have been said about them … by the establishment.

In other words, if The Washington Post likes you, then you appear in a positive light on Wikipedia. If it doesn't, then what the public sees on the platform is a person or entity it should apparently not like or trust.

In the second part, published a week later, Baker looks at his own Wikipedia biography, which he says is highly selective in a way that is unfavourable to him.

And yes, I received several scathing reviews from establishment organs, but my Wikipedia page never quoted any of the good ones I received — and from prominent people. In fact, they implied there were none. Here’s what you won’t find on Wikipedia:

  • One of the most important books of the past ten years. — Gore Vidal
  • An investigative gem filled with juicy revelations. — Sydney Schanberg, Pulitzer Prize winner, The New York Times
  • A tour de force… Family of Secrets has made me rethink even those events I witnessed with my own eyes. — Dan Rather
  • Russ Baker's work stands out for its fierce independence, fact-based reporting, and concern for what matters most to our democracy… A lot of us look to Russ to tell us what we didn't know. — Bill Moyers
  • This is the book people will be mining for years to come. — David Margolick, Newsweek and Vanity Fair

There is no reason to believe these quotes aren't genuine – Schanberg, for example, joined Baker for readings of the book in question. The problem is that these are "Praise for ..." quotes from a marketing blurb rather than quotes taken from published reviews. Wikipedians would generally avoid citing marketing materials, and look for independently published reviews in the press. So, is Baker merely whining?

Well, no. Reading the fairly sympathetic Boston magazine article quoted in Baker's biography, it's hard to escape the notion that editors selected quotes to construct a narrative completely at odds with the overall tenor of the cited article. The Boston article concludes by asking, in light of important stories broken by Baker in the past, "which is more dangerous, listening to Russ Baker, or ignoring him?" – AK

"Share profits with authors!"

OpenAI, the creators of the generative pre-trained transformer (of which ChatGPT is one) – reportedly paying its clickworkers in India and Africa $2 an hour

This is the provocative title of an article in Germany's Der Tagesspiegel newspaper, opining that generative models like ChatGPT that create text, images and music are committing "data theft" and leaving creators "naked". The article discusses the unsung contributions of the many:

Wikipedia authors, book authors, illustrators, editors, photographers. Their work creates the raw materials that then enters an industrial process: the training data used to feed the AI.

Tech companies like Google and Amazon have used free and open internet content like the English-language Wikipedia as a quarry for years, without giving the authors or organisations a share. The paltry sums that Google and Amazon donate to the Wikimedia Foundation are dwarfed by the economic benefit these corporations derive from the online encyclopaedia.

Now, it has always been an inherent flaw of the commons idea that profit-oriented actors are as welcome to benefit from non-profit work as the general public. The "tragedy of the commons" dilemma is well known from economics. One cannot forbid Amazon to train its voice assistant Alexa with Wikipedia texts – or Wikipedia would have to jettison its foundational principles overnight.

But the relentlessness with which tech companies graze the digital commons and use it to feed their own business models raises the question under what circumstances commons will continue to be produced in the future. Who will maintain Wikipedia articles if they are used for commercialised search queries or answer modules? Who will still write books if language models glue together set pieces into third-rate novels and publishers use them to fill their portfolios?

Der Tagesspiegel proposes a compensation system that gives authors an appropriate share in AI systems' profits, citing Germany's long-established VG Wort (cf. Authors' Licensing and Collecting Society) as an example. Under this scheme, authors register and are then routinely compensated with fees collected from re-users of their works, according to a complex allocation formula.

After all, no one would have a street artist paint their portrait and then, after taking a digital picture of it and editing it with an AI-based Instafilter, tell the painter, without paying, "Thanks a lot, that was fun!" Respect for art is also expressed through decent payment.

(VG Wort has previously indicated that Wikipedia would be eligible for payments under its existing system. However, in a 2011 poll the German Wikipedia community overwhelmingly voted against participating in the scheme as a website, although some individual Wikipedia editors were collecting payments separately for "their" articles, amounting to 300 euros in one case. See previous Signpost coverage: "German Wikipedians reject author payments scheme".)

The Tagesspiegel article ends by noting that OpenAI, the developer of ChatGPT, employs thousands of clickworkers in Uganda, Kenya and India, who label potentially offensive text (including violent or sexual text) to help train the models. In Kenya, where the average wage is about $18 per day, these workers are paid less than $2 an hour. – AK

Wikipedia blocked and unblocked in Pakistan

As discussed in this issue's News and notes, Wikipedia is back in Pakistan after a fairly brief block. The Prime Minister found that "the unintended consequences of this blanket ban outweigh its benefits", and formed a new committee to look at technical measures for selectively blocking specific objectionable content. However, back in 2015 Wikipedia switched to HTTPS, specifically to make it more difficult for ISPs and other men-in-the-middle to know what part of Wikipedia you're browsing, which complicates any attempt to block individual pages. Numerous outlets reported on the subject, including Dawn (again here), Bloomberg News, ABC News, Al Jazeera, and NPR. – AC

Wikimedia Foundation vs. NSA

The Washington Examiner reports that next week, the United States Supreme Court justices will decide whether to take up the longstanding case between the Wikimedia Foundation and the National Security Agency. "To this day, no public court has determined whether upstream surveillance complies with the Constitution. If the government can obtain dismissal here, it will have every incentive to make overstated or exaggerated claims of secrecy to close the courthouse doors on suits like Wikimedia's – suits seeking accountability for government overreach or abuse in the name of national security," the article's author, Bob Goodlatte, states. – AK

UPDATE: The Supreme Court denied the Wikimedia Foundation's petition on 21 February 2023, marking the end of the case. – AK

In brief

Former Wikimedia CEO Katherine Maher



Do you want to contribute to "In the media" by writing a story or even just an "in brief" item? Edit next week's edition in the Newsroom or leave a tip on the suggestions page.





2023-02-20

The "largest con in corporate history"?

By Smallbones


2023-02-20

Machine-written articles: a new challenge for Wikipedia


Interview with ChatGPT

Artificial Intelligence.

This interaction with ChatGPT was recorded on January 29, 2023

Author: Are any Wikipedia articles written by ChatGPT?

ChatGPT: No, ChatGPT is not capable of writing articles on Wikipedia. It can only generate text based on the input it receives and its training data.

Author: What should I do with a Wikipedia article someone wrote using ChatGPT?

ChatGPT: Articles generated by language models like ChatGPT are not suitable for use on Wikipedia as they are not written by humans and lack context and accuracy. If you have come across such an article on Wikipedia, you should report it to the Wikipedia community for review and potential removal, as Wikipedia has strict policies regarding the sources and reliability of the information on the platform.

My Experience

Near the end of January, I ran across some odd-sounding articles in draftspace. All the articles had the same structure and layout, even though they were about different things. The articles also had impeccable English, though the author's comments did not. Suspecting something was up, I found an artificial intelligence (AI) article detector that then pronounced all these articles machine generated with greater than 99.9% confidence. Not knowing what to do, I went to the calm environ of the Administrators' noticeboard for incidents. That discussion is worth reading, as it shows a bunch of knowledgeable Wikipedians struggling and debating how to deal with these articles. Were they hoaxes? Copyright violations? Good for publication? How does generated text fit within Wikipedia's requirements for articles?

Policies?

As it turns out, an effort was already underway to develop a policy regarding articles written by ChatGPT and its relatives: Wikipedia:Large language models. There, and in its associated talk page, you can see the reasoning related to these articles. In short, AI-generated text is not reliably correct, may not have a neutral point of view, needs verification, can occasionally violate copyright, and can downright lie. This is all in its inherent nature. It is fed information from a large corpus of text, much of which would not meet Wikipedia's sourcing and neutrality criteria, and it synthesizes its output without regard as to whether the text maps to a real source. To quote the ChatGPT general FAQ: "These models were trained on vast amounts of data from the internet written by humans, including conversations, so the responses it provides may sound human-like. It is important to keep in mind that this is a direct result of the system's design (i.e. maximizing the similarity between outputs and the dataset the models were trained on) and that such outputs may be inaccurate, untruthful, and otherwise misleading at times."

Finding More

I then started going through recent drafts and new articles looking for text reminiscent of the text I had seen in the first articles I identified. It didn't take long to find more. The current limiting factor is that I don't have the tools for rapidly reviewing Drafts as I do for new articles as a new page patroller, and I don't have the resources (including time and patience) needed to do this consistently and daily.

I've put the {{AI-generated}} template on those articles and had not one author disagree with the finding. You can search for the template with hastemplate:AI-generated in the Wikipedia search box. Expand the search to Drafts to see the drafts so marked. About sixty articles have been tagged. Several of the previously tagged articles have either been deleted or de-tagged once the generated text was replaced with real text, but many remain. You can then read those remaining examples and get a feel for AI-generated articles.

Editor's note: If you are reading this in the future and none of these are available, some representative drafts have been retained more permanently at User:JPxG/LLM dungeon.

I have been conservative in identifying articles: I only tested articles that had a similar appearance, and I required greater than 99% assurance that each was machine-generated. I am sure I've missed many more articles. I was looking for typical phrases like "in conclusion" starting the last paragraph, use of the article's title repetitively without abbreviation or variation, and consistent sentence and paragraph length. A more sophisticated AI user would use better prompts to the AI software and produce harder-to-detect output; the ones I've found typically were produced by asking ChatGPT something like "Write a Wikipedia article about XXXX."
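The surface cues described above can be roughed out in a few lines of Python. This is only an illustrative sketch, not the method actually used for tagging: the function name, the returned keys, and the choice of standard deviation as a measure of "consistent" length are all assumptions made for the example, and no threshold is implied.

```python
import re
import statistics


def llm_style_signals(title: str, text: str) -> dict:
    """Compute rough versions of the cues mentioned in the essay (hypothetical helper)."""
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", " ".join(paragraphs)) if s]
    sentence_lengths = [len(s.split()) for s in sentences]
    paragraph_lengths = [len(p.split()) for p in paragraphs]
    return {
        # Does the final paragraph open with "in conclusion"?
        "last_paragraph_in_conclusion": bool(paragraphs)
        and paragraphs[-1].lower().startswith("in conclusion"),
        # How often is the full title repeated verbatim?
        "title_mentions": text.lower().count(title.lower()),
        # Low spread suggests very uniform sentence and paragraph lengths.
        "sentence_length_stdev": statistics.pstdev(sentence_lengths) if sentence_lengths else 0.0,
        "paragraph_length_stdev": statistics.pstdev(paragraph_lengths) if paragraph_lengths else 0.0,
    }
```

A helper like this could only flag drafts for a closer human look; it says nothing definitive about how a text was produced.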

I test for articles typically using https://openai-openai-detector.hf.space/, though other sites exist, including https://detector.dng.ai/, https://gptzero.me/, https://platform.openai.com/ai-text-classifier, and https://contentatscale.ai/ai-content-detector/. Before testing, I remove headings, inline references, and other text and markup that appears to have been added after text generation, as those can confuse the analyzer.
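The clean-up step described above can be sketched as a small Python function. It is a minimal illustration that assumes the draft is available as raw wikitext; the function name and the regular expressions are my own assumptions, and stripping wikilink brackets and bold/italic quote marks goes slightly beyond the headings and inline references mentioned in the text.

```python
import re


def strip_for_detector(wikitext: str) -> str:
    """Remove markup that was probably added after generation (hypothetical helper)."""
    # Self-closing references such as <ref name="a" /> go first.
    text = re.sub(r"<ref[^>/]*/>", "", wikitext)
    # Then paired references, including their contents.
    text = re.sub(r"<ref[^>]*>.*?</ref>", "", text, flags=re.DOTALL)
    # Section headings such as "== History ==" on their own line.
    text = re.sub(r"^=+[^=\n]+=+[ \t]*$", "", text, flags=re.MULTILINE)
    # Assumption: keep only the visible label of [[target|label]] links.
    text = re.sub(r"\[\[(?:[^|\]]*\|)?([^\]]+)\]\]", r"\1", text)
    # Assumption: drop ''italic'' and '''bold''' quote marks.
    text = re.sub(r"'{2,}", "", text)
    # Collapse the blank lines left behind.
    return re.sub(r"\n{3,}", "\n\n", text).strip()
```

The result is plain prose that can be pasted into whichever detector is being used; templates, tables, and other markup are not handled by this sketch.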

Problem?

Is this a real problem? I believe it is. Many of the articles sound reasonable but may have serious errors. The conversation at the administrators' noticeboard includes an analysis of a generated article on geckos. The article contains a lot of specific plausible-sounding information (e.g., size range), much of which is wrong.

ChatGPT will even provide references if requested, but those references are synthesized from its input text and, while sounding correct, usually do not point to real articles. For example, when I asked it for references on an article ChatGPT wrote for me on Sabethes cyaneus (a mosquito), one of the references was "Sabethes cyaneus" (Encyclopedia of Life): https://eol.org/pages/133674. That page does exist, but is for Clavaria flavopurpurea, a fungus. Another reference it provided also had a link, but the link pointed to an article about a mink, and the reference itself was fictional.

There are efforts to improve these programs, and I am sure that eventually they will be successful. Currently, however, those efforts fall well short. One such effort, "Elicit" (elicit.org), only searches research papers and summarizes them. I asked Elicit "What are the characteristics of Sabethes cyaneus?" It summarized one reference as "Sabethes cyaneus is a species of frog."

These false but plausible answers are an inherent property of the current models; the phenomenon is called "hallucination".

Even once these programs improve, there will still be significant concerns limiting the direct use of generated text, such as:

  1. Did the program only use sources acceptable to Wikipedia?
  2. Is the information up to date?
  3. Can the program identify correct references? (One of the biggest hurdles, as the inherent nature of current models is that the output is not linked to a specific source.)
  4. Will the program avoid hallucinating?

However, programs like ChatGPT are great for generating ideas for articles and helping to "mock up" a good article, if the user takes a sophisticated view of the output, using it more for inspiration than for a source of truth. The proposed policy Wikipedia:Large language models has additional information on how these programs can be used to improve Wikipedia.




2023-02-20

All about writing at DYK

This article was originally published at User:Theleekycauldron/Essays/Assessing DYK hooks

Summary
  • The best hooks are the ones that leave the reader wanting to know more, leading them to click on the bolded article to learn more. Making your hooks concise and punchy is a fantastic aid in this.
  • The DYK statistics pages can tell you what our readers like to see, though this ranges between a useful tool and something to be taken with a grain of salt. Reading through past reviews can help you understand DYK's attitudes towards certain hooks.
  • The outcomes of some hooks are determined by their topic, rather than their substances – it is sometimes helpful to put that aside when reviewing and writing a hook. Avoid relying solely on an eye-catching topic to craft a hook.

Hi there, I'm theleekycauldron! I frequent Did you know (DYK), the section of the Main Page that publishes "hooks" designed to reel readers into clicking on newly created and expanded content. I've worked with over 2,500 DYK nominations as builder of the sets of nominations for the Main Page, I maintain monthly statistics pages for DYK hooks, and I have some 50 hooks of my own. So, I thought I'd share some insight about how I approach writing hooks and assessing whether they're ripe for the Main Page; I hope it proves helpful in your own adventures of hook writing, reviewing, promoting, and – if you're an admin – promoting again.

DYK has always had a rather fraught and subjective relationship with how it accomplishes its goals: it wants its hooks to be interesting, and catchy, and likely to draw in readers, but it also wants to make space for editors who create content of all kinds (regardless of whether some part of it can be boiled down to 200 characters). Those tensions influence exactly where the project draws the line on what "interesting" means. Its current guideline is codified at part 3a of the main ruleset, which states that the hook should be "likely to be perceived as unusual or intriguing by readers with no special knowledge or interest". Despite a 2022 request for comment clarifying and strengthening the wording of this guideline, what actually counts as "interesting" remains a perennial flash point at DYK, with lots of people weighing in through different lenses.

In this essay, the top-level headings are sorted in descending order of importance, as are each of the sub-level headings within them. Also, this essay contains several examples of hooks to emulate (in green), hooks to avoid (in a pinkish red), and hooks to consider with caution (in yellow). To avoid embarrassment to our hardworking nominators, reviewers, promoters, and promoting admins, all of these hooks and their boldlinked articles are my own work.[a] If you would like to submit a hook of yours that you believe makes for a better example, do leave it on the talk page! Bonus points and credit if you submit a hook of yours that you believe makes for a good example of a hook to avoid.

Literary quality

The bare bones of a good hook

raise questions ...

... that Demi Lovato started an Internet feud with a frozen yogurt shop—and lost?

... instead of neat facts

... that a resolution introduced into the Nebraska Legislature by Joni Albrecht praised Julie Schmit-Albin as "never one to let a public official waffle on pro-life legislation"?

and don't be boring

... that "leek rust", caused by Puccinia allii, has also caused significant losses for garlic farmers?

What I consider most important to evaluating a DYK hook is assessing it in terms of its quality as a kind of very-short-form literary work. DYK is a part of Wikipedia, and Wikipedia is an encyclopedia, but DYK doesn't look to inform comprehensively the way an encyclopedic article does; it looks to be "hooky" via being informative.[b] That's best thought of as narrative storytelling, rather than strictly MOS-compliant article prose.

Like any good story, hooks are successful when they create a tension in the mind of the reader, a tension that keeps them reading until it is resolved in a satisfying manner. What's different about DYK is that the hook is what's meant to create that literary tension, and clicking through to read the bolded article is what's meant to resolve that tension. The hook leaves our reader wanting; the article gives them what they're looking for, and along the way, we've tricked them into learning about something new and exciting. When I look at any hook, I always try to ask myself these three questions:

  1. Which aspects of this hook create tension, sparking a question in the mind of the reader?[c]
  2. Why should the reader want to know more about this story, rather than going about their day as normal?[d]
  3. When the reader clicks through to read the bolded article, what are they looking for?[e]

These questions can be pretty hard to tease apart sometimes, so you can answer them in one fell swoop if that works for you. However, when I can't come up with a convincing answer to at least one of these three questions, I try – at the very least – to ask the hook writer what it is they're trying to convey. Maybe the hook can be workshopped into something presentable, maybe a different angle can be found, or maybe that line of thought just needs to be scrapped entirely. These three questions are the only hard lines I draw when it comes to hook interestingness: every hook needs to come up with some answer to them. Otherwise, we're not airing a hook, we're just publishing a fact – and that's no good.

Wording and formatting

be concise ...

... that Wikipedia editors wrote over 40,000 words arguing over a single letter?

... not wordy

... that in 2013, editors of the English Wikipedia had an argument on whether to capitalize the 'i' in Star Trek Into Darkness?

and avoid jargon

... that Wikipedia editors wrote over 40,000 words arguing over a single letter on a talk page before reaching a consensus?

Once I have my answer to question 1, I find it a lot easier to give a hook a solid copyedit (rather than just looking for rote formatting errors), since knowing what's important about a hook lets you focus on making that part stand out.[f] Prep set builders often have a lot more leeway than reviewers to modify hooks unilaterally (subject to review and reversion at DYK's noticeboard, of course), so it's important for them to have a strong understanding of how hooks should read on the Main Page and copyedit accordingly. Getting that understanding just takes practice: trust your gut, do it over and over again, and you'll eventually develop a good style.

In general, if it takes me a couple read-throughs to figure out what's going on in a hook, that's a sign that it might be too complex or have more details than necessary. As a reviewer or promoter, I'm spending a lot more time reading and rereading the hook than DYK is going to get out of the average reader who likely merely stumbles upon a hook while glancing at Wikipedia's Main Page. If we've got their attention for one readthrough, we need to make it clear pretty quickly why they might want to stick around. If they get confused, or lost, they're gonna flit elsewhere.

As part J11 of DYK's supplementary guidelines states, nominators often feel that they have to give a lot more context than they do to get the point across. When copyediting a hook, I try to think about how to narrow down a hook to its essentials when I find it to be bloated. Similarly, if jargon or other unclear terms can be rephrased in a way that makes the hook easier to read, doing so probably increases reader retention. If you read a hook out loud to yourself, and you find yourself unnecessarily tripping over the words, that might be a good sign that it could use a copyedit. One place to start might be cutting down on the number of pauses and commas in a hook by figuring out how to smooth out the language, or remove unnecessary details.

Precedent

Pageview precedent

I think it's worthwhile to study the performance of past hooks. Since 2015, we've had the ability to peer into the collective minds of our readers and find out what they respond to and what they don't. More recently, automation tools have made the data widely accessible to anyone looking to analyze and draw conclusions from the mountains of data we've collected.[g] Pageviews are a potent way for DYK regulars to come together and reassess the best way to get readers to pay attention to the fantastic new content our editors are producing every day. If you're a data junkie, like I am, I highly recommend going through the stats pages and taking notes about what kinds of hooks stand out to you, for better or for worse.[h]

Pageview stats come with a huge asterisk attached, though – not every hook that performs well should be emulated, and not every hook that performs poorly should be trashed. How well a hook does on the Main Page isn't just about the hook's quality of writing; it's also about its topic, prior fame, placement in a set, time of day, and so on. Any analysis of a hook should be performed holistically, and not just numerically; pageviews are only one of the lenses in the toolbox, and, while they are an important one, consistently prioritizing pageviews over sound editorial judgement leads us to a bad place. Exercise caution before using pageviews as an argument for or against running a hook, and remember to consider other factors.

That being said, our goal is to showcase new content, and we can't do that if our readers aren't interested. It's a balancing act.

Reviewer precedent

I personally believe that our readers come first, and that we write and rewrite this section of the Main Page primarily for their education and edification, rather than our own. But when I find that I can't get behind the judgement of past reviewers, that's definitely a cause for me to step back and reassess. Sometimes, I come out of that reassessment without having changed my mind, but at least I thought about it. Reviewing why I might approach a hook the way I do, why others approach it differently, and whether I think it's time for a paradigm shift are all good ways to examine my own biases against DYK's consensus and all this makes for useful criteria when I'm thinking about whether to object to a hook.

One reason for pushing against the current meta might be that reviewers have grown too accustomed to an old format that is familiar, rather than quality; another might be that the majority of reviewers don't necessarily want to put their weight behind what might be a tense discussion about whether a hook is interesting. Telling someone that their hook isn't working is never fun, especially when they're pushing back against a nominator that doesn't agree. One reason to simply not comment or go with the flow might be that the disagreement you're holding isn't worth challenging what might be a long-accepted notion, which is just getting yourself an uphill battle that might be for naught. Such are the things to consider when you add your review, promotion, or queue move to the piles of precedents that shape DYK's future hooks.

Topic bias and noteworthiness

sexier topics sell ...

... that it's pronounced "gif", not "gif"?

... sometimes without great hooks

... that Darth Vader's anal shield has a "pronounced bell shape"?[i]

and non-obvious topics can be worth the drop

... that Julie Schmit-Albin was awarded the title of Admiral in the Great Navy of the State of Nebraska for her work as a pro-life activist?

Unfortunately, even the best-worded and hookiest hooks sometimes fall prey to the demographic and cultural biases of Wikipedia's reader base. Our readers are more likely to click on hooks that address sex, violence, high-profile American politics, internet culture silliness, wars, spats, cute animals, oppression, and vulgarity – things you might find in a tabloid, a true crime drama, or "Ripley's Believe It or Not!". They're less likely to enjoy the meat and potatoes of DYK, which often amount to obscure people, places, and things that might have good stories, but often don't immediately relate to experiences and cultures the reader is familiar with through mass media.[j] That's one of the biases that affects a DYK hook's pageview counts, and dealing with it appropriately can be quite the challenge.

When I'm at this step, it's likely that I think the hook passes the big three questions that define how intriguing a hook is for me – so instead, I think about what the reader might walk away with, and whether that's information that's worth imparting to the thousands of people who will see the article that day. So, if I'm looking at a hook that I think is well-worded and well-structured, but lacks the familiarity a reader might want, I usually try to push it through anyway. It's good to broaden our readers' horizons, even if they have to work a little harder to understand the story. Inversely, if I'm looking at a hook that I'm sure will perform well with our readership, but I also doubt that reading the context of the hook or even reading the article in full will be worthwhile, I often object to the hook being aired on the Main Page. Like I said earlier, I'm a fervent believer in the idea that DYK should act as a curator, and not simply be a megaphone. That is, of course, a subjective judgement; I try to consider how the article might deepen their understanding of the world around them, even if it's in a small way. Usually, there's something worth saying.

Notes

  a. ^ Well, almost. The article on Puccinia porri is also the work of Esculenta, and their work is definitely fantastic – I wrote the boring hook, though, that's on me.
  b. ^ Unless you've got a truly absurd "quirky", in which case it's just being hooky, and possibly part of our April Fools' roundup. Usually, though, it's informative.
  c. ^ This is what you'll use to hone in on the punchier parts of the hook, should it become too long or unwieldy.
  d. ^ There are lots of literary devices that might help a hook clear this hurdle. A hook might leave a part of the story untold, a gap the reader wants filled; it might raise two seemingly contradictory points, a discrepancy the reader wants resolved; it might just promise a fun read. It might do many of these, or more, at the same time!
  e. ^ It goes without saying – at least, in the main text of this essay – that an article should deliver the goods if it's promised them in the hook.
  f. ^ The granular details of a hook are, admittedly, what stick out first when I'm reading it, but that's just a symptom of my reading style and not what's most important about a hook.
  g. ^ GalliumBot's vandyke protocol, which updates the stats pages, was developed by yours truly :)
  h. ^ Since not all hooks appear on the Main Page for the same amount of time, I'd recommend that you assess hooks by the number of views they receive adjusted for runtime (the stats pages use views per hour on the Main Page, or vph), rather than in total. It's also worth keeping in mind that DYK has increased in popularity over time, so a score of 175 vph is worth much more in 2016 than it is today.
  i. ^ Admittedly, I am kinda cheating by using an April Fools' hook, but it's empty all the same.
  j. ^ I'll note that, contrary to popular belief, pop music singles often fall in the latter category, not the former.
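The runtime adjustment recommended in the notes (views per hour on the Main Page, or vph) amounts to a simple division. The Python sketch below illustrates the idea only; the function name and the sample figures are invented for the example, and the stats pages may compute vph with refinements not shown here.

```python
def views_per_hour(views: int, hours_on_main_page: float) -> float:
    """Normalise a hook's pageviews by how long it ran (hypothetical helper)."""
    if hours_on_main_page <= 0:
        raise ValueError("runtime must be a positive number of hours")
    return views / hours_on_main_page


# Invented figures: the shorter run has fewer total views but a higher rate.
half_day_hook = views_per_hour(3000, 12)
full_day_hook = views_per_hour(4800, 24)
```

Comparing rates rather than totals keeps a hook that ran for half a day from looking weaker than one that ran for a full day simply because it had less time to collect views.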




Tips and Tricks is a general editing advice column written by experienced editors. If you have suggestions for a topic, or want to submit your own advice, follow these links and let us know (or comment below)!




2023-02-20

Eden, lost.

Expulsion from the Garden of Eden by Thomas Cole, one of our newest featured pictures

This Signpost "Featured content" report covers material promoted from the last half of January (16-31). Quotes are generally from the articles, but may be abridged or simplified for length.

Well, it's been a busy period for me. Finally saw my dad again after three years of COVID, and, while I was there, had my house broken into, but my neighbours looked out for me and nothing was taken. I'm finally starting to relax, get things together, check the Signpost schedule and see we're publishing 9 days after the last issue. ...Sure. Can work with that. Kinda. There are, at least, far fewer featured articles than last issue. I've borrowed two article descriptions from The Bugle to make it a little easier to summarise complex battles. (And then we delayed publication for nearly a week. Well, you get what you get.)

Adam

Featured articles

Eight featured articles were promoted this period.

The Battle of Lake Trasimene by Joseph-Noël Sylvestre
Battle of Lake Trasimene, nominated by Gog the Mild
Another in Gog's series on the Punic Wars, his succinct description being "Hannibal arrives in Roman Italy and inflicts 100% casualties on a Roman army in a single day. A defeat simply begging for an adjective." It took place when Hannibal's Carthaginians ambushed the Romans under Gaius Flaminius in June 217 BC, on the north shore of Lake Trasimene, during the Second Punic War. With the Carthaginians attacking unexpectedly from the flank and the rear, there was no chance for the Romans to form even a rudimentary fighting line and they were defeated after three hours' hard fighting. The trap failed to close on the 6,000 Romans at the front of the column, who escaped; later that day they were surrounded by pursuing Carthaginians and surrendered. Thus all 25,000 Romans in Flaminius's army were killed or captured. This destruction of an entire army as a result of such an ambush is widely considered a unique occurrence. For good measure, a few days later the Carthaginians wiped out the entire cavalry force of a second Roman army.
1937–38 Gillingham F.C. season, nominated by ChrisTheDude
During the 1937–38 English football season, Gillingham F.C. competed in the Football League Third Division South, the third tier of the English football league system. The team won only three times in nineteen Football League matches between August and December; in November and December they played six league games and lost every one without scoring a goal, leaving them bottom of the division at the end of 1937. Although Gillingham's performances improved in the second half of the season, with seven wins between January and May, they remained in last place at the end of the season, meaning that the club was required to apply for re-election to the League. The application was rejected, and as a result the club lost its place in the Football League and joined the regional Southern League.
High School Musical: The Musical: The Series, nominated by SatDis
High School Musical: The Musical: The Series is an American mockumentary musical drama television series created for Disney+ by Tim Federle, inspired by the High School Musical film series. Set at a fictionalized version of East High School, the school at which the original movies were filmed, the first season follows a group of teenage theater enthusiasts who participate in a staging of High School Musical: The Musical as their school production.
Simonie Michael, nominated by Astrophobe
Simonie Michael (Inuktitut: ᓴᐃᒨᓂ ᒪᐃᑯᓪ; 1933 – November 15, 2008) was a Canadian politician from the eastern Northwest Territories (later Nunavut) who was the first Inuk elected to a legislature in Canada. Before becoming involved in politics, Michael worked as a carpenter and business owner, and was one of very few translators between Inuktitut and English. He became a prominent member of the Inuit co-operative housing movement and a community activist in Iqaluit, and was appointed to a series of governing bodies, including the precursor to the Iqaluit City Council. After becoming the first elected Inuk member of the Northwest Territories Legislative Council in 1966, Michael worked on infrastructural and public health initiatives. He is credited with bringing public attention to the dehumanizing effects of the disc number system that was used in place of surnames for Inuit, and with prompting the government to authorise Project Surname to replace the numbers with names.
Constantine (son of Basil I), nominated by Iazyges
Constantine (Greek: Κωνσταντῖνος; born between 855 and c. 865, died 3 September 879) was a junior Byzantine emperor, alongside Basil I as the senior emperor, from January 868 to 3 September 879. Constantine was made co-emperor by Basil in c. January 868. Constantine was the intended heir of Basil and as such received much attention from him and accompanied him on military campaigns, including one in Syria, for which he shared a triumph. In comparison, his younger brother, Leo VI (r. 886–912), was made co-emperor merely to secure the imperial lineage and bolster legitimacy. However, Constantine died of fever on 3 September 879, before his father. After his death, Leo became the primary heir, and another brother, Alexander (r. 912–913), was raised to co-emperor.
2022 World Snooker Championship, nominated by HurricaneHiggins and Lee Vilenski
The 2022 World Snooker Championship was a professional snooker tournament that took place from 16 April to 2 May 2022 at the Crucible Theatre in Sheffield, England, the 46th consecutive year the World Snooker Championship was held at the venue. The 16th and final ranking event of the 2021–22 snooker season, the tournament was organised by the World Snooker Tour and sponsored by sports betting company Betfred.
When Megan Went Away, nominated by Bobamnertiopsis (a.k.a. Colin)
When Megan Went Away is a 1979 children's picture book written by Jane Severance and illustrated by Tea Schook. It is the first picture book to include any LGBT characters, and specifically the first to feature lesbian characters, a distinction sometimes erroneously bestowed upon Lesléa Newman's Heather Has Two Mommies (1989). The book, published by the independent press Lollipop Power, depicts a child named Shannon dealing with the separation of her mother and her mother's partner, Megan.
As a lesbian working in a feminist bookstore in Denver in her early twenties, Severance sought to rectify the lack of picture book content for children with lesbian parents. When Megan Went Away was not widely distributed upon publication, although the text of the story was republished by the magazine Ms. in 1986 under the pen name R. Minta Day. The work proved divisive among critics, some praising the story for being an anti-sexist example of lesbian life and others finding its depiction of same-sex separation poorly timed, arriving at a moment when lesbian motherhood was on the rise. Copies of When Megan Went Away were primarily accessible in archives and library special collections as of the 2010s.
Great Gold Robbery, nominated by SchroCat
The Great Gold Robbery took place on the night of 15 May 1855, when a routine shipment of three boxes of gold bullion and coins was stolen from the guard's van of the service between London Bridge station and Folkestone while it was being shipped to Paris. The robbers comprised four men, two of whom—William Tester and James Burgess—were employees of the South Eastern Railway (SER), the company that ran the rail service. They were joined by the planners of the crime: Edward Agar, a career criminal, and William Pierce, a former employee of the SER who had been dismissed for being a gambler.
During transit, the gold was held in "railway safes", which needed two keys to open. The men took wax impressions of the keys and made their own copies. When they knew a shipment was taking place, Tester ensured Burgess was on guard duty, and Agar hid in the guard's van. They emptied the safes of 224 pounds (102 kg) of gold, valued at the time at £12,000 (approximately equivalent to £1,416,000 in 2023), then Pierce and Agar left the train at Dover. The police and railway authorities had no clues as to who had undertaken the theft, and arguments ensued as to whether it had been stolen in England, on the ship crossing the English Channel, or on the French leg of the journey.
When Agar was arrested for another crime, he asked Pierce to provide Fanny Kay—his former girlfriend—and child with funds. Pierce agreed and then reneged. In need of money, Kay went to the governor of Newgate Prison and told him who had undertaken the theft. Agar was questioned, admitted his guilt and testified as a witness. Pierce, Tester and Burgess were all arrested, tried and found guilty of the theft. Pierce received a sentence of two years' hard labour in England; Tester and Burgess were sentenced to penal transportation for 14 years.

Featured lists

Nine featured lists were promoted this period.

Zayn Malik
List of awards and nominations received by Zayn Malik, nominated by Harushiga
English singer Zayn Malik has garnered numerous awards and nominations throughout his career, which began in 2010 through the formation of the boy band One Direction. Malik received two Pop Awards at the BMI London Awards for co-writing the band's songs "Story of My Life" and "Night Changes". He left the group in 2015 and signed a solo record deal with RCA Records. He began his solo career in 2016, which led to many more successes.
List of lilioid families, nominated by Dank
The lilioid monocots are a group of 33 interrelated families of flowering plants. They generally have tepals (indistinguishable petals and sepals) similar to those on the true lilies (Lilium). Like other monocots, they usually have a single embryonic leaf (cotyledon) in their seeds, leaves with parallel veins, scattered vascular systems, flower parts in multiples of three, and roots that can develop in more than one place along the stems.
List of Los Angeles Chargers starting quarterbacks, nominated by Harper J. Cole
The Los Angeles Chargers are an American football franchise who play in the National Football League (NFL). They began play in Los Angeles in 1960 as charter members of the American Football League (AFL), switched cities to San Diego the following season, and returned to Los Angeles in 2017. The AFL was formed as rivals to the established NFL, though the leagues would later merge, with all AFL teams including the Chargers officially joining the NFL in 1970.
List of World Heritage Sites in Sri Lanka, nominated by Tone
We've talked about World Heritage sites a lot, because a lot of work has gone into them, largely thanks to Tone. As of 2022, Sri Lanka has eight sites on the list. The first three sites, the Ancient City of Polonnaruwa, the Ancient City of Sigiriya, and the Sacred City of Anuradhapura, were listed in 1982. The most recent site, the Central Highlands of Sri Lanka, was listed in 2010. The Central Highlands and the Sinharaja Forest Reserve are natural sites; the other six are cultural. In addition, Sri Lanka has three sites on its tentative list.
List of Hot R&B Sides number ones of 1959, nominated by ChrisTheDude
Another of our series on the Billboard record charts, featuring the most popular music in various genres for each year.
Paul Rudd on screen and stage, nominated by Lady Lotus
Paul Rudd is an American actor, comedian, writer, and producer. His career began in 1992 when he played a recurring role in the television series Sisters until 1995. In 1995, he made his film debut opposite Alicia Silverstone in the cult film Clueless, and starred as Tommy Doyle in Halloween: The Curse of Michael Myers. Rudd has frequently collaborated with filmmaker Judd Apatow in such comedy films as Anchorman: The Legend of Ron Burgundy (2004), The 40-Year-Old Virgin (2005), Knocked Up (2007), Forgetting Sarah Marshall (2008), This Is 40 (2012), and Anchorman 2: The Legend Continues (2013). Since 2015, he has played Scott Lang / Ant-Man in the Marvel Cinematic Universe, appearing in Ant-Man (2015), Captain America: Civil War (2016), Ant-Man and the Wasp (2018), and Avengers: Endgame (2019). He has also starred in the supernatural comedy film Ghostbusters: Afterlife (2021).
Lamium album. Lamium originates from the Greek for "wide-open mouth" (on the flowers).
List of lamiid families, nominated by Dank
The lamiids are a group of about 40 interrelated families of flowering plants. They include about 40,000 species, representing one-seventh of the total diversity of flowering plants, and about half of all asterid species. Like most asterids, they tend to have petals that are fused with each other and with the bases of the stamens, and just one integument (covering) around the embryo sac. In lamiids, the fusion of the petals tends to occur late in their development, and the bases of the petals are usually beneath the ovaries.
List of bibliographies of works on Catullus, nominated by Umimmak
Gaius Valerius Catullus (c. 84 – c. 54 BCE) was a Latin poet and a leading figure of the Neoterics. Catullus and his poetry, comprising 113 poems, have been the subjects of many books and papers in classical studies and other fields, including literary criticism, gender studies, and cultural studies; there are many critical editions, commentaries, translations and student guides of his poetry as well. Even in 1890, Max Bonnet wrote that Catullus was "inundated" with academic publications concerning his life and works.
List of National Football League annual receptions leaders, nominated by Newtothisedit
In American football, passing, along with running (also referred to as rushing), is one of the two main methods of advancing the ball down the field. Passes are typically attempted by the quarterback, but any offensive player can attempt a pass provided they are behind the line of scrimmage. To qualify as a passing play, the ball must have initially moved forward after leaving the hands of the passer; if the ball initially moved laterally or backwards, the play would instead be considered a running play.
The Finding of the Saviour in the Temple by William Holman Hunt, another of this fortnight-and-a-bit's featured pictures




2023-02-20

Love is in the air




2023-02-20

Superbowl? Pfft. Give me some Bollywood! Yours sincerely, the world

This traffic report is adapted from the Top 25 Report, prepared with commentary by Igordebraga, Benmite, Max BuddyRoo, and SSSB.

Come, Mr. DJ, song pon de replay (February 12 to 18)

Rank Article Class Views Image Notes/about
1 Rihanna 4,527,454 After five years, the greatest thing to ever come out of Barbados returned to the stage in a 13-minute concert in Glendale, Arizona. Sure, sometimes she was taking advantage of the playback to not actually sing. But between the sequence of hits (plus some Kanye West covers) and that impeccable stage presence, with the only special guest being the one in her womb (only 9 months after she gave birth!), the world's reaction to Rihanna was "I just can't refuse it, like the way you do this, keep on rockin' to it, please don't stop the, please don't stop the music!"
2 Raquel Welch 2,729,570 An icon of beauty, eternized in a best-selling poster wearing the fur bikini off One Million Years B.C. (you know the image, it's even been used in a prison escape), who has passed away at 82.
3 Valentine's Day 2,707,459 The holiday for the lovers. That this year could be celebrated like it was 1998, given Titanic was back in theaters!
4 Patrick Mahomes 2,318,975 Ah yes, before and after Riri's concert there were some guys passing around an oval thing. This guy led the team who won the game, the Kansas City Chiefs.
5 Ant-Man and the Wasp: Quantumania 2,030,180 Marvel Cinematic Universe: Phase Five has begun with size shifters Scott Lang and Hope Van Dyne fighting Kang the Conqueror, bound to become the newest big bad that requires the Avengers to defeat. And it seems like again every new MCU movie will inspire rifts among the fanbase, with a mixed response mostly finding that a microscopic epic was not as fun as the character's preceding solo outings that focused on the comedy. Still, expect big box office numbers and a few more appearances on this list.
6 ASAP Rocky 1,532,735 #1's 'surprise guest' turned out to be another baby, and it's this rapper's fault.
7 ChatGPT 1,464,968 Wonder if I should make an account for this still very discussed chatbot, if only to see what it would write on the subjects that enter the Report? (for certain it would not try to pass music lyrics as commentary)
8 Pathaan (film) 1,455,834 India is still raving about this Bollywood thriller, now the country's fifth highest-grossing movie ever.
9 The Last of Us (TV series) 1,398,207 Knowing it was hard to compete with #1 (OK, technically #4, but more people wanted to see her), HBO kept the Sunday broadcast of this post-apocalyptic video game adaptation while releasing the episode two days earlier on streaming.
10 Travis Kelce 1,275,119 One of #4's teammates, who defeated his older brother in the big game.

Living together, growing together, just being together (February 5 to 11)

Rank Article Class Views Image Notes/about
1 Pathaan (film) 2,115,794 Along with making all the crore, Bollywood's biggest hit of the year has also managed to keep the top of this list for three straight weeks. A fourth seems improbable only because Super Bowl LVII is right around the corner.
2 ChatGPT 1,871,580 No big new developments when it comes to this all-in-one chatbot that can do just about everything but become human – but let's not jinx it. Still, it's generated more chatter about it than even it could produce, hence its steadily rising position on this list.
3 The Last of Us (TV series) 1,658,357 HBO's TV adaptation of what is commonly believed to be one of the greatest, if not the greatest, video games of all time is now on its way to being hailed as one of the greatest series of all time if IMDb ratings are to be believed. The week had two episodes because HBO chose to release one early, on Friday, rather than take its chances against the Super Bowl. I haven't gotten around to watching it myself, but it has a bittersweet gay romance and Anna Torv – two of my favorite things – so I should probably get on that.
4 Kiara Advani 1,538,943 This popular Indian actress, who's starred in two movies on the list of highest-grossing Indian films, got married to #9 this week.
5 2023 Turkey–Syria earthquake 1,492,216 Syria has already been devastated by over a decade of civil war, and things were worsened once neighboring Turkey was struck by the deadliest earthquake since the 2010 Haiti earthquake, leaving widespread damage and tens of thousands of casualties in both countries, not helped by a winter storm covering the rubble in snow.
6 Pedro Pascal 1,284,539 Stars in #3 as Joel, the protagonist of both the series and its source material. And also hosted Saturday Night Live.
7 Burt Bacharach 1,272,831 Bacharach's might be a name you don’t recognize if you’re below a certain age, but rest assured that even if you haven’t heard of him, you’ve heard him – that is, you’ve heard one of the countless hit songs he wrote from the ’60s all the way to the ’90s that were performed countless times by countless artists before his death this week at age 94.
8 Kim Petras 1,241,199 Pop singer Petras made what I suppose you could call history at the Grammys if you consider Best Pop Duo Performance to be the most essential award of the night. Petras became the first trans person to win that award, but not, as many online might deceive you into believing, the first trans person or even trans woman to win a Grammy. That honor belongs to electronic music pioneer Wendy Carlos, who won three Grammys all the way back in 1970 for her debut album Switched-On Bach.

Petras may not have the same creds as Carlos, but she made sure to thank the trans women who came before her to make her win possible. Absent from her victory speech, though, was an acknowledgment of her continued work with and fervent support of Dr. Luke, the 2010s hitmaker accused of sexually assaulting fellow pop star Kesha. Talk about unholy.

9 Sidharth Malhotra 1,105,868 Got married to #4, who he met while filming Shershaah in 2021 – where, in a case of life imitating art, they were playing lovers.
10 Deaths in 2023 972,921 Here's one of #7's songs you might've heard (even Spider-Man likes that one):
But there's one thing I know
The blues they send to meet me
Won't defeat me, it won't be long
Till happiness steps up to greet me

God I love the sweet taste of India (January 29 to February 4)

Rank Article Class Views Image Notes/about
1 Pathaan (film) 3,980,500 This Bollywood film continues to make waves in India as it's already one of the highest grossing Indian films of all time, bringing this and #10 into the Top 25.
2 The Last of Us (TV series) 1,983,197 The HBO show based on #7 continues to make the top 2 in the report. And this week's episode revealed Ron Swanson was one of the survivors of the apocalypse!
3 ChatGPT 1,788,228 The chatbot AI is on the list again, with an epidemic of kids using it to cheat on their schoolwork. Recently, someone made an app to detect GPT plagiarism, so we will see where it ends up in the coming weeks. Some other popular uses for it are a site that mimics characters (and making them gain sentience), and (failed) attempts to pass the bar exam.
4 Royal Rumble (2023) 1,773,416 The WWE PPV event made the report again this week, with positive reception, and a YouTuber involved in a crypto scam.
5 Annie Wersching 1,694,774 An actress best known for TV work like 24, Bosch and Runaways (and who was also part of #7's cast), who died at 45 of cancer.
6 Pamela Anderson 1,293,055 Last year Hulu's Pam & Tommy dramatized how this sex symbol's sex tape was leaked. Now Netflix has released Pamela: A Love Story, a documentary in which Pam recalls her life story, serving as a companion piece to her autobiography Love, Pamela.
7 The Last of Us 1,037,352 #2's source material, originally released in 2013 for the PlayStation 3.
8 Deaths in 2023 1,024,280 We were caught up and lost in all of our vices
In your pose as the dust settled around us...
9 Cindy Williams 990,960 Laverne & Shirley are reunited in the afterlife, as five years after Penny Marshall, death came for Cindy Williams, who before Happy Days and that spin-off appeared in the movies American Graffiti and The Conversation.
10 List of highest-grossing Indian films 979,151 While the global list has Avatar: The Way of Water entering the top 5, this country-specific one sees #1 already in the top 10.

Exclusions

  • These lists exclude the Wikipedia main page, non-article pages (such as redlinks), and anomalous entries (such as DDoS attacks or likely automated views). Since mobile view data became available to the Report in October 2014, we exclude articles that have almost no mobile views (5–6% or less) or almost all mobile views (94–95% or more) because they are very likely to be automated views based on our experience and research of the issue. Please feel free to discuss any removal on the Top 25 Report talk page if you wish.




2023-02-20

5, 10, and 15 years ago: Let's (not) delete the Main Page!

Going to do this one in reverse from how we normally do it, because, frankly, there wasn't that much going on five years ago, and a whole lot to say about fifteen years ago.

Fifteen years ago: February 2008

Citizendium, Larry Sanger's first attempt at a Wikipedia rival (he would go on to make many more), was doing fairly well at the time, but mismanagement would eventually kill its momentum. Meanwhile, in the wake of various incidents that caused chaos, it was made impossible to delete the Main Page of Wikipedia. Don't test this out. Seriously, don't. Though you might get a special barnstar if you do:

The Destroyer of the Wiki Barnstar
OH MY GOD WHAT DID YOU DO THERE'S MONKEYS AND FISH EVERYWHERE SOMEONE HELP

Finally, on a more serious note, this was also the point where controversy over Wikipedia hosting images of Muhammad reignited. There had been some controversy in 2006 when the Jyllands-Posten Muhammad cartoons controversy hit the main page, but this was the start of demands that all such cartoons be removed. In this issue's Cobwebs, we finally publish an article from 2010 about the point where we lost the moral high ground on such issues.

Ten years ago: February 2013

Lovers of regular Signpost segment the Traffic Report will find its presumptive origins in a Special Report in the 4 February 2013 issue that analysed some of the interesting trends and spikes. The film Star Trek[:] [I/i]nto Darkness caused pretty much the expected amount of controversy as to how it should be capitalised. A list of Wikipedia hoaxes provides all the fun reading one would expect.

Looking in more detail, portals were still a big deal back then; they're fairly moribund now: Featured portals would close in 2017, and there was an attempted purge in roughly 2019. Why? It's hard to say. Perhaps as Wikipedia's search tools improved, portals ceased to be a key way to find articles; perhaps it has to do with the general decline of WikiProjects. Portal:Opera is still doing great as a way to show off content in WikiProject Opera, but there are certainly a lot fewer WikiProjects than we once had. I still remember when, if you wanted help on basically any subject, you'd just find the relevant WikiProject and ask them.

Five years ago: February 2018

We'd be remiss not to include this wonderful illustration by Jon Robson from the Humour section:

We also had a short (but pretty decent) interview with The Rambling Man, who, at least at the time, was Wikipedia's top contributor of Featured Lists. A sample:




2023-02-20

Editorial: The loss of the moral high ground

This article was originally meant to be published in the 26 July 2010 issue, but got pulled due to fears of controversy. In this new feature, we're going to republish some of the more interesting of the Signpost's lost articles. The views in this article are solely those of the author, and do not necessarily reflect those of the Wikipedia Signpost or the Wikimedia Foundation as a whole.
The image of Muhammad singled out in the 2008 petition. 17th-century copy of a 14th-century original. Deemed acceptable for Wikipedia.

As covered in last week's Signpost, the Acehnese Wikipedia has erupted in controversy over images of Muhammad hosted on Commons.

This is by no means a new debate. In 2006, the article on the Jyllands-Posten Muhammad cartoons controversy was featured on the main page, and caused significant controversy for including the cartoons. In 2008, a petition to delete images of Muhammad circulated. However, as long-standing policy stated, like Wikipedia itself, our image host, Wikimedia Commons, is not censored. That it could cause offense was not sufficient to remove an image.

Amongst those advocating for this view was Jimbo Wales. He cited his free speech advocacy, and stated that "we can not deviate from our goals to accommodate [those governments who would force Wikipedia to be censored]."

However, the current petition comes after we have lost some of the moral authority we once had.

An artwork personally deleted by Jimbo Wales: An art deco image of lesbians by noted illustrator Franz von Bayros.

Jimbo Wales, out of fear of a media attack, led by Fox News, about allegations of pornography on Wikipedia, instituted a massive deletion of content from Commons, including many examples of artwork by notable artists. For example, the artwork to the right, by Franz von Bayros, was personally deleted by Jimbo, who even edit warred to keep it deleted. On being challenged about this deletion, Wales wrote:


He also deleted several works by major artists, for example, File:Félicien_Rops_-_Sainte-Thérèse.png by Félicien Rops, and numerous line art illustrations used to illustrate articles on sexual content.

Only long after these deletions were done did he state his reasons:


It wasn't even effective: Fox News shortly thereafter posted an article attacking Wikipedia.

If Wikipedia is going to sacrifice its moral high ground and neutrality – for saying that things offensive to Fox News are worth mass deletion sprees including historic artwork, but that the complaints of Muslims are not, is highly non-neutral – we should not sell ourselves cheap. We recently did, and only the effective loss of all Jimbo Wales' powers over his actions leaves us any moral high ground at all. This petition is much harder to deal with, as we have shown that, yes, we will give in to pressure – but only if it comes from our mainstream Western culture.



Looking back at this from 2023, it's hard to say what was learned: after Jimbo gave up most of his founder privileges, the matter kind of just... died out. Most of the images deleted were restored, though, unfortunately, deleting images means they get removed from articles, so whether all the article usages ever got dealt with is very unclear. Later in 2010, we reported, quoting a dead link that appears to be the same as or very similar to this article in The Sydney Morning Herald:
I guess everyone just decided we should pretend all of this never happened?
I honestly think these events were important to Wikipedia's history, though: This was when WP:NOTCENSORED got tested. Jimbo even edit-warred to try to keep images deleted. Had he won out, years of saying that images of Muhammad shouldn't be censored would have blown up in our faces. Jimbo made us lose the high ground, but the hundreds of people who fought against him regained it.



2023-02-20

The RfA Candidate's Song

Back by popular demand: this little ditty, taken from Wikipedia:Songs about Wikipedia/The RfA Candidate's Song, was recorded with considerable panache and musicianship back in 2006 – which means there's probably a whole new generation of editors who've never heard it. Enjoy! Lyrics below.


RfA: A self-nomination (parody) (sung to the tune of the Major-General's Song)


I am the very model of a modern Wikipedian:
I've knowledge of Greek drama, both comedic and tragedian;
I know historic battlefields from Ancient Rome to World War Two,
and neutral is the mode in which I always keep my point of view!
I'm very well acquainted, too, with Polynese ethnology.
I understand Samoan lore and Mā-ori mythology.
From mountain peaks of Suriname to passes of the Great Divide,
I see to it that everything is quite correctly wikified!
He sees to it that everything is quite correctly wikified!
He sees to it that everything is quite correctly wikified!
He sees to it that everything is quite correctly wiki-wikified!
Some people think my entries are little more than trivia —
that episode of Barney or that woodwind of Bolivia
and yet, in matters great and small, high and low and median,
I am the very model of a modern Wikipedian.
In short, in matters great and small, high and low and median,
He is the very model of a modern Wikipedian!
Black and white picture of a general with a mop
I must become a wiki admin-candidate
I SIGN all of my comments; and give summaries for each edit;
I never BITE; I never POINT; I follow wiki etiquette;
I assume good faith when wiki-flamed, and wear a thickened wiki-skin.
For I am the very model of a proper wiki-citizen:
I'm often found at village pumps discussing rules and policy,
and judging notability while over at the AfD,
In matters wiki-sophic, I'm no exopede inclusionist,
but an incremento-eventuo-darwikian-delusionist!
an incremento-eventuo-darwikian-delusionist!
an incremento-eventuo-darwikian-delusionist!
an incremento-eventuo-darwikian-delusion-usionist!
I am the very model of a modern Wikipedian.
I wrote this wiki-parody to ease some wiki-tedium.
When dealing with my colleagues I'm a friendly wiki-editor,
but the sight of trollishness makes me a vandal-stalking predator;
When dealing with his colleagues he's a friendly wiki-editor,
but when provoked he can become a vandal-stalking predator!
Reverts and warnings I have made while on the recent change patrol,
I play some cat-and-mouse with them, but now I want to block-a-mole,
and so at last I must become a wiki admin-candidate:
since I'm my only fan, I have no choice but to self-nominate.
As prep for this I've learned about all bans and blocks and redirects,
now I can tell you anything on SEMIs and full page protects,
and when you hear all that I know of Wikipede sysop-ery,
you’ll say no better candidate has ever sought the moppery
you’ll say no better candidate has ever sought the moppery.
you’ll say no better candidate has ever sought the moppery.
you’ll say no better candidate has ever sought the moppa-moppery.
Now the adminship, as Jimbo says, is really no big deal at all:
many fail and so might I (my role in portal talk's been small).
But still in matters great and small, high and low and median
I am the very model of a modern Wikipedian.
In fact, in matters great and small, high and low and median,
He is the very model of a modern Wikipedian!





The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0