As reported here in July, India's Asian News International (ANI) has brought the Wikimedia Foundation to court. ANI alleges that the English Wikipedia article about it contained defamatory content: at the time, the article stated that ANI had "been accused of having served as a propaganda tool for the incumbent central government, distributing materials from a vast network of fake news websites, and misreporting events on multiple occasions". The Foundation is now being compelled by the Indian court to reveal personal information of some editors who have edited the article, according to Livemint (report) and The Hindu (report). The next hearing will be on 25 October 2024. Wikipedia's internal consensus on ANI's suitability as a citable source for articles (as in this 2021 discussion and WP:RSPANI) generally holds it to be somewhere between marginally reliable and generally unreliable for general reporting, advises in-text attribution for potentially contentious claims, and considers it generally unreliable in its coverage of domestic and international politics (and other topics that the government of India has a stake in). – rs
In European courts, on the other hand, things have been going better for the Wikimedia Foundation lately. In the UK, British-born Swiss lawyer Matthew Parish sued the WMF for libel, because the article about him (correctly) noted his legal issues; however, the case has been dismissed by High Court judge Karen Steyn.
And as highlighted by Techdirt ("The Wikimedia Foundation Successfully Sees Off Another SLAPP Suit, But More Protection Is Needed Globally"), WMF recently reported another legal victory in Germany ("Wikimedia Foundation defeats gambling magnate’s lawsuit in Germany"). The Foundation characterized it as having "had all the hallmarks of an illegitimate 'SLAPP' lawsuit: a strategic lawsuit against public participation. SLAPPs are lawsuits designed to force organizations and individuals to remain silent on legitimate matters of public interest. [...] The [German] Wikipedia article in question names Mladen Pavlovic as one of three co-founders of Tipico, a major European gambling company headquartered in Malta." According to the WMF, this was well-sourced public information. Yet "Pavlovic engaged a reputable German law firm to threaten the Foundation with legal action unless we agreed to censor the Wikipedia article. After consulting members of the German Wikipedia community, we refused the lawyers’ demand." Pavlovic then filed a lawsuit which "was especially intensive for our team because of the unusual number of legal briefs to which we were asked to reply. [...] These usually repeated earlier arguments, and introduced—in our opinion—increasingly thin or irrelevant new ones." As summarized by Techdirt, "that approach seems a conscious attempt to deplete Wikimedia’s limited financial resources [for legal defense], increasingly under strain" from what the WMF blog post describes as a changing legal environment:
The Foundation’s legal team now also has to deal with a wave of new and very demanding “online safety” laws across the world: for example, the EU Digital Services Act (DSA) and the UK Online Safety Act. These conditions force us to be as efficient, creative, and effective as possible, including in lawsuits like this one.
These laws may not have directly affected Pavlovic's lawsuit yet, as it predates the DSA. However, according to WMF, it fits a resource-draining pattern: "The Foundation faces several SLAPP-like cases each year" (citing examples including the still unresolved Caesar DePaço lawsuit in Portugal, see previous Signpost coverage). The Foundation's post ends with a call for anti-SLAPP reform (about which, according to Techdirt, "Some progress has already been made" in the EU and UK). However, it reiterates that insufficient anti-SLAPP protection is only part of the legal challenges affecting Wikimedia projects, briefly noting other concerning developments:
Privacy-infringing laws like France’s data retention law, and emerging online identity requirements, together with laws that give government authorities insufficiently regulated powers to order content takedowns, are also a significant issue.
– H
Wikipedia beat reporter Stephen Harrison, who is best known for his articles in Slate, has recently been busy promoting his debut novel, The Editors, focused on a fictionalized version of the platform (named "Infopendium") that is suddenly caught up in global cyberwarfare during the COVID-19 pandemic — see previous coverage from the Signpost here and here.
Now, though, he has written an article for The Guardian detailing his view on the future of Wikipedia, which is subtitled "The world's most important knowledge platform needs young editors to rescue it from chatbots – and its own tired practices". Harrison says Wikipedia is currently facing an "existential crisis" due to the emergence of AI applications and large language models, which could potentially undermine the platform's visibility. According to Harrison himself, Gen Z editors are the best-equipped to help Wikipedia survive and, possibly, even thrive in this new context: he pointed out a 2022 survey reporting that about 20% of Wikipedia editors were between the ages of 18 and 24, while also noting the role of young contributors in recent debates on the incorporation of chatbot-generated content on the encyclopedia. The article notably includes a short interview with a very prominent Gen Z editor: the latest Wikimedian of the Year, Hannah Clover.
As for those "tired old practices", Harrison has his say about the sometimes inflexible norms and normalizing institutions of Wikipedia, not to mention the mobile-unfriendly editing interface, which he calls "issues that dissuade the younger generation from joining the cause". For instance, he says that the tasks taken on by new editors from a decade ago – ones letting them dip their toes in the editing experience in a low-risk, low-consequence environment – are now more highly automated, leaving a lack of "clear entry points". This in turn may lead today's new editors to unknowingly get into contentious topics where they experience off-putting "harsh feedback" from the more established editors. Harrison left unsaid that there are more contentious topics and areas under sanctions than ever before (see prior Signpost coverage that noted "policies of closure and the formalization of boundaries, rules and routines").
Whether the new generation can adapt to, or reform, the tired Wiki and eventually make it their own as they become the normies, or whether they abandon it for something new, only time can tell. – O, B
Joshua Yaffa in The New Yorker explains (paywalled) the difficulties Marlene Engelhorn had in giving away 25 million euros through the Guter Rat für Rückverteilung (Good Council for Redistribution). Engelhorn had inherited her money from a fortune that started with the founding of BASF and later grew with the Boehringer Mannheim pharmaceutical company. She felt that she should give away most of it to reduce wealth inequality in Austria and as a learning experience to guide others who have the same goal. Engelhorn kept about 10% of her money, and about €3 million was spent on implementing a process where a citizen council – a group of 50 ordinary Austrians selected by lottery – decided where the money should go. This included the use of moderators who "wield huge power" according to an academic who studies this area. They have "an emphasis on getting things done ... it can all mean that, in the moment, you take away the possibility for improvisation or dissent."
Nearly eighty organizations were selected by the council, with an average of €312,500 for each organization. "Wikipedia" (as they called the Wikimedia Foundation) turned out to be the most controversial choice:
The idea came from a Vienna resident in his mid-thirties [...]. He saw Wikipedia as addressing many of the council’s core values: democracy, accessibility, transparency. The idea was immediately opposed by Kyrillos, a high-school student and the council’s youngest member. “We have a lot of other, more important issues to address here,” he said. Anyway, he went on, his teachers wouldn’t allow him to use Wikipedia as a source in his papers—why give it money?
Factions emerged. Some saw Wikipedia, a nonprofit based in the U.S., as an inefficient use of the council’s resources. Others viewed the effort to nix it as a violation of the council’s ground rules. [...]
The Wikipedia debate was ultimately settled with a compromise. The members of the education group agreed to give the organization fifty thousand euros, a small portion of their total.
Thanks Marlene!
The name "Engelhorn" may ring a bell for longtime Signpost readers. In 2015, the Reiss Engelhorn Museum in Mannheim, Germany, filed lawsuits against the Wikimedia Foundation, Wikimedia Deutschland and a Wikimedia Commons user over the use of photographs of public domain artworks on Wikimedia projects. (Cf. Signpost coverage: "Wikimedia Foundation, Wikimedia Deutschland urge Reiss Engelhorn Museum to reconsider suit over public domain works of art", "Wikimedia lawsuits in France and Germany". While the museum prevailed in court against the Foundation, the EU Copyright Directive subsequently made such assertions of copyright over faithful reproductions of public domain works impossible.) Indeed, the museum is named after one of its sponsors, German industry titan Curt Engelhorn (1926–2016), a relative of Marlene Engelhorn. As detailed in the German Wikipedia article about him, back in 1997, in what was Europe's largest company takeover to date, he had controversially managed to sell off the family's company holdings for 19 billion DM without paying any taxes to the German state, and Marlene Engelhorn has publicly criticized his (lack of) philanthropy.
In the wake of a high-profile sexual assault in Kolkata last month, India's courts have demanded that Wikipedia remove the name of a victim from an article on the crime. While some national and local media outlets reported the name of the victim at the time, as did various international media sources, the laws of India prohibit media from publicizing the names of victims of especially heinous crimes.
The incident was widely reported on 10 August, the day after it occurred; English Wikipedia editors created the article a day after that. By 13 August, editors began debating whether to mention the victim's name. On 16 August, that debate became part of a discussion on how to title the article, which was ongoing until 9 September, when the disagreement about including the name split into its own discussion. All of this was part of the consensus-building process usual and familiar to editors, where editorial decisions are reached by group discussion about how to implement policies and guidelines.
On 16 September, the Supreme Court of India ordered Wikipedia to remove the name. The Free Press Journal (report), The Hindu (report), and The Times of India (report) are among the many sources to report on the court's order. In this case, and as often happens when institutions make requests of Wikipedia, the court made its request with some presumption that Wikipedia has an editorial leader who can issue binding orders. As is known among Wikipedia editors, there is no such person: even the Wikimedia Foundation does not control the content of articles under Wikipedia's policies.
In response to the court decision, the legal department of the Wikimedia Foundation posted a notice on the talk page of the article, encouraging Wikipedia editors to deliberate carefully on the issue and "explain clearly why you feel the balance of interests lies one way or the other, in order to reach consensus accordingly". Wikipedia editors did that, and reached a decision to exclude the victim's name. User:Tamzin, a volunteer editor, closed the discussion and authored the consensus statement.
The final decision was to exclude the name from the article. While closing statements typically are descriptions written by Wikipedia editors and for Wikipedia editors, Tamzin included a summary explanation of the overall process in anticipation that the court, media, and public observers may wish to examine both the consensus and discussion. It will not come as a surprise to Wikipedia editors that Wikipedians value their editorial independence. The closing statement emphasizes that Wikipedia editors did not arrive at that decision at the behest of the court, but rather because community deliberation found reasons for doing so and because a supermajority of editors supported the decision.
The article is 2024 Kolkata rape and murder incident, and it begins, "On 9 August 2024, a 31-year-old female postgraduate trainee doctor at R. G. Kar Medical College and Hospital in Kolkata, West Bengal, India, was raped and murdered in a college building."
One set of arguments about the name relates to victims' rights and women's rights. The argument in favor of naming the victim is that her story becomes known and enables activism to reduce violence against women. The argument opposed is that in some cases, and this case in particular, naming the victim greatly endangers and disturbs their family, social network, colleagues, and supporters.
Another set of arguments relates to censorship of Wikipedia and Wikipedia's own WP:NOTCENSORED policy. The argument in favor of publishing the name is that maximal freedom in publishing is the preferred position. The argument opposed is that naming the victim is not a censorship issue, as Wikipedia will definitely have an article on the crime, and that article does not benefit significantly by including the name of the victim.
Another set of arguments is about following the lead of what other media outlets do. Arguments in favor of publishing the name point to seeming WP:Reliable sources and reputable journalists who are publishing the name. Arguments opposed to publishing the name make various claims, including that sources publishing the name are mistaken, or that they have since removed the name, or that the higher quality sources do not publish the name while lower quality sources do. Wikipedia editor User:Fowler&fowler checked various sources and reported which ones do not publish the name.
A final set of arguments is on the practicality of collaboration between Wikipedia and the government of India. The argument in favor of publishing the name assumes that other arguments establish that Wikipedia editors should publish the name, and in that context, it is best for Wikipedia as an international media source outside the jurisdiction of Indian government control to disregard the government request. Arguments opposed to publishing the name include respect for the expertise of those courts, respect for national decision making to know what is best for local culture, anticipation of a good future of peaceful collaboration with the government of India by granting this request, and concern for the burden on Wikipedia editors in India if they bear the responsibility of an online global decision including non-Indian Wikipedia editors.
– BR
In 2011, the Australian Paralympic Committee (now Paralympics Australia) commenced a project to document its history. This included collecting documents and museum pieces and conducting oral history interviews with Paralympians. An online component was recognised as being important, and Wikipedia was identified as part of that. Since then, Paralympics Australia and Wikimedia Australia have collaborated to produce thousands of articles that keep receiving millions of page-views each year. As part of the project, I attended the Paralympic Games in London in 2012, in Rio de Janeiro in 2016, and now in Paris in 2024 as a media representative, with accreditation supplied by Paralympics Australia. This time, I took a photographer, GailLeenstra, with me.
Media accreditation meant that I had access to the media tribunes at the venues and could attend any game, even when the event was sold out (as was usually the case). It meant that I could visit the Paralympic Village and interview athletes after the game in what is called the Mixed Zone. It meant that we could use the buses of the TC, the Olympic transport system. It meant we had access to the resources of the Main Press Centre (MPC) and the Venue Media Centres (VMCs), which provided wired and wireless internet access, desks to work at, staff to help us, and lockers to store our equipment. It meant that my photographer had access to prime photographic positions not accessible to the public. It also meant that she had access to the Nikon store at the Stade de France, where she was able to get some of her equipment repaired and borrow some very expensive equipment for the duration of the games to supplement the gear she had brought with her from Australia – all for free.
The image of Wikipedia has undergone a dramatic transformation in the time I have been working on the Australian Paralympic Project. In London in 2012, there was a tendency of the mainstream media to regard us as not being "real journalists". There was none of that in Paris, quite the opposite in fact; mainstream media representatives repeatedly told us how much they appreciated our efforts, how they used Wikipedia as a reference all the time, and how impressed they were with its accuracy.
Support from Paralympics Australia did not end in Australia. In Paris, they had set up headquarters at a site near the Paralympic Village known as "Our Mob", which contained meeting rooms, a TV studio, dining room and a McCafé concession (McDonald's being one of their sponsors). Tim Mannion, the General Manager of Communications, gave generous and welcome assistance and support to our efforts, including providing passes to the opening and closing ceremonies. Unlike the Olympic opening ceremony, the Paralympic opening ceremony was held in beautiful weather. Some 65,000 spectators packed into the Place de la Concorde for the first ever Paralympic opening ceremony to be held outside a stadium. GailLeenstra was one of a select group of photographers chosen to accompany the lighting of the Paralympic cauldron.
In Sydney, London and Rio, multiple venues were concentrated in a multi-sport precinct, but in Paris, the venues were widely scattered around the city. This is a model considered by many cities planning to hold the Olympics and Paralympics, because it allows the city to make use of existing facilities, saving the substantial cost of building new ones. It is not cheap, however! It came at a substantial cost in increased security, transportation and manpower through duplication. Venues required considerable upgrades, refurbishment and fitting out for the games. Three new venues had to be built, and Paris Metro lines were extended. Not to mention the 1.6 billion euros spent on cleaning up the Seine and Marne to make them fit to swim in. The police presence was overwhelming, with about 45,000 police and 10,000 troops on hand. All were heavily armed, with automatic weapons in case Hamas decided to put in an appearance. Roads near the venues were closed to vehicle traffic.
Getting from one venue to another involved a trip on the Paris Metro using the Navigo cards issued to us as part of our media kit. Each day we criss-crossed the city on the Metro as we moved from one venue to the next. Fortunately, the Metro was super-efficient, with trains leaving every couple of minutes. Getting to venues in the metropolitan area took about half an hour. (The locals told us that the Metro had never been so efficient nor, with the enhanced police presence, had they ever felt safer.) This meant that each day started with critical decisions about what events we would cover that day. Priority was given to events with Australian participation (especially medal chances), since our media accreditation was so generously provided by Paralympics Australia, but the athletes of other countries (especially the English-speaking ones) were by no means neglected. A mobile phone app told us which trains and buses to take to get from one place to another. It knew the location of the venues, train stations and bus stops, the bus and train schedules, and how crowded they were, encouraging you to take less crowded services.
As it turned out, there were some other foreign Wikipedians present, but they lacked our accreditation and (quite understandably) had different priorities. This meant that Wikipedia had broad coverage, and that the presence of Australian Wikipedians on site was fully justified by what we were able to cover. We tried to see as many sports as possible: our coverage included Boccia, Cycling, Equestrian, Paracanoe, Triathlon, Wheelchair Basketball and Wheelchair Rugby. Cycling and Equestrian events were located well out of town, requiring a day trip on the Metro, Réseau Express Régional, and the TC.
Between 23 August and 9 September, articles created by the History of the Paralympic movement in Australia project garnered a total of 2,226,684 page views, while more than 1,200 images were uploaded. These pages and pictures will be a lasting legacy, to be enjoyed by readers for years to come.
First, the bullet points:
As for how I got here: Vanamonde93 asked me if I was interested in running some time ago, but I deferred for a number of reasons that became more obviously hollow as time went on. For one, "I'm too busy" doesn't work as a very good excuse if you just go find something else to do, like starting a new kind of GAN backlog drive. My second nominator was an easy choice, but that doesn't mean I wasn't nervous about asking. (Don't laugh, czar!) Reading the nomination statements made my heart grow three sizes and also made me want to disappear into the wallpaper. (I'm not good at praise.)
For a little while, I was of a mind to wait out the end of the discussion-period trial, but once I'd got my nominators lined up, it started to feel like I just needed to get it over with. Then, right after I'd said, "alright, now's good, let's do this," Femke showed up in my inbox asking if I'd ever thought of running. It felt so good to be able to tell someone, "you're just in time!" instead of brushing them off with some kind of excuse.
I tried my level best to ignore all the discussion and voting while it was happening, but it turns out that's really hard – not just because it takes willpower, but also because MediaWiki kept showing me the discussion section whenever I tried to preview my answers to the questions. And it's very hard to avoid knowing the precise count once voting starts, since that's right at the top of the page. So I gave up, and read everything. If you're thinking of RfA and you're the anxious sort who will be constantly fighting the temptation to check in, line up something to do with friends/family so you don't glue yourself to the refresh button. There's no way around this bit. It sucks, and a week is a long time.
Luckily for me, people had really nice things to say. I really appreciate everyone who participated and everyone who came by to congratulate me afterwards. I'm doing my best to avoid the mushroom effect and to remain indifferent to both praise and blame. Thank you all for ignoring and then quietly removing the joke oppose; I was a bit worried they'd be murdered. As for the discussion period, I do think it helped things remain civil and on-track, but I'm ambivalent on the experiment overall.
Being an admin has been fun so far. Everyone's been really helpful, and if I've been driving anyone crazy with all my questions, they've been kind enough not to tell me. Special thanks to Aoidh, who has already stopped me from doing something stupid, kindly and firmly, like disarming a wayward toddler running with scissors.
Maybe my opinion isn't worth much, since I had such a smooth RfA compared to so many others, but if you think you'd probably succeed at RfA and you're holding back since it sounds like a bad way to spend a week: I think you should go for it.
It's possible that someone will drag up something stupid you said four years ago and try to rub your nose in it, or that some personal flaw of yours will be magnified beyond all reason. It's possible some people will try to use your RfA as a soapbox to complain about policies or norms they dislike. It's very likely that you will spend the whole week waiting for the other shoe to drop, whether any of those things happen or not. Even an uncontentious RfA can be an exhausting and unpleasant experience – mine was all of those things.
But it's also very likely that, over the course of a week, somewhere between one and three hundred people will show up to say something nice about you. Many of those people will be folks whose opinion you really, truly value. Some of them will be people you can't remember ever hearing of before, but who nonetheless have something deeply gratifying to say. You should run.
Plus, now I can tell my colleagues that I "have tenure, on Wikipedia." I'm sure they'll all be very impressed.
As part of WP:RFA2024, multiple RfA reform attempts have completed trials or are currently under review: you can read previous coverage on the matter by The Signpost in the 16 May issue.
There has already been consensus to add a reminder of RfA civility norms to WP:RFA, as well as limit suffrage to only extended-confirmed voters and formally require all nominees to also be extended-confirmed. All of these proposals were implemented in the last few months.
The "discussion-only period" trial has come to an end this month, having converted five different RfAs (non SNOW-closed) to have "discussion only" for the first two days out of the seven-day period. After this initial trial, Phase II discussions are ongoing to determine if this proposal will become permanent.
As per the outcome of the related Phase II discussion, admins can now designate themselves as monitors for RfAs, subject to minimum expectations for their conduct during the whole process. The full list can be found at WP:MONITOR. This proposal is intended to improve enforcement of civility guidelines during RfAs.
Phase II for the administrator recall proposal has also recently finished, having waited for a closer for several months. It will allow a community-initiated path to de-adminship by requiring certain admins to submit and pass their RfA again. Further discussion is ongoing on the next steps for this process.
Finally, the Admin Elections procedure is expected to trial in October: it will be a one-time trial to allow an alternate path to adminship, parallel to RfA. Candidates can sign up from 8 to 14 October, before entering a discussion period from 22 to 24 October, which will then be followed by a SecurePoll private voting session from 25 to 31 October. —S
The special elections for the Universal Code of Conduct Coordinating Committee (U4C) concluded earlier this month, with the election of just one candidate. With 613 votes cast among the 18 eligible candidates, only Ajraddatz (for the North America seat) achieved the 60% support-to-support+oppose ratio required. This gives the U4C just enough members (8 out of 16 seats) to establish their quorum, though it remains to be seen how the U4C will handle inactive members.
The committee was set up primarily to deal with larger-scale disputes within smaller Wikis and to enforce the Universal Code of Conduct across the various projects; they are expected to begin hearing cases shortly. Further information can be found on the U4C announcements page.
The full results of the U4C elections can be viewed here. This cycle had already been covered in the 22 July issue of The Signpost. – S
The Wikimedia Foundation published their bulletins for late August and early September. Among other news, they covered a public survey intended to better understand WikiProjects, the recent disbandment of the MCDC and the WMF Board of Trustees election, which is currently in its scrutiny phase.
It was also mentioned that the WMF will briefly switch the traffic between its data centers for maintenance purposes on 25 September, starting at 15:00 UTC. A banner will be displayed on all Wikis 30 minutes before the start of the operation, during which users will be able to read, but not edit, the sites for up to an hour. More information on the server switch can be found here.
Editors may also be interested in testing for the Charts Extension and the Alt Text experiment on the iOS app, the codified new API policy, or the WMF's newest update on Movement Strategy Grants (Spoilers: it focuses on Hubs). —S, O
In response to the increased prevalence of generative artificial intelligence, some editors of the English Wikipedia have introduced measures to reduce its use within the encyclopedia. Using images generated from text-to-image models on articles is often discouraged, unless the context specifically relates to artificial intelligence. A hardline Luddite approach has not been adopted by all Wikipedians, and AI-generated images are used in some articles in non-AI contexts.
The image guidelines generally restrict the use of images that are solely for decorative purposes, as they do not contribute meaningful information or aid the reader in understanding the topic. Despite this restriction, it appears that paintings are permitted to be included in medical articles to display human-made artistic interpretations of medical themes. They offer historical and cultural perspectives related to medical topics.
WikiProject AI Cleanup searches for AI-generated images and evaluates their suitability for an article. If any images are deemed inappropriate, they may be removed to ensure that only relevant and suitable images are kept in articles.
The removed "scientific" images are perhaps the worst ones, even though they affected only one article, Chemotactic drug-targeting:
It may also be worth considering what kind of AI art is being left in articles by the WikiProject:
Policies vary between different language versions of Wikipedia. Differences in opinion among Wikipedians have resulted in the inclusion of text-to-image model-generated images on several Wikipedias, including the English Wikipedia. Many Wikipedias use Wikidata to automatically display images, which takes place beyond the scope of local projects.
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
A preprint titled "Language Agents Achieve Superhuman Synthesis of Scientific Knowledge"[1] introduces
"PaperQA2, a frontier language model agent optimized for improved factuality, [which] matches or exceeds subject matter expert performance on three realistic literature research tasks. PaperQA2 writes cited, Wikipedia-style summaries of scientific topics that are significantly more accurate than existing, human-written Wikipedia articles."
It was published by "FutureHouse", a San-Francisco-based nonprofit working on "Automating scientific discovery" (with a focus on biology). FutureHouse was launched last year with funding from former Google CEO Eric Schmidt (at which time it was anticipated it would spend about $20 million by the end of 2024). Generating Wikipedia-like articles about science topics is only one of the applications of "PaperQA2, FutureHouse's scientific RAG [retrieval-augmented generation] system", which is designed to aid researchers. (For example, FutureHouse also recently launched a website called "Has Anyone", described as a "minimalist AI tool to search if anyone has ever researched a given topic.")
In more detail, the researchers "engineered a system called WikiCrow, which generates cited Wikipedia-style articles about human protein-coding genes by combining several PaperQA2 calls on topics such as the structure, function, interactions, and clinical significance of the gene." Each call contributes a section of the resulting article (somewhat similar to another recent system, see our review: "STORM: AI agents role-play as 'Wikipedia editors' and 'experts' to create Wikipedia-like articles"). The prompts include the instruction to "Write in the style of a Wikipedia article, with concise sentences and coherent paragraphs".
Generated at an average cost of $4.48 each, the articles tended to be longer than their Wikipedia counterparts and had higher quality, at least according to the paper's evaluation method:
We used WikiCrow to generate 240 articles on genes that already have non-stub Wikipedia articles to have matched comparisons. WikiCrow articles averaged 1219.0 ± 275.0 words (mean ± SD, N = 240), longer than the corresponding Wikipedia articles (889.6 ± 715.3 words). The average article was generated in 491.5 ± 324.0 seconds, and had an average cost of $4.48 ± $1.02 per article (including costs for search and LLM APIs). We compared WikiCrow and Wikipedia on 375 statements sampled from the 240 paired articles. [...] The initial article sampling excluded any Wikipedia articles that were "stubs" or incomplete articles. Statements were then shuffled and given, blinded, to human experts, who graded statements according to whether they were (1) cited and supported; (2) missing a citation; or (3) cited and unsupported. We found that WikiCrow had significantly fewer "cited and unsupported" statements than the paired Wikipedia articles (13.5% vs. 24.9%) (p = 0.0075, χ2 (1), N = 375 for all tests in this section). WikiCrow failed to cite sources at a 3.9x lower rate than human written articles, as only 3.5% of WikiCrow statements were uncited, vs. 13.6% for Wikipedia (p < 0.001). In addition, defining precision for WikiCrow as the ratio of cited and supported statements over all cited statements, we found that WikiCrow displayed significantly higher precision than human-written articles (86.1% vs. 71.2%, p = 0.0013).
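The precision figures quoted above can be roughly re-derived from the reported percentages. The following Python sketch is an illustration for this review, not code from the paper. It assumes that the three grading categories are exhaustive, so that the "cited and supported" share is whatever remains after subtracting the other two; the function and variable names are made up for this example. Because it starts from rounded percentages rather than the underlying statement counts, the results can differ from the paper's figures in the last digit.

```python
def supported_share(unsupported_pct: float, uncited_pct: float) -> float:
    """Share of statements graded "cited and supported", assuming the three
    grading categories are exhaustive (an assumption of this sketch)."""
    return 100.0 - unsupported_pct - uncited_pct


def precision(unsupported_pct: float, uncited_pct: float) -> float:
    """Cited-and-supported statements divided by all cited statements."""
    supported = supported_share(unsupported_pct, uncited_pct)
    return supported / (supported + unsupported_pct)


# Percentages as reported in the quoted passage.
wikicrow = {"unsupported_pct": 13.5, "uncited_pct": 3.5}
wikipedia = {"unsupported_pct": 24.9, "uncited_pct": 13.6}

wikicrow_precision = precision(**wikicrow)
wikipedia_precision = precision(**wikipedia)
# How many times more often Wikipedia statements lacked a citation.
uncited_ratio = wikipedia["uncited_pct"] / wikicrow["uncited_pct"]

print(f"WikiCrow precision:  {wikicrow_precision:.1%}")
print(f"Wikipedia precision: {wikipedia_precision:.1%}")
print(f"Uncited-rate ratio:  {uncited_ratio:.1f}x")
```

Up to rounding, this matches the reported precision values (about 86% vs. 71%) and the roughly 3.9x difference in the rate of uncited statements.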
For the judgment whether a particular statement was "supported" by the cited references, the concrete question asked to the graders (described as "expert researchers" in the paper) was:
"Is the information correct, as cited? In other words, is the information stated in the sentence correct according to the literature that it cites?"
In addition, among other more detailed instructions, the graders were advised to mark a statement correct as cited even if it was not directly supported by the source, as long as the statement consisted of "broad context" judged to be "undergraduate biology student common knowledge" (akin to an extreme interpretation of WP:BLUE).
The fact that these rating criteria appear to be more liberal than Wikipedia's own, combined with the well-known general reputation of LLMs for generating hallucinations, makes the "WikiCrow displayed significantly higher precision" result rather remarkable. The authors double-checked it by examining the data more closely:
The "cited and unsupported" evaluation category includes both inaccurate statements (e.g. true hallucinations or reasoning errors) and statements that are accurate with inappropriate citations. To investigate the nature of the errors in Wikipedia and WikiCrow further, we manually inspected all reported errors and attempted to classify the issues as follows: reasoning issues, i.e. the written information contradicts, over-extrapolates, or is unsupported by any included citations; attribution issues, i.e. the information is likely supported by another included source, but either the statement does not include the correct citation locally or the source is too broad (e.g. a database portal link); or trivial statements, which are true passages, but overly pedantic or unnecessary [...]. Surprisingly, we found that compared to Wikipedia, WikiCrow had significantly fewer reasoning errors (12 vs. 26, p = 0.0144, χ2 (1), N = 375) but a similar number of attribution errors (10 vs. 16, p = 0.21), suggesting that the improved factuality of WikiCrow over Wikipedia was largely due to improvements in reasoning.
The authors caution that this result about Wikipedians "hallucinating" more frequently than AI is specific to their "WikiCrow" system (and to the task of writing articles about genes), and should not be extended to LLMs in general:
Although language models are clearly prone to reasoning errors (or hallucinations), in our task at least they appear to be less prone to such errors than Wikipedia authors or editors. This statement is specific to the agentic RAG setting presented here: language models like GPT-4 on their own, if asked to generate Wikipedia articles, would still be expected to hallucinate at high rates.
A previous, less capable version of the WikiCrow system had already been described in a December 2023 blog post, which discussed the motivation for focusing on the task of writing Wikipedia-like articles about genes in more detail. Rather than seeing it as an arbitrary benchmark demo for their LLM agent system (back then in its earlier version, PaperQA), the authors described it as being motivated by longstanding shortcomings of Wikipedia's gene coverage that are seriously hampering the work of researchers who have come to rely on Wikipedia:
If you've spent time in molecular biology, you have probably encountered the "alphabet soup" problem of genomics. Experiments in genomics uncover lists of genes implicated in a biological process, like MGAT5B and ADGRA3. Researchers turn to tools like Google, Uniprot or Wikipedia to learn more, as the knowledge of 20,000 human genes is too broad for any single human to understand. However, according to our count, only 3,639 of the 19,255 human protein-coding genes recognized by the HGNC have high-quality (non-stub) summaries on [English] Wikipedia; the other 15,616 lack pages or are incomplete stubs. Often, plenty is known about the gene, but no one has taken the time to write up a summary. This is part of a much broader problem today: scientific knowledge is hard to access, and often locked up in impenetrable technical reports. To find out about genes like MGAT5B and ADGRA3, you'd end up sinking hours into reading the primary literature.
[The 2023 version of] WikiCrow is a first step towards automated synthesis of human scientific knowledge. As a first demo, we used WikiCrow to generate drafts of Wikipedia-style articles for all 15,616 of the Human protein-coding genes that currently lack articles or have stubs, using information from full-text articles that we have access to through our academic affiliations. We estimate that this task would have taken an expert human ~60,000 hours total (6.8 working years). By contrast, WikiCrow wrote all 15,616 articles in a few days (about 8 minutes per article, with 50 instances running in parallel), drawing on 14,819,358 pages from 871,000 scientific papers that it identified as relevant in the literature.
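The figures in the two quoted paragraphs can be cross-checked with some simple arithmetic. The Python sketch below is an illustration added for this review, not something from the blog post; it assumes perfectly parallel instances with no idle time, and the variable names are invented for the example.

```python
GENES_TOTAL = 19_255          # protein-coding genes recognized by the HGNC
GENES_WITH_ARTICLES = 3_639   # genes with non-stub English Wikipedia articles
MINUTES_PER_ARTICLE = 8       # "about 8 minutes per article"
PARALLEL_INSTANCES = 50       # "50 instances running in parallel"
HUMAN_HOURS_TOTAL = 60_000    # the post's estimate for an expert human

# Genes that lack an article or only have a stub.
articles_to_write = GENES_TOTAL - GENES_WITH_ARTICLES

# Wall-clock time if all instances are kept busy the whole time (assumption).
wall_clock_hours = articles_to_write * MINUTES_PER_ARTICLE / PARALLEL_INSTANCES / 60

# Implied human effort per article under the post's estimate.
human_hours_per_article = HUMAN_HOURS_TOTAL / articles_to_write

print(f"Articles to write:       {articles_to_write:,}")
print(f"Wall-clock time:         {wall_clock_hours:.1f} hours")
print(f"Human hours per article: {human_hours_per_article:.1f}")
```

Under these assumptions the run takes roughly 42 hours of wall-clock time, which is compatible with "a few days", while the human estimate works out to a little under four hours per article.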
These challenges of covering the large number of relevant genes are not news to Wikipedians working in this area. Back in 2011, several papers in a special issue of Nucleic Acids Research on databases had explored Wikipedia as a database for structured biological data, e.g. asking "how to get scientists en masse to edit articles" in this area, and presenting English Wikipedia's "Gene Wiki" taskforce (which is currently inactive). In a 2020 article in eLife, a group of 30 researchers and Wikidata contributors similarly "describe[d] the breadth and depth of the biomedical knowledge contained within Wikidata," including its coverage of genes in general ("Wikidata contains items for over 1.1 million genes and 940 thousand proteins from 201 unique taxa") and human genetic variants ("Wikidata currently contains 1502 items corresponding to human genetic variants, focused on those with a clear clinical or therapeutic relevance").[2] But it seems that at least from the point of view of the FutureHouse researchers, Wikidata's gene coverage is not a substitute for Wikipedia's, perhaps because it does not offer the same kind of factual coverage (see also the review of a related dissertation below).
The current paper is not peer-reviewed, but conveys credibility by disclosing ample detail about the methodology for building and evaluating the PaperQA2 and WikiCrow systems (also in an accompanying technical blog post), and by releasing the underlying source code and data. The PaperQA2 system is available as an open-source software package. (This includes a "Setting to emulate the Wikipedia article writing used in our WikiCrow publication". However, the paper cautions that the released version does not include some additional tools that were used, and in particular does not provide "access to non-local full-text literature searches", which are "often bound by licensing agreements".) The generated articles are available online in rendered form and as Markdown source (see full list below, with links to their Wikipedia counterparts for comparison). The annotated expert ratings have been published as well.
The authors acknowledge "previous work on unconstrained document summarization, where the document must be found and then summarized, and even writing Wikipedia-style articles with RAG" (i.e. the aforementioned STORM project). But they highlight that
"These studies have not compared directly against Wikipedia with human evaluation. Instead, they used either LLMs to judge or [like STORM] compared ROUGE (text overlap) against ground-truth summaries. Here, we measure directly against human-generated Wikipedia with subject [matter] expert grading."
The "crow" moniker (already used in a predecessor project called "ChemCrow",[supp 1] an LLM agent working on chemistry tasks) is inspired by the fact that "Crows can talk – like a parrot – but their intelligence lies in tool use."
From the abstract of a dissertation titled "Exploiting semi-structured information in Wikipedia for knowledge graph construction":[3]
"[...] we address three main challenges in the field of automated knowledge graph construction using semi-structured data in Wikipedia as a data source. To create an ontology with expressive and fine-grained types, we present an approach that extracts a large-scale general-purpose taxonomy from categories and list pages in Wikipedia. We enhance the taxonomy's classes with axioms explicating their semantics. To increase the coverage of long-tail entities in knowledge graphs, we describe a pipeline of approaches that identify entity mentions in Wikipedia listings, integrate them into an existing knowledge graph, and enrich them with additional facts derived from the extraction context. As a result of applying the above approaches to semi-structured data in Wikipedia, we present the knowledge graph CaLiGraph. The graph describes more than 13 million entities with an ontology containing almost 1.3 million classes. To judge the value of CaLiGraph for practical tasks, we introduce a framework that compares knowledge graphs based on their performance on downstream tasks. We find CaLiGraph to be a valuable addition to the field of publicly available general-purpose knowledge graphs."
Why would one want to use Wikipedia as a source of structured data and build a new knowledge graph when Wikidata already exists? First, the thesis argues that Wikidata, even though it has surpassed other public knowledge graphs in the number of entities, is still very incomplete, especially when it comes to information about long-tail topics:
"The trend of entities added to publicly available KGs in recent years indicates they are far from complete. The number of entities in Wikidata [195], for example, grew by 26% in the time from October 2020 (85M) to October 2023 (107M) [206]. Wikidata describes the largest number of entities and comprises – in terms of entities – other public KGs to a large extent [66]. Consequently, this challenge of incompleteness applies to all public KGs, particularly when it comes to less popular entities [44]. [...]"
On the other hand, an automated process for extracting structured information from Wikipedia may not yet be reliable enough to import the result directly without manual review:
While the performance of Open Information Extraction (OIE) systems (i.e., systems that extract information from general web text) has improved in recent years [159, 97, 112], the quality of extracted information has not yet reached a level where integration into public KGs like Wikidata or DBpedia [104] should be done without further filtering. [...]
[...] first "picking low-hanging fruit" by focusing on premium sources like Wikipedia to build a high-quality KG is crucial as it can serve as a solid foundation for approaches that target more challenging data sources. The extracted information may then be used as an additional anchor to make sense of less structured data.
Chapter 3 ("Knowledge Graphs on the Web") contains detailed comparisons of Wikidata with other public knowledge graphs, with observations including the following:
"The main focus of DBpedia is on persons (and their careers), as well as places, works, and species. Wikidata also strongly focuses on works (mainly due to the import of entire bibliographic datasets), while Cyc, BabelNet and NELL show a more diverse distribution. [...]
[...] Wikidata has the largest number of instances and the largest detail level in most classes. However, there are differences from class to class. While Wikidata contains a large number of works, YAGO is a good source of events. NELL often has fewer instances, but a larger level of detail, which can be explained by its focus on more prominent instances.
Wikidata contains about twice as many persons as DBpedia and YAGO [..., which] contain almost no persons which are not contained in Wikidata. In conclusion, combining Wikidata with DBpedia or YAGO for better coverage of the Person class would not be beneficial"
(See also an earlier paper co-authored by the dissertation's author, titled "Knowledge Graphs on the Web -- an Overview".)
Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.
A Wikidata taxonomy (from "city or town" to "entity") before and after refinement
From the abstract:[4]
"Wikidata is known to have a complex taxonomy, with recurrent issues like the ambiguity between instances and classes, the inaccuracy of some taxonomic paths, the presence of cycles, and the high level of redundancy across classes. Manual efforts to clean up this taxonomy are time-consuming and prone to errors or subjective decisions. We present WiKC, a new version of Wikidata taxonomy cleaned automatically using a combination of Large Language Models (LLMs) and graph mining techniques."
From the "Evaluation" section:
"As expected, WiKC is much simpler and much more concise than Wikidata taxonomy. Compared to WiKC, Wikidata taxonomy has a factor higher than 200 in the number of classes, and a factor higher than 10 in the average number of paths from an instance to the root class entity (Q35120)."
"WiKC consistently outperforms Wikidata across all depth ranges. WiKC shows significant accuracy gains at deeper levels (depth 10 or more), suggesting that WiKC has resolved many inconsistency issues in the lower levels of the Wikidata taxonomy."
From the abstract:[5]
"Hundreds of thousands of articles on English Wikipedia have zero or limited meaningful structure on Wikidata. Much work has been done in the literature to partially or fully automate the process of completing knowledge graphs, but little of it has been practically applied to Wikidata. This paper presents two interconnected practical approaches to speeding up the Wikidata completion task. The first is Wwwyzzerdd, a browser extension that allows users to quickly import statements from Wikipedia to Wikidata. Wwwyzzerdd has been used to make over 100 thousand edits to Wikidata. The second is Psychiq, a new model for predicting instance and subclass statements based on English Wikipedia articles. [...] One initial use is integrating the Psychiq model into the Wwwyzzerdd browser extension."
From the paper:[6]
"Translations help people understand content written in another language. However, even correct literal translations do not fulfill that goal when people lack the necessary background to understand them. Professional translators incorporate explicitations to explain the missing context by considering cultural differences between source and target audiences. [...] For example, the name “Dominique de Villepin” may be well known in French community while totally unknown to English speakers in which case the translator may detect this gap of background knowledge between two sides and translate it as “the former French Prime Minister Dominique de Villepin” instead of just “Dominique de Villepin”. [...]
This work introduces techniques for automatically generating explicitations, motivated by WIKIEXPL, a dataset that we collect from Wikipedia and annotate with human translators. [...]
Our generation is grounded in Wikidata and Wikipedia—rather than free-form text generation—to prevent hallucinations and to control length or the type of explanation. For SHORT explicitations, we fetch a word from instance of or country of from Wikidata [...]. For MID, we fetch a description of the entity from Wikidata [...]. For LONG type, we fetch three sentences from the first paragraph of Wikipedia."
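The three explicitation types described in this excerpt can be illustrated with a small Python sketch. It is not the authors' code: the `EntityRecord` class, the `explicitation` function, the fallback order between "instance of" and "country", and the naive sentence splitter are all assumptions made for this example, and the sample record is hand-written rather than retrieved from Wikidata or Wikipedia.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class EntityRecord:
    """Hypothetical, pre-fetched data about one entity."""
    label: str
    instance_of: Optional[str]   # value of Wikidata's "instance of", if any
    country: Optional[str]       # value of a Wikidata country property, if any
    description: Optional[str]   # Wikidata description
    first_paragraph: str         # first paragraph of the Wikipedia article


def explicitation(entity: EntityRecord, kind: str) -> str:
    """Return the grounding text for a SHORT, MID or LONG explicitation."""
    if kind == "SHORT":
        # Assumed fallback order: "instance of" first, then country.
        return entity.instance_of or entity.country or ""
    if kind == "MID":
        return entity.description or ""
    if kind == "LONG":
        # Naive sentence split on ., ! or ? followed by whitespace.
        sentences = re.split(r"(?<=[.!?])\s+", entity.first_paragraph.strip())
        return " ".join(sentences[:3])
    raise ValueError(f"unknown explicitation type: {kind}")


# Hand-written placeholder record (not real Wikidata/Wikipedia output).
sample = EntityRecord(
    label="Dominique de Villepin",
    instance_of=None,
    country="France",
    description="former Prime Minister of France",
    first_paragraph="First sentence. Second sentence. Third sentence. Fourth sentence.",
)

for kind in ("SHORT", "MID", "LONG"):
    print(f"{kind}: {explicitation(sample, kind)}")
```

Because every output string is copied from a stored field instead of being generated freely, the length and type of the explanation stay under control, which is the design rationale the authors give for grounding in Wikidata and Wikipedia.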
From the abstract:[7]
"Knowledge Graph Construction (KGC) can be seen as an iterative process starting from a high quality nucleus that is refined by knowledge extraction approaches in a virtuous loop. Such a nucleus can be obtained from knowledge existing in an open KG like Wikidata. However, due to the size of such generic KGs, integrating them as a whole may entail irrelevant content and scalability issues. We propose an analogy-based approach that starts from seed entities of interest in a generic KG, and keeps or prunes their neighboring entities. We evaluate our approach on Wikidata through two manually labeled datasets that contain either domain-homogeneous or -heterogeneous seed entities."
From the abstract:[8]
"By analyzing the edit history of Wikipedia’s ‘hyperpop’ page, we locate ongoing debates, controversies, and contestations that point to shaping forces around online genre formation. These potentially have a huge impact on how hyperpop is understood both inside and outside of the music community. In locating the most active editors of the hyperpop Wikipedia page and scrutinizing their edit histories as well as the discussions on the hyperpop page itself, we uncovered debates about artistic notability, biases toward specific sources, and attempts at associating or dissociating musical genre from non-musical identities (such as race, gender, and nationality)."
From the abstract:[9]
"Paradoxically, in each language [English/French/Portuguese Wikipedia], the airplane has a different inventor. Through online ethnography, this article explores the multilingual landscape of Wikipedia, looking not only at languages, but also at language varieties, and unpacking the intricate connections between language, country, and nationality in grassroots knowledge production online."
Rank | Article | Views | Notes/about |
---|---|---|---|
1 | Oasis (band) | 1,428,910 | Mancunian brothers Liam Gallagher and Noel Gallagher (plus other musicians that weren't as fond of squabbling) spent the 90s and 2000s making music and fighting each other, with the latter part escalating so much that the brothers broke up their own band in 2009. So what a relief for Oasis fans to know that, in the wake of the 30th anniversary of the band's debut album, Definitely Maybe, a 2025 reunion tour was announced, initially limited to the British Isles, but with hopes that it will expand to other parts of the world, especially as tickets have sold out fairly quickly. The rest of Oasis will include three more past members: guitarists Paul "Bonehead" Arthurs (who played on the band's beloved first two albums and the divisive and overblown third) and Gem Archer (who joined in the tour for the fourth album and remained until the end), and drummer Chris Sharrock (part of the band's doomed final tour). Some members from Noel's side band, the High Flying Birds, will also join in. | ||
2 | Johnny Gaudreau | 1,067,759 | "Johnny Hockey" was a standout name in the NHL, but sadly his career was cut short at just 31, as he and his brother Matthew (who also played hockey) died after being run over while riding their bikes in New Jersey. An outpouring of grief followed, including from Gaudreau's current and former teams, as the late winger left behind two young children, while his sister was forced to cancel her wedding. Everyone also hopes the drunk driver who caused the accident faces the consequences. | ||
3 | Deaths in 2024 | 1,004,696 | You need more time 'Cause your thoughts and words won't last forever more And I'm not sure if it'll ever work out right... | ||
4 | Stree 2 | 1,002,390 | This Bollywood comedy horror film, the fifth installment in the Maddock Supernatural Universe and a sequel to the first chapter of the series, was released on Indian Independence Day and has quickly become the highest-grossing Hindi film of 2024, as well as the ninth highest-grossing Hindi film of all time. It is also the second highest-grossing Indian film of 2024, behind only the Tollywood epic science fiction movie Kalki 2898 AD. | ||
5 | Deadpool & Wolverine | 959,740 | The 34th film in the MCU was first released a few months ago, bridging the franchise with the Fox Universe and becoming the first ever MCU film to receive an R-rating; it has now become the seventh highest grossing film from the franchise, surpassing Iron Man 3. | ||
6 | Robert F. Kennedy Jr. | 859,701 | Kennedy hasn't fully dropped out of the presidential race, but he is withdrawing his name from most swing states so that he won't affect the electoral vote, while still remaining on the ballot in most states. As of writing, his website says that Kennedy is aiming for 5% or more of the popular vote, as that will qualify him for public funding in future elections. | ||
7 | Pavel Durov | 857,832 | Anti-establishment, Russian, and a billionaire, Durov is the founder of VK, the so-called "Facebook for Russia", as well as the fifth most-used messaging service, Telegram. Because of the latter, he was arrested in France on August 24, reportedly for failing to remove drug traffickers and child pornographers from the platform. Russia is not happy, although Durov himself left the country in 2014, after refusing to hand over the personal data of Euromaidan protesters in Ukraine to the Federal Security Service. | ||
8 | Sven-Göran Eriksson | 817,262 | The Swedish football coach died of pancreatic cancer on August 26, at the age of 76. He was probably best known for managing England's Golden Generation in the 2002 and 2006 FIFA World Cups, but also won lots of national and European titles with the likes of IFK Göteborg, Benfica and Lazio. | ||
9 | Tulsi Gabbard | 746,047 | Just like #6, another former Democratic Presidential hopeful turned right-leaning Independent, who is now working with Donald Trump. | ||
10 | Liam Gallagher | 731,595 | After the unceremonious split of #1 in 2009, their lead singer spent five more years with other musicians from the band in side group Beady Eye before pursuing a solo career, which earlier this year included a collaboration with Stone Roses guitarist John Squire in the indicatively titled Liam Gallagher & John Squire. Now, though, the main focus is on the 2025 reunion tour of Oasis. |
Rank | Article | Views | Notes/about |
---|---|---|---|
1 | Emily Armstrong (musician) | 1,428,603 | The lead singer of Dead Sara has been enlisted for the revival of a very popular nu metal group (#5) that previously disbanded under tragic conditions. However, Linkin Park's decision to appoint Armstrong has rapidly faced criticism, due to her perceived support of convicted rapist and disgraced actor Danny Masterson, as well as her ties to the anti-psychiatry Church of Scientology; both are highly sensitive topics for a considerable share of fans who are still grieving the loss of late lead singer Chester Bennington, including one of his own children. Armstrong did address her past ties to Masterson, but as of this issue's publication, she has not yet clarified her status within the Church of Scientology. | ||
2 | Rich Homie Quan | 1,144,320 | On September 5, Dequantes Devontay Lamar, better known as Rich Homie Quan, passed away at the age of 34, with the cause of death still unclear in spite of an autopsy. The Atlanta-born and -based rapper scored a few hits in the 2010s, like "Type of Way", "Flex (Ooh, Ooh, Ooh)" and the team-up single with Young Thug, "Lifestyle". | ||
3 | The Greatest of All Time | 1,087,524 | This Kollywood science fiction action film, starring Thalapathy Vijay in his 68th lead role, was released last Thursday. It is expected to be Vijay's penultimate film (or the last one, depending on his plans to call off his next project), since the actor announced his political entry earlier this year. As a result, his fans did not waste their (presumably last) chance to celebrate the film worldwide: it grossed ₹126.5 crore (US$15 million) at the global box office on its opening day, emerging as the second highest-grossing Tamil film of 2024 and becoming the actor's second consecutive film to gross over ₹100 crore on its opening day. In the process, the film also set other records, such as becoming the first South Indian film to be shown in more than 20 locations in Norway, including the iSense hall at Odeon Cinemas. | ||
4 | Deaths in 2024 | 981,764 | When it will be right? I don't know. What it will be like? I don't know. We live in hope of deliverance From the darkness that surrounds us... | ||
5 | Linkin Park | 967,977 | The tragic death of lead singer Chester Bennington first led to Linkin Park's disbandment back in 2017. However, on September 5, they officially announced a reunion, with original members Bennington and Rob Bourdon being replaced by #1 and Colin Brittain, respectively. On the same occasion, the band delighted fans further by teasing a new studio album and a six-date arena tour, but public reception grew colder in the following days, as concerning details about the past of new co-vocalist Armstrong surfaced. | ||
6 | Indian Airlines Flight 814 | 966,085 | Netflix show IC 814: The Kandahar Hijack recently recalled an incident from 25 years ago, describing how an airplane was hijacked by five terrorists from Harkat-ul-Mujahideen, with support from al-Qaeda (who'd infamously hijack four planes in a single day in 2001). The hijackers kept 174 passengers and 11 crew members hostage for a week, stabbing one to death to intimidate everyone else, before releasing them in the Afghan city of Kandahar in exchange for three imprisoned terrorists. The hijackers themselves were never apprehended. | ||
7 | Beetlejuice Beetlejuice | 865,176 | It's showtime! Yes, the recently released sequel and the 1988 original were directly above each other in views, allowing a proper summoning of The Ghost with the Most. Life (or afterlife, in this case) is funny like that sometimes. Anyway, after a cartoon series and a musical, it was finally time for the devious and hyperactive ghost Betelgeuse to haunt the cinemas again, with Beetlejuice Beetlejuice bringing back director Tim Burton and actors Michael Keaton (Beej himself), Winona Ryder (goth Lydia Deetz, now the host of a supernatural talk show) and Catherine O'Hara (Lydia's stepmom Delia). | ||
8 | Beetlejuice | 813,131 | |||
9 | Deadpool & Wolverine | 774,393 | Two aggressive Mutants teamed up in a movie full of blood, swearing and multiversal shenanigans, that has recently made over a billion dollars, becoming the second highest-grossing movie of this year, behind only Inside Out 2 (which, in turn, rightfully became the highest-grossing animation ever, overtaking a very unnecessary remake). On a side note, the Marvel Cinematic Universe returned later this month with the Disney+ show Agatha All Along. | ||
10 | Emma Navarro | 756,697 | This 23-year-old American tennis player, who won her first title at the Hobart International back in January, has just had her best Grand Slam performance during the US Open at home, where she eventually lost in the semi-finals to Aryna Sabalenka, who then proceeded to win it all over another American, Jessica Pegula. |
Rank | Article | Views | Notes/about |
---|---|---|---|
1 | James Earl Jones | 3,034,998 | One of the most respected African American actors ever, particularly for that booming voice that spoke iconic lines like "No. I am your father", "Everything the light touches is our kingdom", "Steel isn't strong, boy, flesh is stronger!" and "This is CNN", James Earl Jones had a storied career of nearly seven decades, resulting in him being one of the few who won the EGOT (albeit with an honorary Oscar). Retired ever since starring in Coming 2 America in 2021, he died on September 9, at the age of 93. | ||
2 | Laura Loomer | 1,434,117 | Donald Trump's supporters are concerned about the influence that this activist, a self-proclaimed "pro-white nationalist" and "proud Islamophobe", has over him. | ||
3 | Kamala Harris | 1,284,808 | The vice-president and the Democrats' candidate in the upcoming election, who was on millions of screens at last week's presidential debate. | ||
4 | The Greatest of All Time | 1,153,526 | This Kollywood science fiction action film, starring Thalapathy in his 68th film in a lead role, was released last week. Earlier this year, the actor announced his political entry, making this movie his penultimate work, so his fans celebrated it worldwide. The film debuted in second place at the worldwide box office, behind only #7, on its opening weekend. It also became the second Tamil film to enter the global weekend chart of Comscore, the other being the same actor's previous film. The film grossed ₹350 crore (US$42 million) worldwide in its opening week, against a budget of ₹300−400 crore, becoming the highest-grossing Tamil film of 2024 and the fourth highest-grossing Indian film of 2024. Thalapathy's next and last film has now been announced, and the fans are already preparing for the One Last Dance! | ||
5 | September 11 attacks | 1,132,328 | 23 years have passed since this tragedy took place in three different US locations, where many lost their loved ones due to a terrorist attack masterminded by Al-Qaeda. For those interested, it also marks 23 years since the release of the criminally underrated soundtrack from the famously bad movie by this writer's favourite chanteuse, Mariah Carey. The picture on the left summarizes all of it. | ||
6 | The Perfect Couple (TV series) | 1,010,108 | This American mystery drama series, which is an adaptation of the 2018 novel of the same name by Elin Hilderbrand and includes Nicole Kidman (pictured) in its cast, premiered on Netflix last week. | ||
7 | Beetlejuice Beetlejuice | 1,009,928 | 36 years later, director Tim Burton and actors Michael Keaton, Winona Ryder and Catherine O'Hara returned to a small Connecticut city haunted by weird ghosts. Filled with Burton's signature aesthetics and many funny moments, including a musical number featuring one of the corniest songs ever written, this movie was warmly received and had an impressive $111 million opening in North America alone, which was already enough to cover its production budget. | ||
8 | Deaths in 2024 | 999,325 | Given one of the quotes from #1 (better use this song now than when the unnecessary prequel arrives): 'Til we find our place On the path unwinding In the circle The circle of life | ||
9 | Beetlejuice | 881,184 | The death list interrupted, but with three mentions of his name, "it's showtime"! Fresh off Pee-wee's Big Adventure, in 1988 Tim Burton directed the story of a recently deceased couple trying to scare away two yuppies (and their goth daughter) from their old house, at a certain point bringing in the hyperactive undead maniac the movie is named after. The Geffen Company tried to get a sequel shortly after the movie, with Burton himself thinking about doubling the insanity in Beetlejuice Goes Hawaiian, but Betelgeuse and Lydia Deetz only had a cartoon and a musical adaptation before the proper follow-up listed above. | ||
10 | Rebel Ridge | 679,584 | This film, starring Aaron Pierre as a former Marine and Don Johnson as the chief of a corrupt police force, was released by Netflix on September 6 to positive reviews. |
For the August 16 – September 16 period, per this database report.
Title | Revisions | Notes |
---|---|---|
Deaths in 2024 | 2083 | Along with the ones mentioned above and a few others from the last Traffic Report, other notable deceased of the period included Phil Donahue, Sid Eudy, Rebecca Cheptegei and Sérgio Mendes. |
Andrew Jackson and the slave trade in the United States | 1404 | Back in June, user Jengod split this page, which covers the seventh U.S. President Andrew Jackson's involvement in slave trading, off from the list of American slave traders, and has been working on it extensively ever since. |
India at the 2024 Summer Paralympics | 1392 | For all of India's underperformances at the Olympics, their para-athletes have had impressive showings at the Paralympic Games. Just this year in Paris, the country managed to win 29 medals, which is the same number they have won in the last 52 years of the Olympics! These include seven golds: four in athletics and one each in shooting, archery and badminton. |
List of Kamala Harris 2024 presidential campaign endorsements | 1259 | A laundry list of supporters for the current VP, including Republicans who dislike their own candidate. |
Great Britain at the 2024 Summer Paralympics | 1149 | Just as Greece created the Olympics, Great Britain is the origin of the Paralympics. They're the second most successful nation at the games, behind (no surprise) the United States, and Paris 2024 had them in second place in the medal table, behind only China, with 124 medals, 49 of them gold. |
2024 Pacific typhoon season | 1100 | Wikipedia's branch that edits cyclone pages is a very dedicated bunch, as reflected in over 200 Featured Articles. Updates came for the storms formed in the Pacific, the strongest of which was Typhoon Yagi, which hit Southeast Asia and South China in early September. |
Typhoon Yagi | 1060 | |
Electoral fraud in the United States | 1012 | The upcoming election makes this a relevant topic. The article even notes that the current Republican candidate claimed fraud both as the reason he lost the popular vote in 2016 to Hillary Clinton and as the reason he was outright defeated by Joe Biden four years later (he unsuccessfully sued, his party tried to change how the elections work, and his fans went too far in refusing defeat). |
Brazil at the 2024 Summer Paralympics | 910 | A relatively recent emerging Olympic power that has won at least 10 medals in all editions since 1996, Brazil has also been a proven Paralympic power since 1984, ranking 16th in the all-time Paralympic Games medal table (albeit one of the teams above them is the now defunct state of West Germany), probably a reflection of comprehensive publicly funded health care. Paris 2024 had them rank fifth in the medal table with 89 medals, 25 of them gold. |
List of Canadian hip hop musicians | 874 | Most people probably only know Drake, but one user is doing his best to extend this list. |
2024 US Open – Men's singles | 871 | In both genders, this year's tennis Grand Slams played on hard courts were won by the same player: after Aryna Sabalenka got the Australian Open-US Open double, it was time for Jannik Sinner to do the same, overcoming his adversary Taylor Fritz despite him having the support of the Flushing Meadows crowd. A notable fact is that aside from Sinner, the other three semifinalists - Fritz, Frances Tiafoe, and Jack Draper - were all reaching that phase of a Grand Slam for the first time. |
2024 Democratic National Convention | 809 | The 2024 DNC was held in Chicago from August 19 to 22, a loud, boisterous convention with lots and lots of speeches. It's also where delegates selected the presidential nominee. Kamala Harris, the current Vice-President who was literally the only candidate, was selected. |
2024 Summer Paralympics medal table | 757 | A few weeks after the Summer Olympics, it was time for athletes with disabilities to get the spotlight, in the same host city of Paris. Aside from the United States not being as dominant - they were third, behind China and Great Britain - a big difference for this year's Paralympics compared to the Olympics was the presence of nearly three times more Russians and Belarusians competing for no flag (they would have ranked fifth if not for the neutrality excluding them from the medal table, thus officially that place belongs to the aforementioned Brazil). |
2024 Ligas Departamentales del Perú | 745 | From para-athletes to regular ones, as one user made lots of updates on Peru's fifth division football tournament. |
Black Myth: Wukong | 742 | A video game released on August 20 and based on the 16th century Chinese fantasy novel Journey to the West, a classic in Chinese literature. There was much hype surrounding its release (it topped Steam's best-selling chart and smashed the same platform's record for highest concurrent player count, a record previously held by Elden Ring), and it was released to a positive reception. The game gained notoriety when its developer, Game Science, was accused of being sexist by IGN in November 2023, and gained even more when streamers were instructed to not mention "feminist propaganda", "quarantine", "COVID-19", and Chinese politics, in order to comply with Chinese law. If the Chinese government doesn't appreciate things said about them, they can take you away like they did with Jack Ma. |