Nominations for four new seats on the Funds Dissemination Committee (FDC) have just closed. The FDC is a volunteer WMF body that strongly influences how a high proportion of donors' funds are spent. The Committee was created in 2012 after the Foundation Board stopped all but a few chapters from directly processing donations raised on its behalf. The move brought the review and funding of most "eligible" affiliates under more centralised community review, and the FDC has since been the primary instrument for scrutinising their applications for recurrent operating expenses, and recommending to the Board who should get what. Thus far the Board has accepted all of the FDC's recommendations from the twice-yearly application rounds.
The FDC has been no stranger to controversy. In late 2015, with the Foundation in turmoil over issues of governance, funding, and leadership, the FDC stepped outside its official mandate to publish a scathing critique of the WMF’s performance. This step was well received: the WMF accepted and acted on the feedback (video 14:27), and invited the FDC’s comments in the subsequent funding round.
The FDC's two-year memberships have been designed around a leap-frogging process of yearly alternating Board appointments of four volunteer voting members and community elections of five volunteer voting members. In addition, the Board maintains a close relationship with the FDC through the appointment of two of its own members as non-voting FDC members (currently ex-chair of the FDC, Dariusz Jemielniak and Guy Kawasaki), and the participation of three non-voting staff members in FDC processes (Katy Love, Winifred Olliff, and Delphine Ménard, herself a former voting member of the FDC).
The current round is for the four Board appointments, and has attracted 13 self-nominees. Nominations closed just over a week ago, after which there was a short public Q&A. FDC staff will now confer with the two FDC board representatives to draw up a shortlist by 5 August; a decision on the final four candidates will be finalised in consultation with the Board, and announced 2 September. In the previous round of Board appointments, in 2014, the factors weighed up by the Board were reported by a Trustee as: "solid Wikimedia contributions (online or offline); complementary non-Wikimedia background, some in finance or budgeting or program evaluation/review (not all though and that's not a must); grantmaking/reviewing experience in Wikimedia; chapter leadership/exec experience or non-chapter contributor; geographical, age, language, wiki diversity; reasoned, analytical responses to the Q&A on Meta and during the interviews."
The FDC's charter explicitly requires membership diversity; in practice, this has clearly been a stumbling block, probably for a complex set of reasons that are difficult to resolve. Of the five ongoing voting members (recommended last year via community election), two are native-speakers of English, all are males, and all are from the global north; of the four whose terms are about to become vacant, two are native-speakers of English, three are males, and three are from the global north (depending on how the south–north boundary is defined). Among the 13 nominees, six declare themselves to be native-speakers of English and nine as male, with roughly eight from the global north.
Questions from Wikimedia community members involved the duties of Wikimedia affiliates regarding paid editing; depth of experience in evaluating grants; diversity (specifically, geographic and gender); the need for innovation, its tension with evaluating success by standard measures, and the value in sharing stories of successes and failures; the nature of what FDC money should fund; and its relation to the committee’s volunteer community.
Two candidates came to Wikimedia through high-profile roles with the WMF (which they have since left): Garfield Byrd as chief financial officer, and Bishakha Datta as trustee. Several have extensive experience in Wikimedia governance, including past service on the FDC itself. Several women, and several candidates from the global south, could increase the diversity of perspectives on the FDC. Some have very little experience with editing wikis, and point to their professional backgrounds for their main qualifications. Along with this elaborate matrix of backgrounds, the staff and Board will need to factor in the relevant expertise and experience required by a task that at certain times during the year will require a full-time effort by members. Former chair of the FDC and now member of the WMF Board, Dariusz Jemielniak, posed a significant question to all candidates on the Q and A page:
“... the FDC requires a lot of past experience in evaluating grants (not just writing grant proposals, which also is a must, but of having had a chance to read and compare 100+ applications for money), or extensive professional background in management, strategy, finance, or auditing.”
Jemielniak asked for nominees' attitudes to this proposition, and how they saw their background in relation to it. There was surprising variety in the responses, revealing something of a clash of cultures between valuing on-the-ground programmatic experience and professional, technocratic expertise, although some nominees emphasised the need for both dimensions to be represented on the Committee. One answer asserted a strongly different view of the importance of grantmaking experience:
“Some of the best grantmakers I've worked with in foundations had none of these skills. But they had other skills needed to make good grants. They had domain knowledge or expertise. They had a vision.”
The other answers echoed one or both of these positions, reflecting a range of views of the relative merits of grantmaking experience and programmatic experience.
The Board and FDC staff face a range of competing needs in their judgment of the nominations. Not only are there issues of diversity and grantmaking-related professional skills; there is the need to prepare the FDC, and grantmaking more broadly, to grapple with deeper issues over time. Among these are the inherent difficulty of predicting and measuring the impact, or value for money, of programmatic activities on WMF sites and their readers; and the likelihood that we are entering a period in which the model for fundraising is under challenge. – T and P
With regret, the Signpost passes on the news that Geoff Brigham finished up on 18 July as general counsel and secretary to the Foundation, after five years of service. Geoff, who came to the WMF from a very senior role at eBay, posted a message to the Wikimedia mailing list expressing his love for "the mission, the Foundation, the Wikimedia communities, and my colleagues at work ... I stand in awe of the volunteer writers, editors, and photographers who contribute every day to the Wikimedia projects. The future of the Foundation under Katherine's leadership is exciting."
Executive director Katherine Maher replied to Geoff:
“You’ve seen the Foundation through a remarkable five years. You’ve built a tremendous team that is critical to helping the Wikimedia projects thrive well into the future. You’ve expertly navigated our challenges, focusing our efforts where we can have the most impact. Through your team, you've empowered the Foundation as fierce advocates for open licensing, privacy, freedom of information, and contributors rights, truly embodying the values of our movement. And as a colleague, you’ve been a counselor and voice of wisdom for our executive team and Board of Trustees.”
Michelle Paulson will be interim head of legal, and Stephen LaPorte will be interim secretary to the Board (pending Board approval). Geoff will take up the position of director of YouTube Trust & Safety, managing global teams for policy, legal, and anti-abuse operations. We wish him well. – T
The Arbitration Committee (ArbCom) recently decided to implement a new type of restriction for pages on certain topics with intractable and long-running disputes, such as the Gamergate controversy. It barred editing by anonymous (IP) users and by registered editors who have not reached both 30 days' tenure and 500 edits.
Initially, a series of edit filters enforced the restriction. In January 2016, an editor proposed a new protection level called extended confirmed protection ("ECP" or "30/500", for short) with the same function. Although the proposal received some complaints regarding the instruction creep it presented to new editors, it was eventually approved and technically implemented, with editors being granted the "extendedconfirmed" user right after reaching the requirement. ECP was rolled out on April 5, with ArbCom passing a motion allowing administrators to use ECP to prevent sockpuppetry when less restrictive protection fails to work.
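The 30/500 threshold described above can be read as a simple two-part test. The following is a minimal sketch of that rule only; the `Editor` record and the function name are hypothetical, chosen for illustration, and do not reflect how MediaWiki actually stores accounts or grants the "extendedconfirmed" right.

```python
from dataclasses import dataclass

# Thresholds taken from the "30/500" rule described in the text.
MIN_ACCOUNT_AGE_DAYS = 30
MIN_EDIT_COUNT = 500


@dataclass
class Editor:
    """Hypothetical record for illustration; not a MediaWiki data structure."""
    name: str
    account_age_days: int  # 0 for anonymous (IP) editors
    edit_count: int
    is_registered: bool = True


def meets_30_500(editor: Editor) -> bool:
    """Return True only if the editor satisfies both halves of the rule."""
    return (
        editor.is_registered
        and editor.account_age_days >= MIN_ACCOUNT_AGE_DAYS
        and editor.edit_count >= MIN_EDIT_COUNT
    )


veteran = Editor("Veteran", account_age_days=400, edit_count=12000)
prolific_newcomer = Editor("Newcomer", account_age_days=10, edit_count=900)
anonymous = Editor("192.0.2.1", account_age_days=0, edit_count=0, is_registered=False)

print(meets_30_500(veteran))            # True
print(meets_30_500(prolific_newcomer))  # False: account younger than 30 days
print(meets_30_500(anonymous))          # False: not a registered account
```

The point of the sketch is that the two conditions are joined by "and": a prolific but very new account, or an old but rarely used one, would still be unable to edit a page under ECP.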
Since then, ECP has occasionally been applied outside its ArbCom-mandated uses: without raising many eyebrows, it was used for other reasons, such as to prevent BLP violations. Within three months, an administrator opened an RfC proposing that administrators be allowed, with community scrutiny, to use ECP for any purpose, not just for ArbCom enforcement and sockpuppetry. The RfC gave editors three options.
The RfC has received a wide range of inputs, with most non-administrators and administrators supporting the third option, and some non-administrators and a few administrators supporting the first and second options. Proponents of the third option believe ECP would be valuable in stopping disruption, while its opponents believe that it would deter newcomers and disenfranchise occasional editors.
Genetically modified organisms (GMOs) have been a controversial topic for years on Wikipedia, and one with a less than peaceful environment: a number of editors have been sanctioned by ArbCom for poor decorum in GMO discussion, and "discretionary sanctions" have been implemented to stabilize GMO articles.
Wikipedia's coverage of the safety of GM foods in particular has been a source of conflict. Many editors believed the then-current wording on GMO safety was inadequate and provided little context:
There is a general scientific agreement that food from genetically modified crops is not inherently riskier to human health than conventional food, but should be tested on a case-by-case basis. No reports of ill effects have been proven in the human population from ingesting GM food. Although labeling of GMO products in the marketplace is required in many countries, it is not required in the United States and no distinction between marketed GMO and non-GMO foods is recognized by the US FDA. In a May 2014 article in The Economist it was argued that, while GM foods could potentially help feed 842 million malnourished people globally, laws such as those being considered by Vermont's governor, Peter Shumlin, to require labeling of foods containing genetically modified ingredients, could have the unintended consequence of interrupting the process of spreading GM technologies to impoverished countries that suffer with food security problems.
— Pre-RfC version of second paragraph of Genetically modified organism#Controversy.
To help settle the question, an RfC to change the wording was opened. It was moderated under tight conditions, with strict word limits and behavioral restrictions; there were 22 proposals, and nearly 90 editors participated. After one month of discussion, the RfC was closed on July 7, and the first proposal prevailed:
There is a scientific consensus that currently available food derived from GM crops poses no greater risk to human health than conventional food, but that each GM food needs to be tested on a case-by-case basis before introduction. Nonetheless, members of the public are much less likely than scientists to perceive GM foods as safe. The legal and regulatory status of GM foods varies by country, with some nations banning or restricting them, and others permitting them with widely differing degrees of regulation.
— Proposal 1, Wikipedia:Requests for comment/Genetically modified organisms
GMO articles faced a less-than-smooth transition afterwards, as several editors debated the best way to include the new language and replace the old. In the first few days after the RfC was closed, additional text was deleted and replaced while some editors debated whether to change language immediately before and after the RfC-mandated language. Approximately a week later, those disagreements had calmed down.
The Hindu reported on an edit-a-thon on Indian women scientists held on July 16 in Bangalore. Its pre-event article noted that only about 40 women scientists from the country currently have Wikipedia entries, and many of those are incomplete or lack citations.
The paper's followup article reported that about 25 editors participated in the event, creating and updating articles on prominent women scientists in the country. Sandhya Srikant Visweswariah, chair of the Department of Molecular Reproduction, Development and Genetics at the Indian Institute of Science, was among the subjects tackled. One participant noted, however, that "lack of citations online made it hard to validate entries for many women scientists from the country". This, of course, is a persistent concern, as discussed in part in The Atlantic last month. Having content online leads to the production of more content. Creating new material from non-online content – and being able to use that content to defend Wikipedia's processes of validating content and assessing notability – is a much bigger task although also an essential one.--Milo
Cracked.com featured a critical piece on Wikipedia as "shockingly biased", with input from current administrator Crisco 1492. The piece falls squarely in the sweet-spot of modern criticism of any website: (1) it comes from a website that loves Wikipedia; (2) its readers love Wikipedia and use it all the time despite its faults; and thus (3) they will read any article that raises "shocking" concerns about Wikipedia. And though the items discussed are mostly old-hat to Wikipedia editors (not to discount their importance), such articles are usually popular. This one has already received over 350,000 views and 450 comments.
The topic areas discussed in the article include three common complaints: (1) the lack of diversity in contributors and content, such as the gender gap and systemic biases (see The Hindu edit-a-thon discussed above), and the focus of some editors on niche content areas; (2) the ever-present problem of vandalism, but particularly the feedback loop where inaccuracies are cited in the press – "like a game of telephone, only at the end of the game, the garbled nonsense gets published in a newspaper"; and (3) petty arguments among editors, though this discussion also ends in more discussion of vandalism, such as those quixotic editors who like to change heights and weights.
The article also cites the Wikipediocracy website as one "dedicated to destroying Wikipedia", though such a threat does not seem as existential when described as "less like a public service and more like a bunch of Mensa wannabes trying to high five, only to awkwardly smack each other in the nose". Lastly, the piece concludes that "Wikipedia is dying", citing statistics about declining numbers of "very active" editors and the lack of sufficient administrators.
All of these concerns have degrees of validity, and though not precisely news, the continuing focus on them is no doubt important in finding solutions. When high-profile articles stop being written about Wikipedia's flaws, that would suggest irrelevance, which is a much surer sign of decline. No one complains about the functionality or value of Myspace anymore.--Milo
Seven featured articles were promoted these weeks.
Five featured lists were promoted these weeks.
One featured topic was promoted these weeks.
Fourteen featured pictures were promoted these weeks.
Your Traffic Reports for the weeks of June 26 – July 2, and July 3–9, 2016:
The dominant topic in Wikipedia traffic in the week of June 26 to July 2 was sports, and more particularly football, with UEFA Euro 2016 in the top spot for a third straight week. The improbable victory of Iceland's team (#22 in the WP:TOP25) over England in UEFA Euro 2016 put that country's article at #5. Lionel Messi's (#4) defeat at the Copa América (#12) final, and his subsequent retirement announcement, was also big news. In other news, the hangover from Brexit (#25) kept the European Union (#9) in the top ten for a second week, and put Boris Johnson (#13) and Theresa May (#17) on the Top 25 as well. Game of Thrones also merits a mention, taking slots #2 and #3, with its season finale episode article at #18.
In the week of July 3–9, sports dominated again, with the traditional return of Wimbledon joining the lead-up to the UEFA Euro 2016 final, the latest UFC event, and an unexpected team change for an NBA superstar. But it was a sport of an entirely modern kind, Pokémon Go, that led the pack, and before you ask, yes, Pokémon is an esport. Traditional summer distractions such as movies and television round out the list, with the inclusion of politicians Donald Trump and Andrea Leadsom after the Top 10 to remind us (barely) of the real world.
For the full top-25 lists (and our archives back to January 2013), see WP:TOP25. See this section for an explanation of any exclusions. For a list of the most edited articles every week, see WP:MOSTEDITED. For the most popular articles that ORES models predict are low quality, see WP:POPULARLOWQUALITY.
For the week of June 26 to July 2, the ten most popular articles on Wikipedia, as determined from the WP:5000, were:
Rank | Article | Views | Notes |
---|---|---|---|
1 | UEFA Euro 2016 | 1,590,000 | A third straight week in the top spot, though with less than half as many views as last week. The Round of 16 commenced on June 25, and the quarter-final rounds were underway when this week's report closed July 2. The final four teams were Portugal, Wales(!), Germany, and France, with the next match on July 6. |
2 | Game of Thrones | 1,126,688 | Last week the Season 6 article was #6, while this general series article was #16 (with 730K views). Why the switch this week? No doubt it is because the season finale on June 26 (The Winds of Winter) (#18) caused more mainstream press coverage, prompting more people unfamiliar with the show to look it up on Wikipedia to see what they were missing. |
3 | Game of Thrones (season 6) | 1,103,448 | See #2. Numbers up slightly from last week. |
4 | Lionel Messi | 1,060,930 | Up from #21 and 564K views last week. The Argentine forward and "best footballer on the planet"™ faced Chile in the Copa América Centenario final on June 26, and lost on penalty kicks after a 0–0 draw. The 29-year-old Messi announced his retirement after the game. |
5 | Iceland | 784,708 | Views spiked on the northern island country's article on June 27 and 28. On June 27, Iceland defeated England 2–1 in their UEFA Euro 2016 Round of 16 match. But alas, Iceland fell to France on July 3 and did not make the semi-finals. |
6 | Pat Summitt | 764,584 | The longtime head coach of the Tennessee Lady Volunteers basketball team, who won a record 1,098 games in her tenure, died at age 64. She retired from coaching in 2012 after being diagnosed with early-onset Alzheimer's disease. Her public openness about her condition was widely admired and helped raise awareness of the disease and its impact. |
7 | Battle of the Somme | 757,121 | The 100th anniversary of the commencement of this First World War battle fell on July 1. The battle was intended to hasten a victory for the Allies and was the largest battle of the First World War on the Western Front. More than one million men were wounded or killed, making it one of the bloodiest battles in history. |
8 | Independence Day: Resurgence | 755,170 | The 20-years-later sequel to Independence Day premiered in the United States on June 24. As of July 4, its worldwide gross is $252 million; the film had a budget of $165 million. It has received mostly negative reviews, though I fully intend to see it. Down slightly from 810K views last week. |
9 | European Union | 736,104 | Views previously spiked on June 24 due to the Brexit vote, but traffic remained high (though declining each day) for this entire week as the aftermath of the vote began to be digested. Down from #3 and 1.97 million views last week. |
10 | Jesse Williams (actor) | 720,299 | At the BET Awards on June 26, this actor won a humanitarian award, and delivered a speech highlighting racial injustice, police brutality, and cultural appropriation, which drew press attention far beyond anything the BET Awards normally gets. (BET is an acronym for Black Entertainment Television, the most prominent television network targeting African American audiences.) |
For the week of July 3–9, the ten most popular articles on Wikipedia, as determined from the WP:5000, were:
Rank | Article | Views | Notes |
---|---|---|---|
1 | Pokémon Go | 1,371,390 | For most people born before the Clinton administration, Pokémon is about as comprehensible as the religious customs of some lost Pacific island or the codes and shibboleths of an ancient secret society. Which, by the way, is exactly why your kids love it so much. It's too complicated to explain quickly but the latest iteration exploded into the public mind almost overnight (it currently has more users than Tinder in the US, despite only being in release for 5 days) due to its unique, and perhaps uniquely dangerous, gameplay. Thanks to the wonders of augmented reality, Google Maps and GPS, a real-time scavenger hunt has morphed with a video game; it's everywhere you go. Hold up your iPhone to a tree, there's a Pokémon sitting in a crook, waiting to be captured and sent to the death ring, er, I mean "gym". Look down on the pavement, and there's a cute Pokémon staring up at you. And hey look! There's one swimming in that deceptively close and surprisingly deep pond! And there's one across that very busy street! Yes, Pokémon Go-related accidents have already happened, as have muggings, since the game alerts any other players to your current location. Thankfully none of this has proven fatal, though it's only a matter of time before a health official is forced to remind the general public that real people do not get extra lives. |
2 | Sultan (2016 film) | 1,152,393 | One big difference between Hollywood and Bollywood is that in Bollywood, stars still matter. And Salman Khan rules the roost right now. His last big film, Bajrangi Bhaijaan, dominated Eid al-Fitr weekend and went on to make nearly $100 million. And now he's done it again. His latest, a wrestling drama, was also released on Eid and has taken in nearly ₹1.96 billion ($29 million) in its first six days. |
3 | Independence Day (United States) | 1,142,261 | This is the fourth US Independence Day since we started this list, which means it's time to look for patterns, and one that stands out is that while this article's numbers keep climbing year upon year, it has never been the #1 article for its week. Some have speculated that Americans already know enough about their founding holiday and don't need to look it up. |
4 | UEFA Euro 2016 | 988,687 | Numbers are down slightly for the quarter- and semi-finals, which saw the darlings of the tournament (Wales and Iceland) predictably knocked out by France and Portugal. This list's timeframe ends before the 10 July final so expect numbers to shoot up again next week. |
5 | Juno (spacecraft) | 960,161 | Not all NASA missions need to be glamorous; this one, which began a slow, winding descent towards Jupiter on 4 July, won't be gracing us with grand vistas of the jewels of the Jovian realm. No: this one is hardcore, pick-to-the-cliff science. Have you ever seen a cutaway image of the inside of a gas giant? Well if not, here's one. Thing is, up until now, it's basically educated guesswork. We don't have any hard evidence of what's under those clouds. But we will, thanks to Juno, which will get the info by mapping Jupiter's gravitational field. But to do so, it has to get close. Real close. As in, close enough to be fried by Jupiter's 12,000-Chernobyls-per-second radiation belts. Needless to say, it's a tough little bugger, but its creators don't expect it to be producing useful science for more than 18 months before it's toast. |
6 | Nettie Stevens | 896,719 | This pioneering geneticist and discoverer of the XY sex-determination system got a Google Doodle on her 155th birthday on 7 July. |
7 | UFC 200 | 872,178 | The latest in the mixed martial arts tournament series was held at the T-Mobile Arena in Las Vegas on 9 July. Headliner Amanda Nunes defeated Miesha Tate in the first round. |
8 | Serena Williams | 857,452 | The world women's number 1 tennis champion set yet another record on 9 July when she beat Angelique Kerber in straight sets to clinch her 22nd major singles title at her natural home, Wimbledon. Two more titles and she will equal Margaret Court's career record. |
9 | Antoine Griezmann | 849,627 | Olivier Giroud may have scored two goals for France in the Euro 2016 quarter-final, but it was Griezmann who scored the most goals in the tournament. |
10 | Kevin Durant | 707,764 | The seven-time NBA All-Star signed with the Western Conference champion Golden State Warriors this week for a reported two-year, $54 million contract. |
On 10 June, arbitrator clerk L235 posted an announcement that the clerks were looking for script writers who "will work with the clerk team to automate portions of the clerks' procedures." These procedures include, but are not limited to, vetting new requests, opening and managing open cases and miscellaneous tasks such as arbitrator retirements. On 7 July, L235 announced that Fred Gandt and Σ would be appointed as the script developers. Best of luck to both on future outings.
If any editor is interested in assisting, you can contact the clerks at clerks-l@lists.wikimedia.org.
A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
A short paper presented at the Joint Conference on Digital Libraries titled "Quality Assessment of Wikipedia Articles Without Feature Engineering"[1] uses deep learning to predict the quality of articles in the English Wikipedia. As the paper's title suggests, previous research on article quality has used a specific set of features to represent the articles, whereas the promise of deep learning is that the machine learner will determine the best representation on its own.
Some representation of the articles still needs to be chosen, and the paper uses "Doc2Vec", an extension of Word2vec that uses unsupervised machine learning to learn vector representations of the articles. A benefit of this approach is that it is language neutral, whereas other approaches might utilize features that are language-specific. These vectors are learned from a training set based on the Wikimedia Foundation's dataset of 30,000 English articles. A deep neural network using Google’s TensorFlow library is then trained using these vectors with the aim of predicting which of the English Wikipedia’s assessment classes an article belongs to.
The performance of the classifier is compared to the current state of the art, which at the time of writing is the WMF's own Objective Revision Evaluation Service (ORES) (disclaimer: the reviewer is the primary author of the research upon which ORES' article quality classifier is built). Since the number of articles in each class is fairly balanced, the proportion of correctly classified instances (accuracy) is used as the performance measure. ORES is reported to be 60% accurate (it currently reports 61.9% accuracy), and the deep neural network was found to be 55% accurate. As pointed out in the paper, this work is a first step towards using deep learning for this task, meaning that slightly lower performance is acceptable. The authors describe a couple of changes that will most likely improve the classifier and aim to do so in future work. Deep learning is an area where interesting things are happening, and if it can be used to improve our ability to automatically assess Wikipedia articles, a capability that is already useful to many Wikipedians through tools like WikiProject X and SuggestBot, that is only for the better!
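The accuracy measure used in this comparison is simply the share of articles whose predicted class matches the assessed class. A minimal sketch follows; the ten labels are made-up toy data, not figures from the paper or from ORES, and the class names are assumed to follow the English Wikipedia's Stub-to-FA scale.

```python
from collections import Counter


def accuracy(true_labels, predicted_labels):
    """Proportion of instances whose predicted class equals the true class."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    if not true_labels:
        raise ValueError("at least one instance is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Toy data for illustration only (assumed class names, invented predictions).
assessed = ["Stub", "Start", "C", "B", "GA", "FA", "Stub", "Start", "C", "B"]
predicted = ["Stub", "Start", "B", "B", "GA", "GA", "Stub", "C", "C", "Start"]

# Accuracy is only a fair summary when classes are roughly balanced,
# so it is worth looking at the class counts first.
print(Counter(assessed))
print(accuracy(assessed, predicted))  # 0.6
```

With heavily skewed classes, a classifier that always guesses the largest class could score well on this measure, which is why the balance of the dataset matters to the choice of metric.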
Dr. Tsung-Ho Liang (梁宗賀)[supp 1] is a systems analyst in the information center at the Tainan City Government's Bureau of Education. He currently studies big data in education, especially dealing with unstructured data and natural language processing techniques. In 2013, he started a project to integrate the contents of Chinese Wikipedia with the Chinese Knowledge and Information Processing (CKIP) technology and established a new search engine for Chinese Wikipedia, WikiSeeker (維基嬉客).[supp 2]
WikiSeeker is a tailor-made search system based on the Wikipedia corpus. It aims to improve search effectiveness by answering students' queries in Chinese with structured association graphs of related Wikipedia articles. First, it produces a knowledge map with clear relationships among the fields of knowledge, so students can easily identify the most important keywords in the content. Second, the search bar of WikiSeeker accepts natural language queries, so students do not have to type keywords. You can see a tour of WikiSeeker on YouTube.
The above two features make WikiSeeker intuitive and easy to use for K-12 students. The research essay "WikiSeeker─The Study of the Impact of a Search System with Structured Association Graphs on Learning Effectiveness" [2] by the researcher Sheng-Nan Cheng (鄭盛南) compared two experimental groups: one in which students used Chinese Wikipedia directly to answer questions, and another in which students used the WikiSeeker website to answer the same questions. The results showed that the students who used WikiSeeker were 10.8% more correct in their answers (on average, 15.8 out of 19 questions, compared to 13.73 out of 19). Moreover, girls and middle-achieving students showed the greatest learning improvement when using WikiSeeker. The essay concludes that WikiSeeker is a suitable way for students to acquire knowledge from Chinese Wikipedia.
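The reported 10.8% is easiest to read as a difference in the share of the 19 questions answered correctly, rather than as a relative improvement. The short calculation below illustrates the two readings, assuming the higher mean (15.8) belongs to the WikiSeeker group; the means are the rounded figures quoted above, so the result only approximates the published number.

```python
TOTAL_QUESTIONS = 19
MEAN_CORRECT_WIKIPEDIA = 13.73   # rounded mean quoted in the text
MEAN_CORRECT_WIKISEEKER = 15.8   # rounded mean quoted in the text (assumed group)


def percentage_point_gain(baseline, treated, total):
    """Difference in the share of questions answered correctly, in points."""
    return (treated - baseline) / total * 100


def relative_gain(baseline, treated):
    """Improvement expressed as a percentage of the baseline score."""
    return (treated - baseline) / baseline * 100


points = percentage_point_gain(
    MEAN_CORRECT_WIKIPEDIA, MEAN_CORRECT_WIKISEEKER, TOTAL_QUESTIONS
)
relative = relative_gain(MEAN_CORRECT_WIKIPEDIA, MEAN_CORRECT_WIKISEEKER)

print(round(points, 1))    # about 10.9 percentage points
print(round(relative, 1))  # about 15.1 percent relative to the baseline
```

The first figure is close to the reported 10.8%, with the small gap plausibly due to rounding of the means; the second shows how different the number would look if the gain were stated relative to the Wikipedia group's score.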
Sentiment analysis - the automated extraction of subjective information expressed in text - has been applied to Wikipedia research in several recent papers.
Four researchers from Stanford University analyzed[3] all (non-neutral) votes in the English Wikipedia's request for adminship process cast from its inception in 2003 until 2013. These form a directed, signed graph with around 11,000 nodes (users) and 160,000 edges (votes). They removed the actual vote text ("support" and "oppose") and tried to reconstruct the vote by applying sentiment analysis to the remaining comment text (where e.g. "I’ve no concerns, will make an excellent addition to the admin corps" indicates a positive vote). The performance of the resulting prediction model is described as "remarkably high, [...] as a consequence of the highly indicative, sometimes even formulaic, language used in the comments". It performed much better than a model trying to predict votes based on network characteristics alone (patterns of other support/oppose votes, using e.g. ideas from balance theory like "an enemy of my enemy is my friend").
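The network-only baseline mentioned above relies on ideas such as "an enemy of my enemy is my friend". As a rough illustration of that idea, and not the model used in the paper, the sketch below guesses the sign of a missing vote by multiplying the signs along every two-step path between voter and candidate. The function name and the toy votes are hypothetical.

```python
from typing import Dict, Tuple

Vote = Tuple[str, str]  # (voter, candidate); value is +1 support or -1 oppose


def predict_vote_sign(votes: Dict[Vote, int], voter: str, candidate: str) -> int:
    """Guess a vote's sign from two-step paths voter -> middle -> candidate.

    Each path contributes the product of its two signs, so two opposes
    ("enemy of my enemy") count as evidence for a support vote.
    Returns +1, -1, or 0 when the evidence is absent or tied.
    """
    score = 0
    for (source, middle), first_sign in votes.items():
        if source != voter:
            continue
        second_sign = votes.get((middle, candidate))
        if second_sign is not None:
            score += first_sign * second_sign
    return (score > 0) - (score < 0)


# Toy signed, directed graph for illustration only.
toy_votes = {
    ("A", "B"): -1, ("B", "C"): -1,  # A opposed B, B opposed C
    ("A", "D"): +1, ("D", "C"): +1,  # A supported D, D supported C
    ("A", "E"): +1, ("E", "F"): -1,  # A supported E, E opposed F
}

print(predict_vote_sign(toy_votes, "A", "C"))  # 1
print(predict_vote_sign(toy_votes, "A", "F"))  # -1
print(predict_vote_sign(toy_votes, "B", "F"))  # 0
```

A heuristic like this uses no comment text at all, which helps explain why a model that reads the often formulaic wording of the comments can do much better.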
Is the editing frequency of Wikipedians influenced by negative or positive comments they receive on their user talk pages?
A student course project at the same university[4] tried to examine this question by analyzing the user talk pages of all users (around 620,000) who signed up in 2013 and made at least one article edit on the English Wikipedia, together with "thanks" messages received via the new software feature introduced during that year. They related this data to the number of article edits per week. The authors report that "while we found some predictive value for future behavior in the sentimental content of messages received by Wikipedia editors, we do not have evidence to establish a causal relationship between these variables... we were able to detect macro-level patterns of behavior that appear to discredit the hypothesis that the sentimental content of user talk pages is a main driver of user churn on Wikipedia". As a limitation of their application of sentiment analysis in this situation, they note that "Most messages exchanged through user talk pages are not sentimentally-loaded, but rather talk about the Wikipedia guidelines and policies in a neutral manner", calling for the use of more sophisticated natural language processing techniques.
These results are somewhat in contrast to those of a paper titled "The Impact of Sentiment-driven Feedback on Knowledge Reuse in Online Communities",[5] which investigated "whether affective communication [...] in form of sentiment-driven feedback in discussions between Wikipedia editors motivates collaborative work", by analyzing a complete history dump of the Simple English Wikipedia (until 2011). The researchers focus on the "knowledge reuse" aspect of this collaborative work, quantified for "any two consecutive revisions of the same article page as the ratio of the number of words reused from the previous revision (e.g., copied, moved elsewhere, or restored) to the number of words newly created in the current revision." By relating the positivity or negativity of article talk page comments to editing activity in the article itself, the authors found that:
Besides observing that public positive feedback may have a positive effect on editor motivation, they also note that "non-public negative peer feedback could increase one’s likelihood to engage in online social production by correcting inherent problems, behaviors, and attitudes in private peer conversations, which also strongly suggests that mechanisms for providing non-public negative feedback should be designed, incorporated, and tested in collaborative platforms such as wikis."
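The "knowledge reuse" ratio quoted above compares words carried over from the previous revision with words newly created in the current one. A minimal sketch of such a ratio is given below, under a simplifying bag-of-words assumption: a word counts as reused if the previous revision contained it at least as many times. The paper's own method tracks copied, moved, and restored text and may well differ; the function name is hypothetical.

```python
import math
from collections import Counter


def knowledge_reuse_ratio(previous_text: str, current_text: str) -> float:
    """Ratio of reused words to newly created words between two revisions.

    Bag-of-words approximation: word order and position are ignored.
    Returns math.inf when words are reused but nothing new was written,
    and 0.0 when the current revision is empty.
    """
    previous_counts = Counter(previous_text.split())
    current_counts = Counter(current_text.split())
    # Multiset intersection keeps, per word, the smaller of the two counts.
    reused = sum((previous_counts & current_counts).values())
    created = sum(current_counts.values()) - reused
    if created == 0:
        return math.inf if reused > 0 else 0.0
    return reused / created


old_revision = "the cat sat on the mat"
new_revision = "the cat sat on the red mat today"

print(knowledge_reuse_ratio(old_revision, new_revision))  # 3.0
```

In the toy example six words are carried over and two are new, giving a ratio of 3.0; a revision that rewrites most of the text would score well below 1.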
See also our earlier coverage of sentiment analysis research, and a current research collaboration of the Wikimedia Foundation and other researchers that aims "to use machine learning and statistics to understand how attacking or 'toxic' language affects the contributor community on Wikipedia. The focus of our analysis is initially on talk page comments that exhibit harassment, personal attacks and aggressive tone."
Wikimania 2016, the annual global Wikimedia conference, took place in June in Esino Lario, Italy. The programme contained various research-related sessions, including the annual "State of Wikimedia Research" presentation highlighting some of the most interesting scholarship from the past year (slides).
See the research events page on Meta-wiki for upcoming conferences and events, including submission deadlines.
A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.
Other student project writeups from the fall 2015 CS229 course at Stanford (see also above):