The Signpost

Research interview

The Huggle Experiment: interview with the research team

By Skomorokh and Tony1
The Summer of Research 2011 team working at the Wikimedia Foundation's San Francisco headquarters.
Research fellow Steven Walling.

As part of the 2011 Summer of Research, the Wikimedia Foundation's Community Department has announced an experiment to investigate potential improvements to new editors' experience of their first contact with patrollers, using the Huggle anti-vandalism tool. The Summer of Research is a three-month intensive project to study aspects of participation in Wikipedia that may have a significant effect on editor retention. It brings together a group of researchers, mostly PhD candidates, who have experience in both computer science and the social sciences, to build a more rounded understanding of participation in the projects. (See also earlier Signpost coverage: "Wikimedia Summer of Research: Three topics covered so far", "WMF Community Department announces 'Summer of Research' participants".)

The Signpost interviewed researchers R Stuart Geiger (who uses the Staeiou account for non-research editing), Aaron Halfaker, and Wikimedia Foundation Fellow Steven Walling to find out more. Steven has been a volunteer editor on the English Wikipedia since 2006, and before taking up the Foundation Fellowship was a professional writer and blogger, mostly for technology publications and companies. Stuart has been a Wikipedia editor since late 2004, and has been studying the project as an academic since his undergraduate senior thesis in 2006. Since then, he's been gathering from a number of fields the conceptual, theoretical, and methodological tools necessary to study something as complex as Wikipedia. "At present, I'm a doctoral student from the School of Information at the University of California, Berkeley," he says, "and I have a keen interest in both the digital humanities and social statistics movements." Aaron is a computer science graduate student from the University of Minnesota. He's been an editor since 2008 and has published academic research on Wikipedia since WikiSym 2009. He specializes in statistical data mining and designs user scripts for Wikipedia to understand and improve editor interactions.

How did the Summer of Research project come about, and what questions will it investigate? According to Steven, the experiment aims to test "warning templates that are explicitly more personalized and set out to teach new editors more directly, rather than simply pointing them to policy and asking them not to do something". Steven says he personally got involved because, as a Fellow at the Foundation, research has been part of his job. "I currently share the responsibility for leading the project team with Diederik van Liere and Maryana Pinchuk. Diederik has experience with the technical side of this project, Maryana is a qualitative researcher with an academic background, and I lend community experience to round out the leadership team. We built an enormous, multi-part question list publicly on Meta. But it turns out that was just a beginning guide. We've been structuring the summer as a series of weekly sprints, and to get a feel for the research topics that have been and are currently being explored, I'd check out the public list on our Meta page. Because the team has a wide variety of skills, we've looked at many different aspects of Wikipedia as a community so far."

The spaces where en.wiki newbies asked for help in the project. Fewer than half of the newbies investigated received a response from a real person during their first 30 days.
Anon edits have declined as a proportion of logged-in edits, but are still running at 20%, representing a significant potential source of new Wikipedians.
Article deletion is a major part of the newbie experience. Newbies are receiving more notifications that their articles are being deleted, but are participating less in deletion discussions.

Aaron said they decided to experiment with Huggle's standardised warning system because the project goal is to understand the decline in new editors, so it seemed logical to focus on new editors' experience in the community. "Team-member Dr. Melanie Kill suspected that welcome messages might have an effect on how new editors perceive the community. So because Hugglers send out the most messages to new editors, we wanted to see if we could improve conversion (from damage) and other retention rates by just changing the wording of the message."

We wondered how the Foundation's sometimes lofty strategic goals, like "Support the recruitment and acculturation of newer contributors", are translated into practical initiatives such as this. Steven points to the Board resolution on Openness and the Foundation's Annual Plan for 2011–12. "Recruiting and retaining editors for Wikipedia is now one of our top priorities, and Zack Exley, our Chief Community Officer, designed the summer to really dig deeper into the exact areas of English Wikipedia and other projects that have the largest effect on new editors, and whether those editors stick around. The Editor trends study gave us a high-level understanding of the trends in participation, but it didn't tell us with certainty what internal community factors most have an impact. We need to have more data we're confident in if we're going to make good decisions, thus the Huggle experiment, which is clarifying that automated editing tools have a huge impact on new editors. The project was in the sweet spot of being able to gather a statistically significant sample quickly and with minimal impact on the normal functioning of the community."

Stuart's background seems ideally matched to an experiment that seeks to understand social phenomena using technical methodologies. "I'm an adherent of the sociotechnical systems approach, which thinks in terms of how social and technical phenomena are inherently intertwined, especially when we study processes in communities as technologically mediated as Wikipedia. Our motto, 'the free encyclopedia that anyone can edit' speaks to this principle that the Wikipedia community can't be fully understood without taking into account the code on which it runs – and vice versa. Huggle is a great example of this: scripts, tools, and bots like Huggle, Twinkle, and User:ClueBot have become the predominant way in which new users are introduced into Wikipedia. In fact, here's a statistic that is hot off the research press: almost 75% of newbies have their first talk page message sent to them from one of those semi- or full-automated software systems."

How were the parameters of the experiment decided on – for example, the number of warnings delivered, the proportion of changed warnings? Aaron says they settled on three variables for testing in the experiment: personalized, teaching-oriented, and image. "Dr. Kill, a professor of rhetoric, produced personalized and teaching-oriented versions of the default warning template for Huggle; Stuart and Aaron then expanded these templates with image/no-image versions and prepared a random template generator. Our requirement for the number of experimental welcomes/warnings is based on a bit of statistical algebra that allows us to predict how many observations we'll need to find statistically significant differences between the variables."
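
The interview does not spell out the "statistical algebra" Aaron mentions. As a rough illustration only, one standard calculation of this kind is the normal-approximation sample size for comparing two proportions, for example the share of warned newcomers who go on to edit productively under two different templates. The sketch below is not the team's method: the function name, the 10% and 15% rates, the 5% significance level, and the 80% power are all assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Observations needed PER GROUP to detect a difference between two
    proportions with a two-sided z-test (normal approximation).

    Hypothetical helper for illustration; not taken from the experiment.
    p1, p2 : assumed outcome rates under the two templates (0 < p < 1)
    alpha  : significance level of the two-sided test
    power  : desired probability of detecting the difference
    """
    if not (0 < p1 < 1 and 0 < p2 < 1):
        raise ValueError("p1 and p2 must lie strictly between 0 and 1")
    if p1 == p2:
        raise ValueError("p1 and p2 must differ; no effect to detect")

    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = std_normal.inv_cdf(power)          # quantile for the target power

    p_bar = (p1 + p2) / 2  # pooled rate under the null hypothesis
    null_term = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_power * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))

    # Round up: a fractional observation cannot be collected.
    return math.ceil((null_term + alt_term) ** 2 / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Assumed rates: 10% with the default template, 15% with a rewritten one.
    print(sample_size_two_proportions(0.10, 0.15))
    # A smaller assumed improvement (10% vs 12%) needs many more observations.
    print(sample_size_two_proportions(0.10, 0.12))
```

The required sample grows quickly as the expected difference shrinks, which is one reason a high-volume tool is attractive for this kind of test: as Steven notes above, the project could gather a statistically significant sample quickly.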

The Huggle experiment is not the first to investigate the interactions of patrollers and new page creators. In the 2009 community-led Newbie treatment at Criteria for speedy deletion experiment (Signpost coverage), experienced editors (one Signpost interviewer included) posed as inexperienced article creators to look into how new contributors are treated in the patrolling process. The experiment attracted significant controversy due to ethical concerns surrounding the lack of informed consent of the participants. Steven says that before the experiment they posted a public notice at the Village Pump. "We also spoke directly with the main Huggle developers over email, IRC, and on-wiki (Addshore, Gurch and other volunteer developers deserve a lot of credit here; we couldn't have done this without their help and consent beforehand). I should probably point out that we felt pretty confident about this experiment because Stuart is a prolific Huggler himself. Even if we had no volunteer editing experience as a team, I think the key difference between this and the treatment experiment you referred to is that we've been transparent about our actions before going forward with it."

Aaron points out that Huggle users come across hundreds of potential editors every day, and a surprising proportion of these editors are testing whether they can, in fact, edit Wikipedia by damaging an article. "We suspect that the reaction these potential editors receive affects whether they'll register an account and try contributing productively. We hypothesized that the tone of the welcome/warning message could be an important factor in this decision. We have Hugglers testing a few variations of the 1st-level warning message to find out if we're right."

So can we expect more experiments like this in the future? Steven says he could probably do an entire Signpost report just on this topic. "But let me give it a quick shot by saying that the recently released Annual Plan (see link above) will give you a very good idea of what direction we're focusing on, as well as activity on mediawiki.org, the tech portion of blog.wikimedia.org, and the impending software deployments page. We try to make sure to push a message locally here when new experiments with features or anything else are happening, but those three places are where to look if you're interested in these topics in the future."

Discuss this story

"Fewer than half of the newbies investigated received a response from a real person during their first 30 days". I think we really dropped the ball here. Interaction is a major way to recruit newbies and hopefully turn them into "regulars". OhanaUnited (Talk page) 05:18, 2 August 2011 (UTC)

I agree: personal mentoring is the key, but to allocate such resources requires the identification of the most likely newbies. How to do that? Perhaps some more focused research questions? I wonder whether the research project will involve the gathering and coding of qualitative data from newbies/anons. Tony (talk) 08:43, 2 August 2011 (UTC)
One does have to be careful, though; I fully hope that we drive the vandals and SEO upstarts away, which I'm guessing is probably over half of all new users at this point. I suspect the percentage will be dramatically better once we implement a requirement to become autoconfirmed to create articles; there's no possible way we can leave customized messages for all of the people we encounter. The Blade of the Northern Lights (話して下さい) 19:09, 2 August 2011 (UTC)
Hi, thanks for the comments on this topic! We are definitely going to be qualitatively analyzing the edits which all of these users make after receiving each of the different warnings. This is actually our primary way of evaluating the success of each of the templates -- a simple 'do they continue to edit' isn't good, because we don't want persistent vandals and spammers to keep editing. Then we will be able to run a bunch of interesting analyses on how different kinds of new users react to these different messages. It will be interesting to see if, for example, the more personalized warnings drive away vandals but not link spammers, or if the warnings with teaching messages are better at "converting" users who make test edits into good content contributors.
As to the time needed to personally interact with new users, this is a definite problem that we are very interested in, and we are working on trying to model which new users are more likely to become good contributors in the future. We are thinking of a new user welcoming suite like Huggle, but where you can look at a newbie's first few edits and then leave one of a dozen or so targeted welcome messages. So if you see a user fixing a lot of spelling errors to articles about Canada, you'd be able to thank them for copyediting and invite them to join WikiProject Canada in just a few clicks. And if you have any other comments, questions, or suggestions, I'd be happy to hear them. StuGeiger (talk) 20:20, 2 August 2011 (UTC)
Thanks, Stu. You said, "we are working on trying to model which new users are more likely to become good contributors in the future" – this is the most important thing I've heard in this discussion. I think we'll be hoping you can find sufficiently distinctive patterning as early as possible in the editing history of the newbie-pluses (the ones we want to keep) and the newbie-minuses (vandals, link-spammers, and paid political/corporate operators). I suppose it will be a combination of factors such as (i) the linguistic patterns, (ii) the locational distribution of the edits (which pages are edited), and (iii) the temporal distribution of the edits. How these three aspects interact could do with some heavy-duty stats analysis, and of them, the linguistic is likely to be the most challenging and deepest (a research delimitation is required, I think).

Perhaps two critical concerns will govern the efficiency with which the problem can be addressed: (i) how long into a newbie's edit-history the patterns become clear, and (ii) the extent to which they can be identified by a bot (including whether a bot could do the initial "easy" filtering and pass a minority on to human eyes for higher-level sorting to identify the promising newbie-pluses for human interaction – a three-tiered filtering, as it were). Of particular interest might be the grey area of newbies – not those who will clearly stay and those who clearly won't (or who we clearly do or don't want to stay), but those where final stage, human interaction, has a reasonable likelihood of making the difference, of bringing them over the line. Finding the best bot/human mechanism for rationing the supply of "newbie mentors" to this prioritised editorial demographic, IMO, is the challenge. After that, a future project could work on developing guidelines for the best ways in which to interact with newbie-pluses. Tony (talk) 02:41, 3 August 2011 (UTC)


The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0