A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.
Who did what: editor role identification in Wikipedia is the title of an upcoming paper to be presented at the International Conference on Web and Social Media (ICWSM) in Cologne, Germany.[1] The work presented in the paper is a collaboration between researchers from Carnegie Mellon University and the Wikimedia Foundation. The authors' goal was to analyze edits from the English Wikipedia to identify roles played by editors and to examine how those roles affected the quality of articles.
Identifying roles of participants in online communities helps researchers and practitioners better understand the social dynamics that lead to healthy, thriving communities. This line of research started in the 2000s, focused on Usenet groups, before expanding to wiki communities like Wikipedia.[supp 1]
The paper covers three stages of work:
For the first stage, the authors built on previous publications that aimed at classifying Wikipedia edits, in particular the work by Daxenberger et al.[supp 2] Classifying edits usually starts by separating them by namespace. A more granular approach considers not just the namespace, but the content of the change. This was the method chosen here for edits in the main namespace, with the possibility of assigning a revision to multiple categories: for example, a single revision can entail both "grammar" and "template insertion" changes. Those categories were operationalized using an ensemble method classifier based on the content and metadata of the edit.
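The multi-label, ensemble-based edit classification described above can be sketched as follows. This is a minimal stand-in, not the paper's actual classifier: the features, labels, and random data are all hypothetical, and a random forest wrapped in a multi-output classifier substitutes for whatever ensemble the authors used.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.multioutput import MultiOutputClassifier

# Hypothetical per-revision features (e.g. characters added, templates
# touched, edit-comment length) for 200 revisions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))

# Hypothetical multi-label targets: a single revision can carry several
# labels at once, e.g. both "grammar" and "template insertion".
y = np.column_stack([(X[:, 0] > 0).astype(int), (X[:, 1] > 0).astype(int)])

# An ensemble classifier (random forest) trained per label, standing in
# for the paper's ensemble method over edit content and metadata.
clf = MultiOutputClassifier(RandomForestClassifier(random_state=0)).fit(X, y)
pred = clf.predict(X[:5])
print(pred.shape)  # one row per revision, one column per edit category
```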
Then, the authors derived roles based on patterns that emerged from the classes of edits, using the latent Dirichlet allocation method (LDA). This method is traditionally used in natural language processing to identify topics making up a document. Here, the authors used the method to identify roles making up a user, positing that a user is a mixture of roles in the same way that a document is a mixture of topics. In addition to edits, they trained the LDA model using other information such as reverts, and edits in other namespaces.
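The editors-as-documents analogy can be made concrete with a small sketch. The count matrix below is random, hypothetical data: in the paper, each row would be an editor and each column the count of one edit category (plus signals such as reverts and edits in other namespaces).

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

# Hypothetical editor-by-edit-category count matrix: 50 editors, 10 categories.
rng = np.random.default_rng(0)
counts = rng.poisson(lam=2.0, size=(50, 10))

# Treat editors as "documents" and edit categories as "words"; the
# inferred topics then correspond to roles (the paper found eight).
lda = LatentDirichletAllocation(n_components=8, random_state=0)
role_mix = lda.fit_transform(counts)  # shape: (editors, roles)

# Each row is a probability distribution over the eight roles, so an
# editor is a mixture of roles just as a document is a mixture of topics.
print(role_mix.shape)
```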
They ended up with eight roles: social networker, fact checker, substantive expert, copy editor, wiki gnome, vandal fighter, fact updater, and Wikipedian. They found that most editors play between one and three of those roles. To validate the roles, they attempted to predict edit categories based on the editors' roles, with mixed results.
Last, the authors examined whether the roles of editors were correlated with the evolution of the quality of a set of articles. They measured article quality twice, six months apart, using an existing model[supp 3] that assigns a score in Wikipedia's qualitative assessment scale based on the article's measurable characteristics.
They found some correlation between the difference in quality and the roles involved, taking into account control variables like the starting quality score. Their results suggest that "the activities of different types of editors are needed at different stages of article development". For example, "as articles increase in quality, the substantive content added by substantive experts is needed less" but "the cleanup activities by Wiki Gnomes become more important".
One limitation acknowledged by the authors is that their detailed edit classification was only performed on edits made in the main namespace (Wikipedia articles). For other edits, they only considered the namespace itself. Namespaces like Wikipedia: are host to very varied activities, and applying the same level of detail to them would presumably yield a richer, and possibly more accurate, taxonomy of roles.
Some choices in the role nomenclature are a little surprising. For example, it seems odd to have one role simply called "Wikipedians", or "reference modification" being a behavior representative of "social networkers". Translating patterns of data (structural signatures) into words (roles) is a difficult exercise, and often a weak link in such analyses.
In conclusion, the article is a welcome contribution to the field of Wikipedia research, in particular the study of editor roles on Wikipedia. Many previous role identification efforts have used a simplified approach where editors were reduced to their main role. In contrast, here the authors went further and considered editors as a mixture of roles, which is expected to provide a more accurate representation of human behavior.
Since the authors mention task recommendation as a possible application of their work, it would be particularly interesting to examine how the role composition of a user evolves over time. There may be patterns in the evolution of users' roles during their life cycle as editors. Uncovering such patterns could lead to more relevant task recommendations, and help guide editors along their contribution journey.
This paper was published in the Information Sciences journal and was co-authored by researchers from several Polish universities.[2] The paper's central research question is: "are the popular assumptions about the social interpretations of networks created from the edit history valid?" The paper evaluates four different methods for constructing complex networks from Wikipedia data and compares these constructs with survey results about Polish Wikipedians' self-reported relationships. While there is a strong correspondence between all the different network types, networks derived from Wikipedians' talk pages map most clearly onto Wikipedians' feelings of acquaintanceship.
The paper examines four kinds of relationships: co-edits to article and user talk pages (acquaintanceship), co-edits in the vicinity of other users' text (trust), reverts of editors' revisions (conflict), and co-edits to articles in the same category (shared interest). Crucially, the paper extends prior research using these network constructs by conducting a respondent-driven survey of Wikipedians to ask them to name other Wikipedians they consider to be acquaintances, trusted, conflict-prone, or having the same interest. The survey respondents tended to be more experienced than typical users and so responses were re-weighted based on population frequency.
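One of these constructions, the talk-page "acquaintanceship" network, can be sketched with a naive co-edit graph. The editor names and talk pages below are invented examples; in the paper the events would come from article- and user-talk-page histories, and the construction methods are more involved.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (editor, talk page) edit events.
edits = [
    ("Alice", "Talk:Warsaw"), ("Bob", "Talk:Warsaw"),
    ("Bob", "Talk:Kraków"), ("Carol", "Talk:Kraków"),
    ("Alice", "Talk:Gdańsk"),
]

# Group editors by the talk page they touched.
editors_by_page = defaultdict(set)
for editor, page in edits:
    editors_by_page[page].add(editor)

# Naive acquaintanceship network: edge weight = number of shared talk pages.
weights = defaultdict(int)
for editors in editors_by_page.values():
    for a, b in combinations(sorted(editors), 2):
        weights[(a, b)] += 1

print(dict(weights))  # {('Alice', 'Bob'): 1, ('Bob', 'Carol'): 1}
```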
The paper goes on to use a variety of machine learning methods to evaluate the strength of the relationship between different behavioral features and the self-reported relationships. First, they find that naive constructions of these networks from behavioral data only end up predicting one kind of relationship (discussion/acquaintanceship). Using more complex sets of temporal features, such as days since last edit and category similarity, to account for biases in self-reporting yielded only marginal improvements in model performance. The authors conclude by suggesting that the correspondence between relationships imputed from observed Wikipedia data and the relationships reported by Wikipedians themselves is weak.
The survey methods employed in this paper to generate the ground-truth networks can be criticized for the lack of randomness in the sampled population and for limited generalizability to other wiki communities. Similarly, there are well-known limits on informant accuracy, compounded by the often impersonal nature of the editing interface and process. Nevertheless, this research suggests that researchers combining behavioral data with social network methods may be making faulty assumptions about how strongly the observed relationships are actually perceived by the Wikipedians themselves.
This study[3] from researchers at the University of Helsinki examines cross-correlations between Wikipedia pageviews, news media mentions, and company stock prices. This work extends prior work that developed a trading strategy based on Wikipedia pageviews to anticipate stock market moves[4][5] by extracting entities about companies, products, and dates from news media mentions and matching them to Wikipedia entries. An exploratory case study demonstrates that there are some correlations across these three indices and that the strongest cross-correlations are observed without a time lag and for the same company. However, in a subsequent case study involving 11 large companies, the strongest cross-correlations were for The Home Depot and Netflix. That news mentions, Wikipedia pageviews, and stock performance are correlated is neither theoretically nor empirically surprising, but the paper's work on identifying entities and mapping them to Wikipedia articles could have some potential. Research like this, comparing correlations across dozens of entities and time series, is subject to multiple-comparisons problems, and there is likewise a large body of methods in mathematical finance that could be used to extend these findings further.
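The lagged cross-correlation analysis behind findings like "the strongest cross-correlations are observed without a time lag" can be sketched as follows. The two series are synthetic and deliberately correlated at lag zero; the `lagged_corr` helper is an illustrative construction, not the paper's code.

```python
import numpy as np

# Hypothetical daily series: pageviews and stock returns, correlated at lag 0.
rng = np.random.default_rng(1)
pageviews = rng.normal(size=200)
returns = 0.5 * pageviews + rng.normal(scale=0.5, size=200)

def lagged_corr(x, y, lag):
    """Pearson correlation of x[t] with y[t + lag]."""
    if lag > 0:
        x, y = x[:-lag], y[lag:]
    elif lag < 0:
        x, y = x[-lag:], y[:lag]
    return np.corrcoef(x, y)[0, 1]

# Scan a window of lags and pick the strongest absolute correlation.
corrs = {lag: lagged_corr(pageviews, returns, lag) for lag in range(-5, 6)}
best = max(corrs, key=lambda lag: abs(corrs[lag]))
print(best)  # with this synthetic data, the strongest correlation is at lag 0
```

A full study would additionally correct for multiple comparisons across entities and lags, as noted above.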
A calendar of events (mostly research conferences) relevant to Wikimedia-related research has recently been set up on Meta-wiki. Notable entries for this month include CHI 2016 and ICWSM-16.
This conference paper[6] presents a method to automatically detect promotional content in Wikipedia. It appears to aim at articles, but the actual method focuses on user pages.
The authors highlight the fact that their method is purely text-based, whereas "[c]urrently most researches about spamming in Wikipedia are focusing on editing behavior and making use of user’s edit history to do feature-based judging." (See, however, our earlier coverage of a related paper that reported success using stylometric, i.e. text-based features: "Legendary, acclaimed, world-class text analysis method finds you promotional Wikipedia articles really easily")
The researchers explain that a "traditional bag-of-words document vector representation" (counting only word frequencies) is insufficient. Instead, they "employ a deep learning method to obtain a word vector for each word and then apply a sliding window on each document to gradually gain the document vector." The classifier was trained on a dataset of user pages speedily deleted under criterion "G11. Unambiguous advertising or promotion", compared to user pages of administrators which were assumed to be advertising-free. In tests (which apart from Wikipedia user pages also included a dataset of web page ads drawn from other sites) it "produced better performance than the bag-of-words model in both precision and recall measurements."
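The sliding-window construction of a document vector from word vectors can be sketched as below. The four-dimensional vectors are random stand-ins for the embeddings the deep learning method would produce, and averaging is an assumed aggregation; the paper does not specify its exact pooling scheme.

```python
import numpy as np

# Hypothetical word vectors (random stand-ins for learned embeddings).
rng = np.random.default_rng(0)
tokens = "our company offers best services".split()
vocab = {w: rng.normal(size=4) for w in tokens}

def document_vector(tokens, window=3):
    """Average word vectors inside each sliding window, then average
    the window vectors to gradually build the document vector."""
    windows = [tokens[i:i + window] for i in range(len(tokens) - window + 1)]
    window_vecs = [np.mean([vocab[t] for t in w], axis=0) for w in windows]
    return np.mean(window_vecs, axis=0)

vec = document_vector(tokens)
print(vec.shape)  # same dimensionality as the word vectors
```

Unlike a bag-of-words vector, whose length grows with the vocabulary, this representation stays fixed-size and captures some local word order through the windows.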
A list of other recent publications that could not be covered in time for this issue – contributions are always welcome for reviewing or summarizing newly published research.