The Signpost


Recent research


By Tilman Bayer, ...


A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.


Benchmarking Data Practices of Art Museums in Wikidata

Reviewed by Kasia Makowska (WMPL)

This discussion paper[1] is part of the Special Collection of the Journal of Open Humanities Data (JOHD): Wikidata Across the Humanities: Datasets, Methodologies, Reuse, which focuses on Wikidata as both a tool and an object of academic research. The paper looks at the adoption of key open data best practices, focusing on art museums in Wikidata. The work is outlined in three steps: i) selection of repositories; ii) definition of open data compliance criteria; and iii) reporting the results. For the selection of repositories, art museums (using the item “art museum”, Q207694, as the reference point) with at least 5,000 records in Wikidata were chosen, and the sample was limited to the 10 museums with the most records in Wikidata.
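The selection step described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function name, the museum names, and the record counts are hypothetical, and it assumes the per-museum record counts have already been obtained from Wikidata by some other means.

```python
from typing import Dict, List, Tuple


def select_repositories(
    record_counts: Dict[str, int],
    min_records: int = 5000,
    top_n: int = 10,
) -> List[Tuple[str, int]]:
    """Keep museums with at least `min_records` records, then take the
    `top_n` largest by record count (ties broken alphabetically)."""
    eligible = [
        (name, count)
        for name, count in record_counts.items()
        if count >= min_records
    ]
    # Sort by descending record count, then by name for a stable order.
    eligible.sort(key=lambda item: (-item[1], item[0]))
    return eligible[:top_n]


# Hypothetical counts, for illustration only (not figures from the paper).
sample_counts = {
    "Museum A": 120_000,
    "Museum B": 4_999,   # below the threshold, so excluded
    "Museum C": 5_000,   # exactly at the threshold, so included
    "Museum D": 60_000,
}

if __name__ == "__main__":
    for name, count in select_repositories(sample_counts):
        print(f"{name}: {count}")
```

The threshold and sample size are parameters here so that the effect of a different cut-off on the resulting sample can be explored.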

When it comes to defining the compliance criteria, the authors say:

“(...) the work seeks to answer the following questions: 1) What criteria can be used to assess the compliance of Art museums’ open data practices with Wikidata? 2) Which Art museums are most represented on Wikidata, and what is the level of maturity in their data practices and ecosystem integration? The purpose of this work is to define a set of best practices for open data publishing in Wikidata and to benchmark the current level of compliance among major Art museums. The results will provide a clear roadmap for institutions to improve their open data strategies.”

They then define a set of data quality criteria, as described below:

The results are then reported and discussed: 10 preselected institutions are assessed based on the above criteria. A full table of results with detailed scores can be found in the paper, with a brief spoiler alert for the less patient readers:

"In light of all these assessments, it can be stated that the National Gallery of Art demonstrates the highest level of open data compliance maturity and can be considered a best practice example."

When discussing the results, the authors clearly and transparently outline the limitations of their work, in scope and coverage, and point out additional topics to consider as an extension of this work. Interestingly, they mention two criteria (the provision of machine-readable metadata and clear licensing information) which do not form part of the assessment in the paper. This is because the analysis shows these to be “not binary properties of an institution but rather emergent characteristics of digital collections”, which is followed by a proposal to reframe them as quantifiable “metadata footprints”. The paper also provides an interesting analysis using the copyright status property on Wikidata, with a chart clearly illustrating the artworks with a documented licence or copyright status within each museum’s Wikidata records.
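One way to read the idea of a quantifiable “metadata footprint” is as the share of a museum's records that carry a given statement, such as a copyright status. The Python sketch below rests on that reading, which is an assumption and not the paper's definition; the function name and the sample records are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple


def copyright_footprint(
    records: Iterable[Tuple[str, Optional[str]]],
) -> Dict[str, float]:
    """Return, per museum, the fraction of records that have a
    documented copyright status (any non-empty value)."""
    totals: Dict[str, int] = defaultdict(int)
    documented: Dict[str, int] = defaultdict(int)
    for museum, status in records:
        totals[museum] += 1
        if status:  # None or an empty string counts as undocumented
            documented[museum] += 1
    return {museum: documented[museum] / totals[museum] for museum in totals}


# Hypothetical (museum, copyright status) pairs, for illustration only.
sample_records = [
    ("Museum A", "public domain"),
    ("Museum A", None),
    ("Museum B", "copyrighted"),
    ("Museum B", "public domain"),
]

if __name__ == "__main__":
    for museum, share in copyright_footprint(sample_records).items():
        print(f"{museum}: {share:.0%} of records have a copyright status")
```

A share like this treats documentation as a matter of degree within a collection, in line with the authors' point that such characteristics are not binary properties of an institution.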

In summary, this work provides a useful benchmark of practices for museums willing to start using Wikidata to enrich and reuse their digital collections. Speaking from an affiliate perspective, such work is a valuable guide for speaking with GLAM institutions, presenting them with good practice examples and suggesting room for improvement.

A final note from the authors highlights another important use for such research:

"More importantly, because it clearly highlights the geographical bias in Wikidata, it can also be seen as a call to action: all the top museums in Wikidata (by number of records) are located in the Global North[2]. This is not a coincidence, but rather a reflection of the material and institutional resources required for the sustained digital cultural work that facilitates integration with platforms like Wikidata. This disparity, however, risks creating and reinforcing digital silos that reproduce the unequal global distribution of knowledge. By mapping this limitation, our article aims to raise awareness of this inequity and contribute to scholarly and practical efforts to diversify the digital cultural sphere."

...

Reviewed by ...

...

Reviewed by ....


2026 Wikimedia Research Fund announced


The Wikimedia Foundation's Research department announced the launch of the 2026 Wikimedia Research Fund. It funds

Research Proposals (Type 1), Extended Research Proposals (Type 2), and Event and Community-Building Proposals (Type 3). [...] The maximum request is 50,000 USD (Type 1 and 3) and 150,000 USD (Type 2).

Letters of intent for research proposals (Type 1 and 2) are due by January 16, 2026, and full proposals for all three types on April 3, 2026.

See also our related earlier coverage:

Briefly


Other recent publications


Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome.

"Investigating the evolution of Wikipedia articles through underlying triggering networks"


This paper in the Journal of Information Science (excerpts) considers networks that have "factoids" as nodes, and associations between them as edges, and finds e.g. that "the inclusion of one factoid [on Wikipedia] leads to the inclusion of many other factoids". From the abstract:[3]

"In collaborative environments, the contribution made by each user is perceived to set the stage for the manifestation of more contribution by other users, termed as the phenomenon of triggering. [...] In this work, we analyse the revision history of Wikipedia articles to examine the traces of triggering present in them. We also build and analyse triggering networks for these articles that capture the association among different pieces of the articles. The analysis of the structural properties of these networks provides useful insights on how the existing knowledge leads to the introduction of more knowledge in these articles [...]"

From the "Discussion" section:

"Our analysis on triggering networks of Wikipedia articles not only validates and extends the old classical theories on the phenomenon of existing knowledge triggering the introduction of more knowledge but also provides useful insights pertaining to the evolution of Wikipedia articles. Examining the network structure reveals many properties of the triggering phenomenon. For example, a well-defined community structure clearly endorses that the inclusion of one factoid leads to the inclusion of many other factoids. Moreover, many of the factoids belonging to a subtopic are introduced together. Furthermore, the core-periphery structure and the degree distribution suggest that all the factoids do not have a similar triggering power. Some factoids lead to the introduction of many more factoids and hence are paramount in the article development process than the factoids. The introduction of these factoids in the articles may be considered as milestones in the article evolution process. Overall, the study explains one of the reasons behind collaborative knowledge building being more efficient than individual knowledge building."

See also our coverage of a related earlier publication by the same authors at OpenSym 2018: "'Triggering' article contributions by adding factoids"

"Throw Your Hat in the Ring (of Wikipedia): Exploring Urban-Rural Disparities in Local Politicians' Information Supply"


From the abstract:[4]

"This study [...] employs a dataset of politicians who ran for local elections in Japan over approximately 20 years and discovers that the creation and revisions of local politicians' pages are associated with socio-economic factors such as the employment ratio by industry and age distribution. We find that the majority of the suppliers of politicians' information are unregistered and primarily interested in politicians' pages compared to registered users. Additional analysis reveals that users who supply information about politicians before and after an election are more active on Wikipedia than the average user. The findings presented imply that the information supply on Wikipedia, which relies on voluntary contributions, may reflect regional socio-economic disparities."

"..."


From the abstract:

...

"Wiki Loves iNaturalist: How Wikimedians Integrate iNaturalist Content on Wikipedia, Wikidata, and Wikimedia Commons"


From this conference abstract:[5]

"With over 50 million observations per year, iNaturalist is one of the world's most successful citizen science projects, uniting millions of people worldwide in observing, sharing, and identifying nature [...]. iNaturalist and Wikipedia have much in common: they are both collaborative, large-scale, open infrastructures made by volunteer communities with long-reaching impact on human knowledge. [...] To enable the seamless upload of iNaturalist images to Wikimedia Commons (which in turn enables their reuse on Wikipedia and other Wikimedia projects), this volunteer community has developed a diverse set of open source tools [...]"

References

  1. ^ Dişli, Meltem; Candela, Gustavo; Gutiérrez, Silvia; Fontenelle, Giovanna (12 December 2025). "Open Data Practices of Art Museums in Wikidata: A Compliance Assessment". Journal of Open Humanities Data. 11: 71. doi:10.5334/johd.438.
  2. ^ Pereda, Javier; Willcox, Pip; Candela, Gustavo; Sanchez, Alexander; Murrieta-Flores, Patricia A. (12 March 2025). "Online cultural heritage as a social machine: a socio-technical approach to digital infrastructure and ecosystems". International Journal of Digital Humanities. 7 (1): 39–69. doi:10.1007/s42803-025-00097-6. PMC 12202677. PMID 40584139.
  3. ^ Chhabra, Anamika; Setia, Simran (2025-09-25). "Investigating the evolution of Wikipedia articles through underlying triggering networks". Journal of Information Science: 01655515251362587. doi:10.1177/01655515251362587. ISSN 0165-5515. (Closed access)
  4. ^ Matsui, Akira; Miyazaki, Kunihiro; Murayama, Taichi (2024-05-28). "Throw Your Hat in the Ring (Of Wikipedia): Exploring Urban-Rural Disparities in Local Politicians' Information Supply". Proceedings of the International AAAI Conference on Web and Social Media. 18: 1027–1040. doi:10.1609/icwsm.v18i1.31370. ISSN 2334-0770.
  5. ^ Lubiana, Tiago; Littauer, Richard; Leachman, Siobhan; Ainali, Jan; Karingamadathil, Manoj; Waagmeester, Andra; Meudt, Heidi M.; Taraborelli, Dario (2025-12-05). "Wiki Loves iNaturalist: How Wikimedians Integrate iNaturalist Content on Wikipedia, Wikidata, and Wikimedia Commons". Biodiversity Information Science and Standards. 6798855 - Advancing biodiversity goals from local to global scales using iNaturalist. Vol. 9. Pensoft Publishers. pp. –181155. doi:10.3897/biss.9.181155.




       

The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0