The Signpost

News and notes

One decade of Wikisource; FDC recommendations raise serious questions

By The ed17 and Tony1
Editor's note: last week's issue, which would have been published on 27 November, was skipped.

Wikisource turns ten

The sister project Wikisource, the digital library that hosts free-content primary sources, is now a decade old. Wikisource, which now has versions in 63 languages, is the sixth type of project to reach its ten-year milestone and will be the last until 2016.

Working on Wikisource is fundamentally different from working on Wikipedia. Most editors start by uploading a PDF or DjVu file of a source work; there is no notability standard required beyond it having been professionally published, and the Proofread Page extension gives optical character recognition (OCR)-based text that has to be proofread. Translations of these works and author bibliographies are also accepted, while original writings are delegated to Wikibooks. The project also offers interwiki links to relevant articles in the Wikimedia-verse, annotation, different editions of the same works, metadata, and ease of classification.

Highlights on the English Wikisource include items as varied as poetry, laws, constitutions, US Supreme Court decisions, modern novels, short stories, children’s literature, science fiction, and scientific papers. Wikisource also has extensive author indexes and featured texts such as A Jewish State (1896; 1917 translation).

Project Sourceberg, as Wikisource was first known, arose in 2003 because of edit wars on the English Wikipedia over the inclusion of primary sources. The name did not last long; several subdomains and a vote later it was renamed "Wikisource". The project has since developed its own community, has forged collaborations in its own right with prestigious institutions such as the US National Archives and Records Administration, and has organized the transcription of major portions of very large works like the Dictionary of National Biography and Popular Science Monthly. There are 61 active Wikisource projects and two closed ones. The Haitian Wikisource was closed because it was a tiny jumbled mess. The Old English Wikisource was closed because Old English is a dead language.

John Vandenberg has had an active presence on the English Wikisource for many years. He told the Signpost that among the strengths of Wikisource are its simplicity of use for new contributors, and that disputes are rarely about content, the bane of Wikipedia politics. "Instead, community debates tend to have concerned stylistic faithfulness to the original—or more technically, the provenance of the material."

Vandenberg says that many contributors are dedicated librarians and archivists. "Some ten multilingual users travel between the main versions—the English, the French, and the German Wikisources—providing at least some cohesion between the sites", he points out. The French site has historically emphasised reader-friendliness, with much attention given to the look of the pages. The German site has been more concerned about faithfulness to sources, and it was that project that first introduced the technology Proofread Page, in 2008, which allows much more control over the uploading of text and images of a range of file-types; at the same time, the German community banned what had become the mainstay of Wikisource uploads on all language versions: what is colloquially known as "dumping". The English site still allows dumping, but encourages the use of the new technology. Interestingly, he says, this occurred at around the same time that the main Wikipedias started insisting on the proper verification of claims in articles.

A significant challenge nowadays, says Vandenberg, is textual criticism—adding annotations to a text—which needs developer input to integrate it into the wiki system. "There's a good application called TEI (Text Encoding Initiative) for academics that allows contributors to add a semantic layer on top of raw text; but it needs to be made compatible so that it maintains the features of a wiki and at the same time doesn't become too complicated for new users."

Having met the major milestone of a ten-year anniversary, Wikisource editors have been commemorating it with a proofreading contest; this includes prizes for the winners funded by the UK Wikimedia chapter. Over this long period of time, lessons have been learned, and there have been major accomplishments—but what does this achievement mean to the editors who work there, and where will they go from here?

AdamBMorgan points to the Dictionary of National Biography and Popular Science Monthly transcriptions as major victories for Wikisource, but believes that the site must "de-mystify" itself to the general public. Inductiveload added the 1911 Encyclopædia Britannica as a major achievement, though that gigantic reference work is also not fully transcribed yet. Acélan, from the French Wikisource, noted that all 16,000 pages of the famous Encyclopédie are completely transcribed there, only needing to be validated.

The future of Wikisource appears bright. Tpt sees the coming introduction of the VisualEditor as a potential point of success for the small project, noting that it will "make very easy for anyone to proofread" and facilitate the introduction of an export tool with "the adoption of a powerful metadata management system based on technologies built for Wikidata". Zyephyrus put it more succinctly: he sees the future as whether or not the project will complete its mission of "the complete library accessible to all humans on Earth."

In addition, the new Wikisource Community User Group was recently approved by the Affiliations Committee. The group plans "to support the Wikisource community in international communication tasks, outreach to external groups, coordination of software tools development, and facilitate fundraising according to its member needs", but what do regular users of the site think? Remarking on one of Wikisource's largest stumbling blocks, Viewer2 wonders if in trying to help and "inject some kind of sanity into the copyright strait-jacket", the organization "might just [be] occupied forever". John Carter hopes that it can help publicize the little-known site; if new editors come in bringing transcriptions of, for example, local and regional histories, that could be just the niche that Wikisource can fill and thrive in.

The site's contributors are upbeat, too: Maury, who is retired in real life, told us that it was a question of doing good for others, not just yourself. "Why carry knowledge to the grave when it, like real life itself, can be applied to building to better the world?" And has the site reached its full potential? As Carter stated, "The scope of this site is, really, only limited to the scope of the printed word and other historic works."

FDC recommendations raise questions about clarity of metrics, rationale

The FDC's third six-monthly round of annual grants: what the applicants asked for (blue) and what they are likely to get (red), both calibrated on the left vertical axis; the percentage of their bid that the FDC will recommend (transparent bars) is calibrated on the right axis.

The Wikimedia Foundation's volunteer Funds Dissemination Committee has published its recommendations to the Board of Trustees on 11 new applications for annual grants by 11 WMF-affiliated organisations. The announcement comes after the FDC-related staff revealed their assessments and comments on the applications last month. The maximum total budget for the current and upcoming March rounds is US$6M. In this round $4.4M has been recommended, leaving a maximum of $1.6M for the second and final round in 2013–14. The FDC reports that a total of $1.4M is likely to be requested in March.

Most returning applicants received significant increases over last year's allocations, despite the FDC's concerns about rapid growth in budgets and staffing, underspending, and planning. In particular, the staff ratings in this round were sharply reduced compared with those a year ago for four returning chapters—the UK, Germany, Switzerland, and Israel—the first three of which are large European entities. There has been debate about value for money in the traditional chapter model, with warnings by the Foundation's executive director, Sue Gardner, that the FDC is "disproportionately chapters-centric", and her questioning of the cost–benefits of "setting up bricks-and-mortar institutions ... alongside sometimes difficult dynamics between staff and community".

Evolving context

The current round is occurring in a changing environment for funding. This is throwing up challenges for a multilayered, intricate system that is little more than a year old and is likely to factor into how the FDC, and WMF grantmaking more broadly, evolve over the next few years. Affiliated organisations are now returning for a second annual grant, which was always going to bring into serious play what is known as the "guardrails" guideline. Spelled out in the FDC's framework, this specifies that from year to year an applicant's funding should be within the range of 80–120% of their previous year's funding; this is for the sake of stability in both affiliate organisations' finances and FDC outlays. In the FDC's first year, the guideline was loosely based on the amount of WMF funding applicants had received in the previous year through other means. Likewise, this year the benchmarks for Serbia and India, newcomers to the FDC process, were established on the basis of non-FDC Foundation grants for the 2012–13 financial year.
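As a rough illustration of the guardrails arithmetic described above, the following Python sketch checks whether a requested amount falls within 80–120% of the previous year's funding. The function name and the sample amounts are hypothetical and are not taken from the FDC framework; the sketch also ignores exchange rates and underspends, which the FDC may treat differently in practice.

```python
def within_guardrails(previous, requested, lower_pct=80, upper_pct=120):
    """Return True if `requested` lies within lower_pct..upper_pct percent
    of `previous` (both in the same currency, as whole units).

    Integer arithmetic is used so the boundary cases (exactly 80% or
    exactly 120%) are not affected by floating-point rounding.
    """
    return previous * lower_pct <= requested * 100 <= previous * upper_pct


# Hypothetical applicant that received 100,000 last year.
previous_grant = 100_000

for bid in (79_999, 80_000, 120_000, 125_000):
    status = "within" if within_guardrails(previous_grant, bid) else "outside"
    print(f"Bid of {bid:,} is {status} the 80-120% range")
```

A bid at exactly 120% of the previous year's funding, like the one reported for the Netherlands, counts as within the range under this reading of the guideline.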

At a high-profile WMF Metrics Meeting just before the deadline for applications, FDC support staff raised concerns that most of the bids for the current round were well over the maximum 20% increase allowed under the guardrails guideline; only the Netherlands' bid was within the allowed increase, at a full 20%. Our reporting of these figures prompted one chapter to email complaints to the Signpost's editor-in-chief that the cited increase in their application bid was distorted by fluctuations in the US dollar exchange rate; we understand that these complaints were taken up with FDC staff.

A turbulent year for some chapters has also called into question how accountable FDC funding should be in relation to standards of governance and transparency. There have been further conflict-of-interest issues for the WMUK board, despite the joint WMF–WMUK inquiry into governance in the chapter last year in the wake of Gibraltargate. There appear to be electoral irregularities and conflict-of-interest problems concerning the board of the Indian chapter. And the management of the German chapter received a scathing report by the chapter's auditors concerning financial procedure and a lack of detail in the annual plan.

Complications: the guardrails, exchange rates, and underspending

The Signpost faced difficulty in comparing how the chapters had been treated in relation to each other, to last year's funding, and to the FDC's written assessments. It appears that the figures are complicated by two factors. The first is the exchange-rate issue. The FDC's statement about this is unclear—that recommended funding is now "in requested currencies; the amount in US dollars is for comparative purposes only (using recalculated conversion rates from 1 October 2013)". When we queried what this means, FDC member Anders Wennersten confirmed that local currencies were used in applying the guardrails guideline. The figures supplied to the Signpost—in local currencies—do not include the exchange rates used to arrive at last year's funding as the benchmark, and seem to involve other factors as well.

The second complication is that several applicants significantly underspent their FDC allocation in the 2012–13 financial year—the subject of repeated criticism in the assessments (the word "underspend" and its variants appear 10 times in the FDC's recommendations). The FDC's comments about the German chapter (WMDE), for example, are highly critical: "WMDE does not propose any clear solution to the fact that it has a significant carry-over of $675,000 from its 2013 budget. Briefly stating that it plans to allocate this amount to software development in 2014 is insufficient. The amount proposed is equal to the annual budget of several Wikimedia organizations combined and cannot be treated lightly. ... [WMDE] often chooses to rely on a more general and enigmatic overall outcomes assessment, which is somewhat problematic for an organization this size. ... This large requested amount of two million US dollars ["$2.4 million" in the next sentence] does not have a clear rationale."

The Signpost initially assumed that WMDE's funding had been cut by 2.2% from last year's grant in straight US$ terms ($1.75M vs $1.79M). In contrast, the FDC's recommendations cited "an effective increase of 20% over the previous FDC allocation". Information provided to us by the FDC cites a change of −6%. Wennersten told us that "we have an unresolved issue with operating reserves". Last year, for example, Wikimedia Germany underspent FDC funding by US$225,000 (a calculation that had to be teased out of the chapter's total underspend from all sources of $665,000). In practice, Wennersten said, the FDC expects WMDE to finance their 2013–14 activities partly from that $225,000; however, it is still unclear how this was factored into the chapter's allocation this year.
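To make the US$ comparison concrete, here is a minimal Python sketch of the percentage-change arithmetic. The two dollar figures are those cited above; the helper names and the exchange rates in the second part are hypothetical, chosen only to show how a grant that is flat in local currency can appear to move in US$ terms. It does not attempt to reproduce the FDC's own figures of +20% or −6%.

```python
def percent_change(previous, current):
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100


# WMDE figures cited above, in millions of US dollars.
usd_change = percent_change(1.79, 1.75)
print(f"Change in straight US$ terms: {usd_change:.1f}%")  # -2.2%


def usd_view_of_flat_local_grant(rate_before, rate_after):
    """US$ change for a grant held constant in local currency.

    Rates are US$ per unit of local currency (hypothetical values below).
    """
    return percent_change(rate_before, rate_after)


# Hypothetical rates: the local currency strengthens from 1.30 to 1.35 US$.
apparent = usd_view_of_flat_local_grant(1.30, 1.35)
print(f"Apparent US$ change for a flat local grant: {apparent:.1f}%")
```

The second calculation suggests why a comparison in US dollars and one in local currency can point in different directions for the same grant.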

We put it to FDC chair Dariusz Jemielniak that the Committee had been staunchly critical of WMDE and that this did not seem to match the funding allocation to the chapter. His response was twofold:

The underspend situation is yet more complex, according to FDC member Sydney Poore, who told the Signpost that:

Given the multiple factors involved, we are unable at this stage to provide a graph showing how each applicant's funding related to the 80–120% guideline.

In brief


Discuss this story

  • WMIL also had its budget effectively reduced because in 2012–13 it spent its allocated budget + about 15% in carryover from the previous year, i.e. 115% of its budget. This is more than the budget it received in 2013–14, and it therefore needs to cut back on programs. The problem here is that the FDC structure has much more stringent reporting requirements than pre-FDC, which means that chapters need to hire staff to handle these things. This automatically and very significantly raises the operating expenses of each chapter (for small chapters usually above 100%). This eats into each chapter's ability to actually carry out programs that benefit the movement, which in turn gets criticism from the WMF. If the WMF wants chapters to have both a very high level of reporting and the ability to carry out great programs, it needs to take into account the added expenses incurred by its own recent requirements. —Ynhockey (Talk) 09:13, 5 December 2013 (UTC)
    Hi Ynhockey, I want to stress that WMIL has had its FDC allocation INCREASED by nearly 30%. Your allocation from the FDC last time was 549440₪, and now it is 709000₪. Thus, WMIL has received 29% more than it did the last time (effectively, minus 1.48% due to minimal inflation). Naturally, it is possible that, if your budget in 2012-2013 was inflated by a one-time carryover, even the ~30% increase in funding from the FDC does not cover your intended growth. Pundit|utter 12:06, 5 December 2013 (UTC)
  • It is actually not true that "most editors start" on Wikisource by uploading a djvu. I have been on Wikisource for four years, involved in a major project you mention, and never have I uploaded a djvu, a procedure that requires some skill with large files in the typical case. The point is that uploading is far from being the bottleneck: proof-reading is by a very large margin the key part of getting a work available on Wikisource. Charles Matthews (talk) 10:09, 5 December 2013 (UTC)
  • I'm disappointed that you're only linking to the WMF blog post for the German court decision, considering that it paints a considerably rosier picture than just about anyone else does about the situation, and it's a major issue that could use a prose summary by the Signpost. Sven Manguard Wha? 17:04, 5 December 2013 (UTC)
    • This is where we could have done with another writer/researcher. We were unable to cover the matter properly, but decided it was sufficiently important to warrant a mention in the "In brief" section. To cover it properly would require in-depth research; I sense a different take by the WMF to that of the court. Interesting, but we received opinion that in the grander scheme it's unlikely to make a significant difference to the question of liability. Tony (talk) 12:48, 6 December 2013 (UTC)
  • I'm surprised to see the FDC guardrails characterised as a rule here - they are a guideline, not a rule. Thanks. Mike Peel (talk) 20:54, 5 December 2013 (UTC)
    • You're quite right; I've changed it accordingly. I do note that the framework document says that "the FDC will follow" it, although the possibility of exceeding the range is specified as requiring stronger justification to the board of trustees. Tony (talk) 12:48, 6 December 2013 (UTC)
  • Steffan Prößdorf -> Steffen Prößdorf Cheers, —DerHexer (Talk) 01:06, 6 December 2013 (UTC)
  • Some clarification from the FDC on WMDE: There is indeed a difference between WMDE's "carry forward" (the $675,000 USD) outlined in its proposal and a general issue of (potential) chapter underspends from previous FDC allocations. As you know, we do not yet know the underspends of the chapters--since they are still carrying out activity with their current funds through the end of December. They have only loosely predicted what their underspends *might* be. That is why the FDC will need to publish their guidance on this issue separately. However in WMDE's case, WMDE is already saying it won't use those particular funds because the activity has been canceled, and instead, WMDE will apply the funds next year. WMDE may have significant underspend in addition to this--time will tell. Pundit|utter 18:26, 6 December 2013 (UTC)
    • This was all very confusing for an onlooker who wants to make sense of how the trajectory of FDC funding is playing out. I wish there had been more explanation, and inclusion in the summaries for each recommendation of the predicted size of underspend on which the FDC based its calculations. Was this a predictable scenario—that underspends, and therefore calculations of percentage change over the previous allotment, would be difficult to arrive at at the time of recommendation each year? Tony (talk) 02:18, 7 December 2013 (UTC)
  • "the German community banned what had become the mainstay of Wikisource uploads on all language versions: what is colloquially known as "dumping". The English site still allows dumping, but encourages the use of the new technology." What is dumping? It's not clear to me in the text. Chris Troutman (talk) 21:19, 7 December 2013 (UTC)
    • Dumping, as far as I can make out, is the practice of uploading raw sources, without accounting for provenance. Tony (talk) 14:11, 9 December 2013 (UTC)
    • Many Wikisources allow copying text from, for example, Project Gutenberg and pasting it directly into the mainspace. On English Wikisource this is not the preferred method but it is acceptable if there is some attribution of the source on the talk page. (This was actually the main way it worked in the early days.) German Wikisource does not allow this at all and requires that all of its text is based on scans (policy comparison). In contrast, about 28% of English Wikisource has been proofread from scans using the Proofread Page extension. - AdamBMorgan (talk) 15:21, 11 December 2013 (UTC)

Inflation

"Wennersten confirmed that local currencies were used in applying the guardrails guideline."

Don't forget inflation. It nears 30% annually in Argentina, for example. --NaBUru38 (talk) 23:52, 8 December 2013 (UTC)

That is economic suicide. I don't know to what extent the FDC should shoulder 100% of the uncertainty of fluctuations in exchange rates and inflation rates. The FDC has a fixed budget in $US, and it doesn't seem fair to other applicants that a large margin of uncertainty should have to be built into allocations. It's not an easy problem, but the practices of other international bodies might be instructive. Tony (talk) 14:11, 9 December 2013 (UTC)



       
