The Signpost

Technology report

Next Wikimedia deployment already in the pipeline and details of recent performance improvements

By Jarry1250

1.20wmf1 in the pipeline

A provisional timetable was released for the first mini-deployment to Wikimedia sites from the new version control system, Git. Beyond the technical challenge of releasing against a substantially different code management background, the release should also herald the start of a new era of far quicker deployments to Wikimedia wikis, itself a development that has been discussed off and on for at least eighteen months (Signpost coverage).

With only a couple of months' worth of changes included (most of them simple bug fixes), it would be easy to overlook the release. Such a mistake is unlikely to be made by any developer aware of its historical significance, however: it is the quickest deployment of a block of changes (as opposed to individual "emergency" merges) in nearly two and a half years. The release thus marks the retirement of the previous paradigm – merging a small subset of revisions but retaining the majority for irregular watershed deployments – in favour of a new model focused on regular "mini"-deployments. Only time will tell whether or not Aryeh Gregor (quoted above in October 2010) will be proven right, and whether the divide between volunteers and staff in average review times (a problem which flared up once again on the wikitech-l mailing list only this week) will be settled once and for all.

According to the plan published on wikitech-l this week, non-Wikipedia sites (that is to say, Wikimedia Commons, Wiktionary, Wikisource, Wikinews, Wikibooks, Wikiquote and Wikiversity) should receive the update on April 16, with the English Wikipedia following on April 23. Should the deployment go well, the remaining wikis will be updated on April 25, just two months after they enjoyed the benefits of 1.19.0, which will only complete its own release cycle when it is made available to external sites later in the month.

March Engineering Report published

In March 2012:
  • 98 unique committers contributed code to MediaWiki.
  • About 34 shell requests were processed.
  • 82 developers gained developer access to Git and Wikimedia Labs, of which 71 are volunteers.
  • Wikimedia Labs now hosts 75 projects, 126 instances and 222 users.

Engineering metrics, Wikimedia blog

The Wikimedia Foundation's engineering report for March 2012 was published this week on the Wikimedia Techblog and on the MediaWiki wiki, giving an overview of all Foundation-sponsored technical operations in that month. March was dominated at first by the deployment of MediaWiki 1.19 to all Wikipedias and latterly by the move to Git and its associated code review system, Gerrit. Other headlines from the month will also be familiar to regular Signpost readers, including the completion of the move of all Wikimedia domain names away from registrar GoDaddy in protest at its political stance on SOPA; the publication of a report on the first phase of Article Feedback version 5; and design improvements to the mobile front-end.

As is often the case, many of the changes that came in under the radar relate to incremental performance improvements aimed at allowing Wikimedia to support a rapidly increasing audience. For example, significant work was done to prepare the newer Ashburn data centre to share responsibility with its Tampa counterpart for internal search functionality. Attempts to improve image caching were stymied by lingering concerns about "overloading the NIC cards and the risk of concentrating too much cache on each server", yielding only a trial improvement thus far. Network peering was also added to the Ashburn site, allowing it to pool resources with a dozen or so websites and ISPs, a move expected to reduce latency for users in Europe, Japan and Hong Kong. Similar motivations also led the Foundation to begin investigating the possibility of establishing a caching centre on the West Coast of the United States, the report said. Meanwhile, the switch of the default thumbnail handling system to Swift finally settled down during the month after numerous problematic attempts at deployment during February; the same system is now expected to start handling non-thumbnailed images sometime in late May.

Elsewhere, it was announced that Wikimedia Labs' main per-project storage space (71,000 GB, currently distributed in 300 GB chunks) came online during March, though there were also two labs outages during the month. In addition, the Visual Editor team have now finalised a decision to base the new WYSIWYG editor around the contentEditable HTML5 property, having previously worked on a separate "editsurface" system in parallel, paving the way for a summer release. Finally, the first release of a complete copy of the English Wikipedia in the specialised ZIM file format (containing about 4 million articles, 11 million redirects, and 300,000 mathematical images) was also completed during March; the hope is to use regularly generated ZIM files – viewable with the WMF-supported Kiwix reader – to provide a complete offline browsing experience in the so-called "global south".

In brief


Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.

Additionally, Rcsprinter123 has nominated himself for BAG membership; the community is encouraged to join that discussion, as well as those relating to the 16 currently active bot approvals.
An example of the organisational charts recently produced by contractor Mark Holmquist

Discuss this story

  • A small correction about Labs and storage: Labs has two storage clusters. One we've had since the Labs launch, which is "instance storage". Instance storage is for storing instance images (virtual machine images). The second cluster was the one added recently, which we call "project storage". Project storage is accessible from within instances, and is divided from a security perspective by project (hence project storage). We had a couple outages due to the instance storage, not the project storage. A full outage of the project storage would cause issues with data access in Labs, but even a small outage of the instance storage will cause major issues for all of Labs, since the instances (virtual machines) would lose access to their local disks and would crash (like a server's disks dying). Ryan lane (talk) 00:58, 10 April 2012 (UTC)
    Ah, thanks for the correction Ryan. I admit I completely missed the project/instance division. I've tweaked the wording used above accordingly. - Jarry1250 [Deliberation needed] 08:32, 10 April 2012 (UTC)
  • Just a question: how many datacenters are there and where are they located? Night of the Big Wind talk 11:46, 10 April 2012 (UTC)
    Unless I'm mistaken, there are two "proper" data centres: one in Tampa, Florida and one in Ashburn, Virginia, which is a relatively recent addition. There's also a caching centre in Amsterdam to help European audiences. There used to be additional caching facilities in Seoul and Paris for some years, though hosts at both locations were later decommissioned (presumably because their benefactors were no longer able to maintain them). - Jarry1250 [Deliberation needed] 11:51, 10 April 2012 (UTC)
  • Adjusted two instances of "Wikimedia" to "MediaWiki" when it was clear the reference was to the software. This includes the headline. Nathan T 14:23, 10 April 2012 (UTC)
    The headline is fixed at publication, so I'll have to change that back (it gets reprinted elsewhere). I'd argue it was merely ambiguous rather than wrong: "Wikimedia deployment" here being shorthand from "deployment of the latest version of MediaWiki to Wikimedia sites". I've preserved your clarification of the prose, naturally. - Jarry1250 [Deliberation needed] 16:54, 10 April 2012 (UTC)
    I suppose... the point is that "Wikimedia" isn't something that gets deployed, and the software MediaWiki is used in a variety of other settings. MediaWiki to Wikimedia sites would be conceptually correct; Wikimedia to Wikimedia sites, aside from not making sense, promotes a misunderstanding of the relationship between the Wikimedia projects and the software platform. Nathan T 19:39, 10 April 2012 (UTC)
    I agree that "Wikimedia deployment" is hardly ideal, but nothing else proved short enough for a headline :) I'm not sure any regular reader would not understand the MediaWiki-Wikimedia divide, however. Anywhoo, not worth arguing over. T'is done. - Jarry1250 [Deliberation needed] 20:32, 10 April 2012 (UTC)
  • Regarding the proposal to replace gerrit, I fail to understand where "the initial reception was largely negative" comes from. Reading the thread, it seems that all but one of the responses were actually quite encouraging and commending (at most, somewhat cautious). --Waldir talk 16:40, 13 April 2012 (UTC)
    • Well, it depends what you describe as negative and what as positive. Personally, I would say that a positive response would have been "you've got my/our full support", and that the actual response was (rightly or wrongly) quite a long way away from that. But yes, perhaps "mixed" would have been better. - Jarry1250 [Deliberation needed] 13:08, 15 April 2012 (UTC)


The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0