The Signpost

Technology report

Wikidata nears first deployment and wikis go down in fibre cut calamity

By Jarry1250

July engineering report published


The Wikimedia Foundation's engineering report for July 2012 was published this week on the Wikimedia Techblog and on the MediaWiki wiki, giving an overview of all Foundation-sponsored technical operations in that month (as well as brief coverage of progress on Wikimedia Deutschland's Wikidata project). All three headline items in the report have already been covered in the Signpost: Wikimania and the pre-Wikimania hackathon; the launch of new software to power a refreshed Wikimedia report card (named in the report as Limn); and the ongoing deployment of version 5 of the Article Feedback tool.

Among other developments noted in the report was work on what is now being called the "Page Curation" project, a package including Special:NewPagesFeed and an opt-in "curation toolbar" that recreates much of Twinkle's functionality, as well as a number of other mechanisms for helping editors deal with page creations. In July, the report says, developers "completed development of all key curation tools and are now adding a couple final features ... [and] now plan to pre-release Page Curation on the English Wikipedia in mid-August — with a full release in September 2012". Elsewhere, the first Wikipedia Engineering Meetup was set for August 15. Held in the Wikimedia Foundation's home city of San Francisco, the meetups are an attempt to engage local programmers, of whom there are many. The meetups are due to be held every two months, the report noted.

There was also mixed news with regard to site performance (see also related stories below). Performance Engineer Asher Feldman struck gold with an upgraded version of the parser cache server, cutting the 90th percentile response time from 53.6ms to 7.17ms and the 99th percentile response time from 185.3ms to 17.1ms. In other words, 99% of all requests going through the cache are now answered in roughly 17 thousandths of a second or less. Lead Platform Architect Tim Starling had less success, however, with his project investigating the possibility of optimising PHP processing at the bytecode level, which "looked like a promising direction for performance optimization". Unfortunately, despite a significant "theoretical gain ... actual performance [seemed] disappointing", and the project has been suspended indefinitely.
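For readers unfamiliar with percentile figures like those quoted above, the following is a minimal sketch of how a 90th or 99th percentile response time can be computed from a set of timings. It uses the simple "nearest-rank" definition; the report does not say which method the Foundation's monitoring uses, and the function name and sample timings here are invented for illustration only.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(samples)
    # Rank is 1-based; multiply before dividing to keep the arithmetic exact
    # for whole-number percentiles.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical response times in milliseconds (not real Wikimedia data).
response_times_ms = [2.1, 3.0, 3.4, 4.2, 5.0, 5.5, 6.1, 7.0, 16.8, 120.0]

p90 = percentile(response_times_ms, 90)
p99 = percentile(response_times_ms, 99)
print(f"90th percentile: {p90} ms, 99th percentile: {p99} ms")
```

In this made-up sample a single slow request dominates the 99th percentile while leaving the 90th untouched, which is why the two figures are usually reported side by side.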

Wikidata closes in on first deployment

The Wikidata logo, selected last month

Developers are closing in on a first deployment of Wikidata, it became clear this week. Phase one of the project, aiming to provide a central repository of interwiki links, is expected to launch on the Hungarian Wikipedia within weeks (wikitech-l mailing list).
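The idea of a central repository of interwiki links can be illustrated with a small sketch. This is not Wikidata's actual data model or API: the class names, the item identifiers and the method names below are all hypothetical, chosen only to show how storing each set of links once, centrally, lets any wiki look up the equivalent pages in other languages.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One concept, with at most one linked page per wiki (hypothetical model)."""
    item_id: str
    sitelinks: dict = field(default_factory=dict)  # site code -> page title


class CentralRepository:
    """Toy central store of interwiki links, shared by all wikis."""

    def __init__(self):
        self._items = {}   # item_id -> Item
        self._index = {}   # (site, title) -> item_id, for reverse lookup

    def add_sitelink(self, item_id, site, title):
        key = (site, title)
        owner = self._index.get(key)
        if owner is not None and owner != item_id:
            raise ValueError(f"{site}:{title} is already linked to {owner}")
        item = self._items.setdefault(item_id, Item(item_id))
        previous = item.sitelinks.get(site)
        if previous is not None:
            # Replacing this item's link for the site: drop the stale index entry.
            del self._index[(site, previous)]
        item.sitelinks[site] = title
        self._index[key] = item_id

    def language_links(self, site, title):
        """Return the links a page on `site` would show to other wikis."""
        item_id = self._index.get((site, title))
        if item_id is None:
            return {}
        links = self._items[item_id].sitelinks
        return {s: t for s, t in links.items() if s != site}


repo = CentralRepository()
repo.add_sitelink("item-1", "hu", "Budapest")
repo.add_sitelink("item-1", "it", "Budapest")
repo.add_sitelink("item-1", "he", "בודפשט")
print(repo.language_links("hu", "Budapest"))
```

The point of the design is that a link added once is visible from every wiki, rather than each language edition maintaining its own copy of the full list.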

Project Director Denny Vrandečić confirmed in a post on the developers' mailing list that all major work on the project, which is split across four extensions, is complete. The past week and the next couple will be dominated by getting that code reviewed, he suggested, picking out seven actionable items that will need to be negotiated ahead of a first deployment.

After the Hungarian Wikipedia, where community members have already agreed to trial it, the extension is likely to be deployed to either the Italian Wikipedia or the Hebrew Wikipedia, the latter allowing its right-to-left support to be scrutinised; next up will be the English Wikipedia and finally all other Wikipedias. Deployment of phase 2, with centralised infobox-style data, is not expected until the end of the year, if not early next year.

In brief

Signpost poll
You can now give your opinion on next week's poll: Would you consider installing a Signpost Android app?

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for several weeks.

At the time of writing, 13 BRFAs are active. As usual, community input is encouraged.

Discuss this story

You'd think that, after all of these years and the supposed technological bias of Wikipedia contributors, we'd have an encyclopaedia around here somewhere that explained backhoe fade a.k.a. backhoe fading, including such things as warning label conventions that are supposed to mitigate it and explaining that despite the name it is taken quite seriously given that it's responsible for a significant fraction (a 1993 study said three-fifths) of physical layer outages in cable networks, wouldn't you? ☺ Uncle G (talk) 17:39, 7 August 2012 (UTC)

Clarification on the major outage: Apparently we were supposed to be allocated bandwidth on two different fiber-optic cables from the Tampa data center. The data center, however, erroneously put our 2 bandwidth allotments on the same cable, so when the cable was cut, we had no redundant connection between Tampa and Virginia to fall back to. Kaldari (talk) 22:32, 8 August 2012 (UTC)

  • Which inspires the question, why is there a counterpart center in Virginia, if it is not able to take up the load when something goes wrong with the Tampa center or its cables? Do the wikis only work when both centers are alive, ticking properly, and properly connected together? Jim.henderson (talk) 09:53, 9 August 2012 (UTC)
  • At the moment, Ashburn only works when both datacenters are connected together. The database masters all live in Tampa and the apaches need to talk to them in order to serve content. LeslieCarr (talk) 18:20, 9 August 2012 (UTC)

The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0