The Signpost

Technology report

Bugs, Repairs, and Internal Operational News

By Theo10011 and Waldir

MediaWiki 1.17 deployment failed, postponed

The planned upgrade of MediaWiki, the software that underlies all WMF wikis, to version 1.17 failed last week (Wikimedia Techblog). Deployment was originally expected to begin at 07:00 UTC on February 8 (see previous Signpost coverage), but preparations took longer than anticipated and the actual deployment began at around 13:00 UTC.

Several issues became apparent almost immediately. The parser cache miss rate almost doubled under the new version, at which point the Apache servers, which are responsible for delivering content to users, became overloaded and started behaving unpredictably. The increased load caused problems across the projects, ranging from increased lag to outright outages for some users. The deployment was then rolled back to the previous 1.16 release. The tech team investigated, resolved some technical issues, and prepared for another attempt. A second attempt was made at 16:27 UTC, but it ran into similar performance issues and had to be called off 90 minutes later. Further attempts were put on hold.

Danese Cooper, Wikimedia's Chief Technical Officer, blogged about the failed deployment and explained what the Foundation had attempted to deploy.

After further investigation and several fixes to the release, Rob Lanphier, a developer with the WMF, added that "some of the unsolved issues are complicated enough that the only timely and reasonable way to investigate them is to deploy and react". As a result, he said, a new plan had been drawn up in which 1.17 will be deployed on "just a few wikis at a time". The tech team believes the problem lay in the configuration of the $wgCacheEpoch variable, which caused the cache to be culled more aggressively than the servers could handle (Wikimedia Techblog).
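To illustrate why a cache epoch setting can have such an effect, here is a minimal sketch in Python. It is not MediaWiki's actual code (which is written in PHP); the class, function names and sample dates are hypothetical. It only assumes the general behaviour of a cache epoch: any entry stored before the epoch is treated as stale and must be regenerated.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class CacheEntry:
    """A hypothetical cached page rendering with the time it was stored."""
    html: str
    cached_at: datetime


def is_fresh(entry: CacheEntry, cache_epoch: datetime) -> bool:
    # An entry is usable only if it was stored at or after the epoch.
    return entry.cached_at >= cache_epoch


def miss_rate(entries: list[CacheEntry], cache_epoch: datetime) -> float:
    # Fraction of entries that would be rejected and re-rendered.
    stale = sum(1 for e in entries if not is_fresh(e, cache_epoch))
    return stale / len(entries)


# Made-up cache contents: one entry stored at midnight UTC on each of 1-10 February.
entries = [
    CacheEntry(f"page {day}", datetime(2011, 2, day, tzinfo=timezone.utc))
    for day in range(1, 11)
]

old_epoch = datetime(2011, 1, 1, tzinfo=timezone.utc)         # everything stays valid
new_epoch = datetime(2011, 2, 8, 13, 0, tzinfo=timezone.utc)  # epoch moved forward

print(miss_rate(entries, old_epoch))  # 0.0
print(miss_rate(entries, new_epoch))  # 0.8
```

In this toy example, moving the epoch forward invalidates eight of the ten entries at once. Every such miss means a page has to be parsed again, which is the kind of extra work that can overload the web servers.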

The team decided on a two-stage deployment for their next attempt, reviving some old code for upgrading one project at a time. The first phase took place 06:00–12:00 UTC on Friday, February 11. This was limited to the Simple English Wikipedia and Wiktionary; the Usability and Strategy wikis; Meta; the Hebrew Wikisource; the English Wikiquote, Wikinews and Wikibooks; the Beta Wikiversity; and the Esperanto and Dutch Wikipedias.

At the time of writing, the deployment had been completed on all but the last two projects. The Hebrew Wikisource, included at the request of a community member, offered a chance to observe the deployment on a right-to-left language wiki. The team also reported some localization issues that triggered ParserFunction bugs on both nl.wikipedia.org and eo.wikipedia.org. Traffic from nl.wikipedia.org was heavy enough at the time to cause a noticeable spike in CPU usage on the web servers, along with some time-out errors, so deployment to nl.wikipedia.org had to be delayed. Once these issues are resolved, the second wave of deployment is expected to start on Wednesday, February 16 (see the current list of WMF wikis that are already running 1.17).

An IRC office hour Q&A was held on matters related to the ResourceLoader, which is expected to cause compatibility issues with some existing JavaScript code. Trevor Parscal and Roan Kattouw, the main developers of the ResourceLoader, were available on IRC on February 14 at 18:00 UTC to answer questions about the new feature.

In brief

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.

The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0