The Signpost

Technology report

On the indestructibility of Wikimedia content

By Jarry1250

WMF wiki content now almost indestructible


The content of Wikimedia wikis has recently moved significantly closer towards indestructibility, it was announced this week by WMF developer and data dumps specialist Ariel Glenn.

Masaryk University, in the Czech Republic, is one institution now mirroring Wikimedia dumps.

Specifically, data from all Wikimedia wikis is now being successfully replicated to three non-WMF sites around the globe: C3L in Brazil, Masaryk University in the Czech Republic and the servers of Your.org in the United States. Each site holds ("mirrors") at least five monthly snapshots ("dumps") of the publicly available wikitext-based content of all of the many hundreds of Wikimedia wikis. Your.org also hosts a copy of all previous dumps and will hold a single snapshot of all publicly viewable media. Moreover, Glenn reports, "getting the bugs out of the mirroring setup [has made it] easier to add new locations" and to provide the latest snapshots to already established mirrors. As reported at the time, the first dump mirror came online in October last year, but this is the first time so many have been available concurrently.
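To make the "at least five monthly snapshots" arrangement concrete, here is a minimal sketch of how a mirror might decide which dumps to retain. It is an illustration only: the function names, the keep-exactly-five policy and the sample dates are all assumptions, not a description of the scripts any of the mirrors actually run.

```python
from datetime import date


def snapshots_to_keep(snapshot_dates, keep=5):
    """Return the `keep` most recent snapshot dates, newest first."""
    # Duplicates are collapsed so a re-run dump is not counted twice.
    return sorted(set(snapshot_dates), reverse=True)[:keep]


def snapshots_to_prune(snapshot_dates, keep=5):
    """Return the older snapshot dates that fall outside the retained set, oldest first."""
    kept = set(snapshots_to_keep(snapshot_dates, keep))
    return sorted(d for d in set(snapshot_dates) if d not in kept)


# Hypothetical monthly dump dates (October 2011 to May 2012), for illustration only.
dumps = [date(2011, m, 1) for m in range(10, 13)] + [date(2012, m, 1) for m in range(1, 6)]

retained = snapshots_to_keep(dumps)
pruned = snapshots_to_prune(dumps)
```

With these sample dates, the five snapshots from 2012 are retained and the three from late 2011 become candidates for removal. A mirror that keeps every previous dump, as Your.org is said to do, would simply never prune.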

Increasing the number of mirrors, which is made possible by the free licensing of Wikimedia wikis, helps to ensure that content is sufficiently accessible and geographically dispersed to survive natural and man-made disasters. While multiple websites do host live copies of the English and other major Wikipedias, smaller wikis do not enjoy such protection, so dump mirroring is particularly useful for them. The English Wikipedia was once in the same position: its 2001 articles were long thought to be lost until old backups were uncovered in December 2010. Theoretically, dump mirrors could also offer better download speeds at times of peak usage, but that is unlikely to be a primary use case for Wikimedia wikis.

Of course, not everyone is so concerned at the possibility that Wikimedia's content might be destroyed in the immediate future, dump mirrors or no dump mirrors. As WMF Lead Platform Architect Tim Starling commented in a 2011 discussion of forking Wikipedia, "the chance of [WMF financial collapse] appears to be vanishingly small, and shrinking as the Foundation gets larger. If there was some financial problem, then we would have plenty of warning and plenty of time to plan an exit strategy. The technical risks (meteorite strike etc.) are also receding as we grow larger". That discussion focussed rather less on the technical aspects of making Wikimedia content indestructible, and more on allowing separate communities to emerge if Wikimedia communities broke up.

In brief

Signpost poll
Bugzilla
You can now give your opinion on next week's poll: Which of the following do you consider the greatest threat to Wikipedia?

Not all fixes may have gone live to WMF sites at the time of writing; some may not be scheduled to go live for many weeks.

  1. JYBot, modifying, adding and removing interwiki links. At the time of writing, 16 BRFAs are active. As usual, community input is encouraged.

Discuss this story


I know I'm going to regret asking this, but how does the WMF growing larger reduce the risk of a meteorite strike? Kaldari (talk) 04:37, 22 May 2012 (UTC)[reply]

Expanding to multiple locations reduces the possibility of a natural catastrophe or other major disruption in one location resulting in permanent major data loss... AnonMoos (talk) 05:29, 22 May 2012 (UTC)[reply]
Ah, I was thinking people rather than servers :) Kaldari (talk) 06:54, 22 May 2012 (UTC)[reply]
AnonMoos, that's an annoyingly logical answer. How terribly disappointing. :) You couldn't have trolled Kaldari just a little? Philippe Beaudette, Wikimedia Foundation (talk) 07:07, 22 May 2012 (UTC)[reply]
Totally agree Philippe, it should have been evident to Kaldari that the extra glow from the servers with more electrons spinning faster at different places is obviously going to cause a small percentage of meteorites to be deflected by this increased charge; alternatively they will have a committee meeting, note our increased vigilance and after weeks of debate, an RFC and a vote, reach a true consensus and decide to bombard some other planetoid. I'm really surprised that the two obvious scenarios needed to be better elucidated. — billinghurst sDrewth 13:42, 23 May 2012 (UTC)[reply]
Expanding the dissemination of knowledge increases societal preparedness against meteor catastrophe by increasing the likelihood that the world will produce the educated sorts of people who could avert or lessen extraterrestrial crisis. When anyone contributes anything to any WMF project, world access to information increases, and thus the educated base from which meteor experts come also increases. Blue Rasberry (talk) 18:50, 22 May 2012 (UTC)[reply]
As Wikimedia projects expand with ever-increasing quantities of cruft, discerning meteors will turn their attention elsewhere. ~ 66.81.244.216 (talk) 19:35, 22 May 2012 (UTC)[reply]
Simples. Per WP:NASTRO we are redirecting many of the minor planets. Rich Farmbrough, 14:24, 24 May 2012 (UTC).[reply]
I wonder, are these dumps/forks accessible to the public or just stored on the server cluster? --Nathan2055talk 00:02, 23 May 2012 (UTC)[reply]
http://dumps.wikimedia.org is the (publicly accessible) Wikimedia site; I tried the HTTP versions of the dump mirrors and they seem to be public too (as one would expect). Not sure about older dumps nor FTP accessible credentials but I suspect both are conducive to public access. - Jarry1250 [Deliberation needed] 10:44, 23 May 2012 (UTC)[reply]

I'm going to speculate that the encyclopedia written by humans, for humans, won't be much use if there aren't any humans around. Still, this does raise the idea that we should see if we can get a full Wikipedia dump placed on board the next moon landing mission. That way the survivors may be able to recover the information in a few millennia. Or is that just too daft? Regards, RJH (talk) 18:43, 23 May 2012 (UTC)[reply]

You may also want to look through WP:TERMINAL for some ideas. Kaldari (talk) 18:53, 23 May 2012 (UTC)[reply]
someone should send a dump to Millennium Seed Bank Project. SYSS Mouse (talk) 21:29, 23 May 2012 (UTC)[reply]
  • The idea that WMF is "safer" as it gets bigger is fallacious. When it was an $800k per annum organization it was unlikely to fail to raise the required funds, it was not a viable target for lawsuits designed to make money, and the whole system was amenable to being "phœnixed" for pocket money sums. Of course this idea that large is strong is long-standing, but we can cite (recently) GM, Ford, Fannie Mae, Enron, Telewest, many large banks and even sovereign governments that have either gone bust or needed rescue. The good prognosis of the projects as they exist is probably primarily due to the open licensing. Rich Farmbrough, 13:13, 24 May 2012 (UTC).[reply]
  • So are there also regular offsite dumps of Wikimedia Commons? This might be the Achilles heel: many templates rely on images, and most articles will look spartan without images, sound and video files. Targaryenspeak or forever remain silent 18:20, 26 May 2012 (UTC)[reply]
I checked when the last dump of enwiki was and unfortunately the last one was done in 2010. I think we should raise the dump creation interval for enwiki to, say, once a year or so. --Nathan2055talk 20:39, 29 May 2012 (UTC)[reply]
Hmm? Or did you mean images? - Jarry1250 [Deliberation needed] 20:43, 29 May 2012 (UTC)[reply]



       

The Signpost · written by many · served by Sinepost V0.9 · 🄯 CC-BY-SA 4.0