The Signpost

News and notes

Privacy policy debate gears up

By The ed17 and Andrew Gray

On September 3, the Wikimedia Foundation published a draft of a new privacy policy, launching the second stage of its effort to overhaul the policy that applies to most Wikimedia sites, including Wikipedia.

The first round of deliberations started in mid-June with an open call for input, but was overshadowed by the PRISM debate. The overall aim is to replace the current, aging policy developed in 2008 by WMF's then-General Counsel Mike Godwin with one that accounts for changes in the legal and technological environment since then.

The second consultation broadly resembles the Terms of Use update in 2011–12, in which more than 120 issues were examined over the course of several months. For that process the legal department released only an English-language draft; the new privacy policy draft, by contrast, has also been published in Arabic, French, German, Japanese, and Spanish.

An early controversy was sparked by the attempt, novel in Wikimedia contexts, to use illustrations and jokes in the draft in an effort to expand the audience able and willing to read through the legal documents. Geoff Brigham, the Foundation's current General Counsel, said in the related debate on Meta that early A/B tests of banners calling for input indicated a higher click-through rate when the banners featured the legal department's mascot, Rory, than when they carried the conventional Foundation logo, including a 9:1 difference on Japanese Wikimedia sites.

The privacy policy draft is the most important of a series of ongoing and upcoming legal documents to be scrutinized by the community. Alongside the main draft, the WMF has published a proposed access to non-public information policy, governing the rights and duties of CheckUsers, support team members, and others who handle non-public user data. Future plans include data retention guidelines spelling out the Foundation's data collection and retention practices under the new privacy policy, and a transparency report disclosing, among other things, how often the Foundation is approached by third parties to hand over user information, where those demands come from, and how often the Foundation complies.

In brief


Discuss this story


Article creation bot finishes first run

Not only is Swedish Wikipedia now overflowing with hundreds of thousands of stubs based on outdated taxonomic information, but all of this outdated information is now being copied en masse to other language wikis and Wikidata, from which it will eventually work its way to English Wikipedia, polluting our hand-built mostly-up-to-date taxonomic data with boatloads of crap.

  • First rule of article creation bots: Never build article creation bots.
  • Second rule of article creation bots: If you're going to build articles based on 3rd party databases, only use the most specific, specialized, up-to-date databases possible, not huge, generalized databases that don't bother to keep their data up-to-date.

Kaldari (talk) 22:26, 6 September 2013 (UTC)

As an active Wikidata user, I'm always highly concerned with the prospect of bad information coming over to Wikidata. Where is the bot operator's plan for Wikidata published? Sven Manguard Wha? 04:27, 7 September 2013 (UTC)

I agree completely. Even with all the resources of English Wikipedia, we have hundreds of thousands of poorly maintained and poorly watched articles that were created either by bots or by users working in systematic ways. A smaller wiki has no chance of maintaining hundreds of thousands of micro-stubs. I think the root of the problem is inherent or assumed notability for certain classes of things, but as long as we have these flawed notability standards, we need to at least use discretion in proliferating such articles, weighing the resources required to maintain thousands of micro-stubs against fewer summary-style articles. Gigs (talk) 15:10, 9 September 2013 (UTC)
I agree that taxonomic bots create a lot of crap; look at this list of suspected duplicates on Wikidata.
But, in defense of the Swedes, their Swedish lakes project is a rather good example of bot generation of articles: take several reliable sources, prepare the data, get consensus, and generate articles that people can then expand with more text and photos.
And I really like this: "Our next initiative we are working with is to get all data of Swedish communes, cities and towns 100 % correct in Wikidata (and also a semiautomatic update link to Wikidata from the Swedish statistical authorities databases). We thought our articles on these subjects were fine, but find we need to put in 6-9 month time to get the data from fine to 100% correct, and all the relevant data elements in place in Wikidata even if it only a few thousand articles . When we are ready we will have all the base data for these entities taken from Wikidata (not giving much improvement) but more important we will be able to provide 100% quality data for other language versions to semiautomatic get data (or generate articles) of these subjects, where we feel a special responsibility to secure global quality for." If other projects take the same initiative for Polish, French, German, ... communes, cities and towns, that would be a huge step for Wikidata! --Atlasowa (talk) 08:54, 11 September 2013 (UTC)

Illustrations and jokes

The Japanese have made this, so I'm not surprised by the 9:1 ratio. I'm happy that the foundation chose to experiment with that. --NaBUru38 (talk) 05:58, 7 September 2013 (UTC)



