The Signpost

Births and deaths

Wikipedia biographies in the 20th century

By Carcharoth

How are Wikipedia biography articles distributed across the 20th century? The following graph shows the number of births and deaths of people with Wikipedia articles for each year of the 20th century (1899 until 2010, actually), based on the birth-year and death-year categories.

Births and deaths of people with Wikipedia biographies, 1899 to 2010

What the data seem to show is that the number of births remains relatively steady (with a slow increase) until about 1935. After that point, presumably, the effect of recentism and the large number of biographies of living people starts to kick in: the number of people with Wikipedia articles born in each year increases, with a spike in 1947 (from the post-war baby boom, perhaps?). The number then levels off and starts to rise dramatically from about 1970 onwards (these are people who are now about 40 years old), reaching a peak with people born in 1982 (28 years old). The figures collapse completely around 1990, when the subjects of the biographies become young enough to be children, and then tail off without ever quite disappearing until they reach zero in 2010 (no-one automatically notable has been born yet).

For deaths, there is a slow but steady increase from 1899 to about 1990, with two peaks that are clearly due to the two World Wars (the peaks are in 1918 and 1944). There is a massive increase in deaths between 1990 and 2009, peaking in the last three of those years, each of which has more than 4,500 deaths. The actual peak is 4,825 deaths in 2008; for comparison, the births peak in 1982 was 8,577. The deaths graph drops off dramatically at the end, because "only" 82 people with Wikipedia articles have died this year so far (though, rather disconcertingly, I see that has gone up to 98 in the space of a single day; the figures I have were taken on 11/01/2010). Or to put that another way: over the past three years, an average of 12 to 13 people with Wikipedia articles have died each day.
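
As a rough check on the "12 to 13 a day" figure, the yearly totals quoted above can be converted to daily averages. This is only a sketch: the 365.25-day year and the function name are assumptions made for illustration, and only the 4,500 floor and the 4,825 peak come from the article.

```python
# Yearly death counts quoted in the article; DAYS_PER_YEAR is an assumption.
DAYS_PER_YEAR = 365.25


def deaths_per_day(deaths_in_year: int) -> float:
    """Convert a yearly total of deaths into an average per day."""
    return deaths_in_year / DAYS_PER_YEAR


low = deaths_per_day(4500)   # floor quoted for each of the last three years
peak = deaths_per_day(4825)  # the 2008 peak

print(f"{low:.1f} to {peak:.1f} deaths per day")
```

Both ends of the range round to between 12 and 13, which matches the figure given in the text.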

Trying to measure how fast people are "born" (i.e. appear on Wikipedia) is not so easy to calculate, as people have to become "notable" first, and they do that at different rates (some are notable when they are born, others take a bit longer, maybe not becoming notable until after they have died). As to the peak of births in 1982 and the decline after that, it is difficult to explain why that year, precisely, is the peak; probably several factors are at work there.


Discuss this story

  • Interesting study, it clearly shows the effects of recentism and war deaths. I've been writing some biographies on 70s theater personnel lately; most of their births were before 1940, showing that there are definitely gaps in our coverage with respect to less-recent people. I'd like to see this study done again in a few years to see how the distribution changes (if at all). --Cryptic C62 · Talk 04:24, 26 January 2010 (UTC)
    "Recentism" is not a problem unique to Wikipedia. Look at any collection of biographies, & you will find it skewed towards the more recent period. We favor people who make a personal impression, who are part of our living memory, which means the problem begins 3 generations back from the current date. One of Wikipedia's strengths is that we have no limits on space: as time goes on, there is no need to remove articles about people who were important in their day, but now have become of interest only to graduate students in need of a subject for their theses. Print encyclopedias have always needed to prune their content to keep their sizes manageable. Wikipedia may never rescue every subject of note that has been denied their proper due, but at least we can prevent more from being unjustly forgotten. -- llywrch (talk) 20:33, 27 January 2010 (UTC)
    It's more a question of what records survive from earlier, and the motivation and resources to write and maintain the articles. Carcharoth (talk) 09:06, 28 January 2010 (UTC)
    There is that factor, undeniably. However, whether a person makes some kind of personal impression is far more decisive -- which can be shown by a simple experiment. I think we'd all agree that the last three centuries, back to 1700, are fairly well documented: if someone wants to write a biographical article on some notable who lived in those centuries, the resources are there. Pick a year in the last 50, look at the category of deaths for that year, then compare the number to prior years at 50 year intervals back to about 1700: the drop-off in articles is astonishing. (I did it starting with 1990, in which category there were 2404 articles, & the drop-off went like this: for 1940, 1548; for 1890, 783; for 1840, 341; for 1790, 161; for 1740, 102; & for 1690, 94.) But, as I wrote, this failing is not unique to Wikipedia: to grab an example at random, half of the subjects of Jerome's biographies in his De Viris Illustribus lived within 100 years of when he published the book -- circa AD 390. Dead people -- whether white males or not -- are just not that interesting to most people. -- llywrch (talk) 19:43, 28 January 2010 (UTC)
    I agree with you to a large extent, but you have to take the demographic transition, particularly in Europe, into account. Only if we hold these numbers up against demographic developments can we really see how presentist the project is. Lampman (talk) 08:39, 29 January 2010 (UTC)
  • An even more active MilHist BLP output would increase the peaks in deaths for the two world wars. It's interesting that we have far less reach into the participants in WW1 than WW2. The peak in births in the early 80s owes much to the coverage of popular culture. Tony (talk) 05:44, 26 January 2010 (UTC)
    I don't think you mean BLP, because BLP stands for Biographies of Living Persons, but yeah. -- greenrd (talk) 09:37, 26 January 2010 (UTC)
  • Fascinating. The spike from 1970 onwards presumably comes to a large extent from active and recently-retired athletes. Lampman (talk) 23:10, 26 January 2010 (UTC)
    Plus actors, beauty pageant winners and popstars. ϢereSpielChequers 00:50, 28 January 2010 (UTC)
    All contemporary culture, effectively. Carcharoth (talk) 09:06, 28 January 2010 (UTC)
    Beauty pageant winners make up a tiny percentage, and actors don't have such a definite cut-off point. Pop stars...yeah, maybe. Lampman (talk) 05:26, 29 January 2010 (UTC)
  • A bit OT, but what's with the DD/MM/YYYY date? ;) Interesting article. I wouldn't really say that there's anything meaningful which could be drawn from it outside of the context of Wikipedia, but it's at least an interesting tidbit of knowledge. Thanks for taking the time to compile the numbers and write it up. (And PS: there's absolutely nothing wrong with a focus on pop culture. There's a reason that it's "popular culture", you know. If it bothers you that there is more pop than not, quit complaining and start writing! Complaints about athletes, television shows, and Pokemon became tedious years ago. We're steadily becoming stodgy around here.)
    V = I * R (talk to Ohms law) 09:39, 28 January 2010 (UTC)
  • Thanks! I've rerun the stats again with five languages, rather than three - there's a more detailed discussion of what the results might imply here. Shimgray | talk | 17:40, 29 January 2010 (UTC)
  • As a response to Lampman's comment about "fewer child prodigies", there is another factor to compensate for in these stats: a number of missing birth dates, which is a significant number as late as the 1950s -- even in the developed world. (Births at home, not in a hospital or medical clinic, were not unusual in the US into that decade.) Articles about people from the lower socioeconomic classes -- actors, artists, businessmen -- are more likely to lack dates/years of birth. As a result, there is a difference between the total number of births & deaths for any generation, & the value of this difference increases as one goes back in time. -- llywrch (talk) 06:08, 31 January 2010 (UTC)
  • Mmm. And, of course, there's Category:Year of birth missing (living people) - 48,000 people whose years of birth we don't know, but who are presumably distributed somehow from 1920ish to now. Conversely, though, we have the effect of people who are well documented (ish) until they "drop off the record"; we can be reasonably confident they're dead, but we have no idea when it happened. There are around 11,000 of these, either "definitely unknown" or just not yet listed. Shimgray | talk | 16:46, 31 January 2010 (UTC)
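
The 50-year-interval counts that llywrch quotes earlier in this thread (2404 for 1990 back to 94 for 1690) can be turned into step-by-step ratios to see where the drop-off is steepest. This is a minimal sketch: the counts are copied from that comment, the function name is made up for illustration, and nothing here corrects for population growth, which is the caveat Lampman raises.

```python
# Articles in each "deaths in <year>" category, as quoted in the thread.
counts = {1990: 2404, 1940: 1548, 1890: 783, 1840: 341,
          1790: 161, 1740: 102, 1690: 94}


def step_ratios(by_year: dict[int, int]) -> dict[tuple[int, int], float]:
    """For each 50-year step back, the earlier count as a fraction of the later one."""
    years = sorted(by_year, reverse=True)
    return {
        (later, earlier): by_year[earlier] / by_year[later]
        for later, earlier in zip(years, years[1:])
    }


ratios = step_ratios(counts)
for (later, earlier), ratio in ratios.items():
    print(f"{earlier} vs {later}: {ratio:.2f}")

# Overall: the 1690 count as a fraction of the 1990 count.
overall = counts[1690] / counts[1990]
print(f"1690 vs 1990: {overall:.3f}")
```

On these figures the count roughly halves at each step between 1940 and 1790, and then flattens out before 1740.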

Someone asked how many of these biographies get created because of media attention around their death, so I took it upon myself to look at article creation on dead people directly after their death. A look at deaths in January 2009 shows 13.26 deaths a day, which corresponds quite well with the above numbers.

I then looked at the month from 29 December 2009 till 28 January 2010 (I excluded 12 January, because of the extraordinarily high death toll from the Haiti earthquake: 32). This gave me an average of 12.76 a day, which is not too far below the number from January last year. However, this contains some red links, and these get deleted after one month (the ones from 28 December have already been culled). Of my sample, 2.83 a day were red links. That leaves 9.93 live links per day, 3.33 fewer than the average from last year. So, accordingly, 3 to 4 biographies of dead people a day will have to be created over the next year to get to the normal level. I'll try to also put up a graph of how the blue-to-red ratio moves over the course of the month. Lampman (talk) 06:48, 29 January 2010 (UTC)
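
The arithmetic in this comment can be laid out as a short sketch. The three input figures are the ones quoted above; the variable names are illustrative only, and results are rounded to two decimals to match the precision used in the comment.

```python
# Daily averages quoted in the comment above.
baseline_per_day = 13.26  # deaths listed per day, January 2009
sample_per_day = 12.76    # deaths listed per day, 29 Dec 2009 to 28 Jan 2010
red_per_day = 2.83        # of those, entries that are still red links

# Entries that already have an article (blue links).
blue_per_day = round(sample_per_day - red_per_day, 2)

# Articles per day still to be written to match the January 2009 level.
shortfall_per_day = round(baseline_per_day - blue_per_day, 2)

print(f"blue links per day: {blue_per_day:.2f}")
print(f"shortfall per day:  {shortfall_per_day:.2f}")
```

The shortfall of about 3.33 a day is where the "3 to 4 biographies a day" estimate comes from.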

Here's the chart. Lampman (talk) 07:42, 29 January 2010 (UTC)
  • Thanks for all the responses, in particular those looking at the trends in other languages, and the various blog posts that have resulted from this. Even more fascinating than I first thought! Carcharoth (talk) 23:41, 29 January 2010 (UTC)

