More on birthday probabilities

[This article was first published on Revolutions, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Last week, Joe Rickert used R and four years of US Census data to create an image plot of the relative probabilities of being born on a given day of the year:


Chris Mulligan also tackled this problem with R, but this time using 20 years of Census data from 1969 to 1988. Chris extracted the birthday frequencies using Google BigQuery, and charted the results with the time series below using this R script.

Birthday probabilities

My apologies to Joe, but I much prefer this representation to the heat map. Not only is the February 29 frequency multiplied by 4 (where we see that it's not a particularly surprising birthday to have given the overall seasonal trend), but the unusual days really stand out (and are annotated). You're relatively unlikely to find someone born on January 1, July 4 or Christmas Eve or Christmas Day (most likely because fewer Caesarian births happen, or more induced natural births are avoided, on those days). December 30 is a more likely birthday that you'd otherwise expect (maybe this has something to do with getting kids into an earlier school year?). Andrew Gelman shares a model of the seasonal trend that defines these outliers. Births by Day of Year

To leave a comment for the author, please follow the link and comment on their blog: Revolutions. offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)