More on birthday probabilities

June 15, 2012

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Last week, Joe Rickert used R and four years of US Census data to create an image plot of the relative probabilities of being born on a given day of the year:

[Figure: heat map of birthday probabilities]

Chris Mulligan also tackled this problem with R, this time using 20 years of Census data, from 1969 to 1988. Chris extracted the birthday frequencies using Google BigQuery and charted the results as the time series below using this R script.

[Figure: time series of birthday probabilities by day of year]

My apologies to Joe, but I much prefer this representation to the heat map. The February 29 frequency is multiplied by 4, which shows that it's not a particularly surprising birthday to have given the overall seasonal trend, and the unusual days really stand out (and are annotated). You're relatively unlikely to find someone born on January 1, July 4, Christmas Eve or Christmas Day (most likely because fewer Caesarean births are scheduled, or more induced births are avoided, on those days). December 30 is a more likely birthday than you'd otherwise expect (maybe this has something to do with getting kids into an earlier school year?). Andrew Gelman shares a model of the seasonal trend that defines these outliers.
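To see why the February 29 count is multiplied by 4, here is a minimal R sketch. It is not Chris's script: the data are synthetic placeholders (exactly 100 births on every date from 1969 to 1988), and the names `births`, `by_day` and `adjusted` are made up for illustration. With perfectly uniform births, the raw February 29 total is a quarter of every other day's total, and the adjustment brings it back in line.

```r
# Synthetic placeholder data (an assumption, not Census figures):
# exactly 100 births on every date from 1969 through 1988.
dates <- seq(as.Date("1969-01-01"), as.Date("1988-12-31"), by = "day")
births <- data.frame(date = dates, n = 100)

# Pool the counts by calendar day (month-day) across all 20 years.
births$md <- format(births$date, "%m-%d")
by_day <- aggregate(n ~ md, data = births, FUN = sum)

# February 29 occurs in only 5 of these 20 years (1972, 1976, 1980,
# 1984, 1988), so scale its total by 4 to make it comparable.
by_day$adjusted <- ifelse(by_day$md == "02-29", by_day$n * 4, by_day$n)

# Compare the leap day with its neighbours.
print(by_day[by_day$md %in% c("02-28", "02-29", "03-01"), ])
```

The factor of 4 is exact here only because leap years make up exactly one in four of the years in this span; for a different range of years the ratio of total years to leap years would be the safer multiplier.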

chmullig.com: Births by Day of Year
