Last week, Joe Rickert used R and four years of US Census data to create an image plot of the relative probabilities of being born on a given day of the year:

Chris Mulligan also tackled this problem with R, but this time using 20 years of Census data from 1969 to 1988. Chris extracted the birthday frequencies using Google BigQuery, and charted the results with the time series below using this R script.

My apologies to Joe, but I much prefer this representation to the heat map. Not only is the February 29 frequency multiplied by 4 to make it comparable with other days (which shows that it's not a particularly surprising birthday to have, given the overall seasonal trend), but the unusual days really stand out (and are annotated). You're relatively unlikely to find someone born on January 1, July 4, Christmas Eve or Christmas Day (most likely because fewer Caesarean deliveries are scheduled, or fewer births are induced, on those days). December 30 is a more likely birthday than you'd otherwise expect (maybe this has something to do with getting kids into an earlier school year?). Andrew Gelman shares a model of the seasonal trend that defines these outliers.
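To see where the factor of 4 comes from, here is a minimal sketch in Python (not the R script Chris used, which isn't reproduced here). It assumes the birth totals are keyed by (month, day) and divides each total by the number of years in which that date actually occurred. The function names and the toy counts are made up for illustration; they are not real Census figures.

```python
import calendar


def occurrences(month, day, first_year, last_year):
    """Count how many years in [first_year, last_year] contain the given date."""
    years = range(first_year, last_year + 1)
    if (month, day) == (2, 29):
        # February 29 only exists in leap years.
        return sum(1 for y in years if calendar.isleap(y))
    return len(years)


def per_occurrence(counts, first_year, last_year):
    """Divide each date's total births by the number of times the date occurred."""
    return {
        (month, day): total / occurrences(month, day, first_year, last_year)
        for (month, day), total in counts.items()
    }


# Hypothetical totals, chosen only to make the arithmetic easy to follow.
toy_counts = {(2, 28): 200_000, (2, 29): 50_000, (3, 1): 200_000}
rates = per_occurrence(toy_counts, 1969, 1988)

# 1969 to 1988 spans 20 years, 5 of them leap years, hence the factor 20 / 5 = 4.
leap_factor = occurrences(2, 28, 1969, 1988) / occurrences(2, 29, 1969, 1988)
```

Dividing by occurrences and multiplying February 29 by 4 differ only by a constant scale, so either puts the leap day on the same footing as its neighbours for this 20-year window. For a window with a different share of leap years, the factor would change accordingly.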

chmullig.com: Births by Day of Year

**Tags:** applications, Big Data, Data Science, R