**Wiekvoet**, and kindly contributed to R-bloggers)

I am a listener to BBC‘s podcast More or Less. In the program Tim Harford looks at data with both humour and determination to find what the numbers mean. Last week he handled a listener question. Does everybody get a significant birthday (20, 30 years etc.) in a weekend. His back of the envelope answer was, yes, we think so. Given the regularity by which days shift over the years, it should happen by 60 latest. If anybody has this at 70, please contact us. This blog post tries to answer by which youngest age anybody born in the previous century has a significant birthday in the weekend and concludes that indeed by 60 latest it should have happened. Not surprising, 20 years has 2/7=28% of days. Somewhat surprising, 30, 40 and 50 are approximately equally probable (3/14=21%).

Additional note. I was not the only one who did this calculation. This weekend he mentioned some other listeners who emailed equivalent results to him.

### Code

# days contains all days as month-day in a year.

days <- format(as.Date(‘1900-01-01′,format=’%Y-%m-%d’)+0:364,’%m-%d’)

tail(days)

[1] “12-26″ “12-27″ “12-28″ “12-29″ “12-30″ “12-31″

# does a year day combination fall in weekend,

# format makes Sunday to ‘0’, Saturday to ‘6’

isspecday <- function(y,d) {

i <- paste(y,d,sep=’-‘)

format(as.Date(i,format=’%Y-%m-%d’),’%w’) %in% c(‘0′,’6′)

}

# which years have a weekend, for a given day (month day combination)

sourcedata <- function(dayno) {

year <- 1900:2100

days <- isspecday(year,days[dayno])

data.frame(year=year,sunday=days)

}

# example – use fifth day

sd1 <- sourcedata(5)

# reorganize in matrix so 20 , 30.. years after get next to each other

sd2 <- sapply(seq(20,80,10),function(x) sd1$sunday[(1:100)+x])

# make columns to ages

# other years get 1000, so min() can extract minimum age

sd3 <- sd2 * rep(1,100) %o% seq(20,80,10)

sd3[sd3==0] <- 1000

tail(sd3)

[,1] [,2] [,3] [,4] [,5] [,6] [,7]

[95,] 20 1000 1000 1000 1000 70 1000

[96,] 1000 30 1000 1000 1000 1000 80

[97,] 1000 1000 40 1000 1000 1000 80

[98,] 1000 1000 1000 50 1000 1000 1000

[99,] 1000 1000 1000 50 60 1000 1000

[100,] 20 1000 1000 1000 60 70 1000

# result, ages

table(apply(sd3,1,min))

20 30 40 50 60

28 22 21 22 7

# make a wrapper, i to become day

goodyear <- function(i) {

sd1 <- sourcedata(i)

sd2 <- sapply(seq(20,80,10),function(x) sd1$sunday[(1:100)+x])

sd3 <- sd2 * rep(1,100) %o% seq(20,80,10)

sd3[sd3==0] <- 1000

table(apply(sd3,1,min))

}

# and apply

sa <- sapply(1:365,goodyear)

# for brievety, only show last days

tail(t(sa))

20 30 40 50 60

[360,] 28 22 22 21 7

[361,] 28 22 21 22 7

[362,] 28 21 22 21 8

[363,] 29 21 21 22 7

[364,] 29 21 21 22 7

[365,] 29 22 21 21 7

# overall

colSums(t(sa))/365

20 30 40 50 60

28.572603 21.430137 21.424658 21.430137 7.142466

**leave a comment**for the author, please follow the link and comment on their blog:

**Wiekvoet**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...