Significant birthdays in the weekend

May 25, 2014
By

(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)

I am a listener to BBC's podcast More or Less. In the program Tim Harford looks at data with both humour and determination to find what the numbers mean. Last week he handled a listener question. Does everybody get a significant birthday (20, 30 years etc.) in a weekend. His back of the envelope answer was, yes, we think so. Given the regularity by which days shift over the years, it should happen by 60 latest. If anybody has this at 70, please contact us. This blog post tries to answer by which youngest age anybody born in the previous century has a significant birthday in the weekend and concludes that indeed by 60 latest it should have happened. Not surprising, 20 years has 2/7=28% of days. Somewhat surprising, 30, 40 and 50 are approximately equally probable (3/14=21%).
Additional note. I was not the only one who did this calculation. This weekend he mentioned some other listeners who emailed equivalent results to him.

Code

# days contains all days as month-day in a year.
days <- format(as.Date('1900-01-01',format='%Y-%m-%d')+0:364,'%m-%d')
tail(days)
[1] "12-26" "12-27" "12-28" "12-29" "12-30" "12-31"

# does a year day combination fall in weekend,
# format makes Sunday to '0', Saturday to '6'
isspecday <- function(y,d) {
i <- paste(y,d,sep='-')
format(as.Date(i,format='%Y-%m-%d'),'%w') %in% c('0','6')
}

# which years have a weekend, for a given day (month day combination)
sourcedata <- function(dayno) {
year <- 1900:2100
days <- isspecday(year,days[dayno])
data.frame(year=year,sunday=days)
}

# example - use fifth day
sd1 <- sourcedata(5)
# reorganize in matrix so 20 , 30.. years after get next to each other
sd2 <- sapply(seq(20,80,10),function(x) sd1$sunday[(1:100)+x]) # make columns to ages # other years get 1000, so min() can extract minimum age sd3 <- sd2 * rep(1,100) %o% seq(20,80,10) sd3[sd3==0] <- 1000 tail(sd3) [,1] [,2] [,3] [,4] [,5] [,6] [,7] [95,] 20 1000 1000 1000 1000 70 1000 [96,] 1000 30 1000 1000 1000 1000 80 [97,] 1000 1000 40 1000 1000 1000 80 [98,] 1000 1000 1000 50 1000 1000 1000 [99,] 1000 1000 1000 50 60 1000 1000 [100,] 20 1000 1000 1000 60 70 1000 # result, ages table(apply(sd3,1,min)) 20 30 40 50 60 28 22 21 22 7 # make a wrapper, i to become day goodyear <- function(i) { sd1 <- sourcedata(i) sd2 <- sapply(seq(20,80,10),function(x) sd1$sunday[(1:100)+x])
sd3 <- sd2 * rep(1,100) %o% seq(20,80,10)
sd3[sd3==0] <- 1000
table(apply(sd3,1,min))
}
# and apply
sa <- sapply(1:365,goodyear)
# for brievety, only show last days
tail(t(sa))
20 30 40 50 60
[360,] 28 22 22 21  7
[361,] 28 22 21 22  7
[362,] 28 21 22 21  8
[363,] 29 21 21 22  7
[364,] 29 21 21 22  7
[365,] 29 22 21 21  7
# overall
colSums(t(sa))/365
20        30        40        50        60
28.572603 21.430137 21.424658 21.430137  7.142466