# Births and week-ends, in France

May 19, 2012
By

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

This week, I have seen on the internet (sorry, I cannot find proper references) the graph produced here on the right: which birthday is most likely ? The fact that I have no further information is important, since I do not know in which country such a graph was obtained. At least, I know it should not be France…

In France, I have already mentioned that there is a strong week-end effect: nowadays, there is 25% less deliveries during week-ends than during the week. Calot (1981) observed already that there were less deliveries on Sundays. This has been confirmed more recently, e.g. in http://www.lepoint.fr/ or http://www.prepabl.fr/, with a significant difference between week days, and week-ends. Here  is the number of birth per day, over 40 years, with in blue the average trend during the week, and in red, during week-ends,

```naissance=read.table(
"http://freakonometrics.free.fr/naissanceFR2.txt")
attach(naissance)
date=as.Date(date)
plot(date, nbre,cex=.5)
t2=as.POSIXlt(date)
jour=t2\$wday
X=naissance\$date
Y=naissance\$nbre
J=jour
df=data.frame(X,Y,J)
library(splines)
regs=lm(Y~bs(X,df=20),data=df[jour%in%c(0,6),])
Yp=predict(regs,newdata=df)
lines(X,Yp,col="red",lwd=3)
regs=lm(Y~bs(X,df=20),data=df[jour%in%1:5,])
Yp=predict(regs,newdata=df)
lines(X,Yp,col="blue",lwd=3)```

If we look at the evolution of the ratio week-ends over weeks days, we have the following graph

```t2=as.POSIXlt(date)
jour=t2\$wday
jour=jour[1:(1982*7)]
nbre2=jour
for(i in 1:1982){
taux=sum(nbre[6:7+7*(i-1)])/
sum(nbre[1:5+7*(i-1)])/2*5
nbre2[1:5+7*(i-1)]=nbre[1:5+7*(i-1)]*taux
nbre2[6:7+7*(i-1)]=nbre[6:7+7*(i-1)]
nbre2[1:7+7*(i-1)]=
mean(nbre[1:7+7*(i-1)])/mean(nbre2[1:7+7*(i-1)])*
nbre2[1:7+7*(i-1)]
}
nbretaux=jour
for(i in 1:1982){
taux=sum(nbre[6:7+7*(i-1)])/
sum(nbre[1:5+7*(i-1)])/2*5
nbretaux[1:7+7*(i-1)]=taux
}
plot(date[1:length(nbre2)],nbretaux)
X= date[1:length(nbre2)]
Y=nbretaux
library(splines)
reg=lm(Y~bs(X,df=20))
Yp=predict(reg)
lines(X,Yp,col="red",lwd=3)```

In the beginning of the 70’s, during week-ends, there were 5% less deliveries, but 25% less around 2000. It is then possible to produce the same kind of graphs as the one above, per year of birth. And here, we clearly observe the importance of the week end effect (maybe also because of color choice)

```naissance=read.csv(
"http://freakonometrics.free.fr/naissanceFR.csv",
sep=";")
M=as.matrix(naissance[,3:ncol(naissance)])
BIRTH=as.vector(t(M))
YEAR=rep(1968:2005,each=12*31)
MONTH=rep(rep(1:12,each=31),38)
DAY=rep(1:31,12*38)
X=NA
for(y in 1968:2005){
sbase=base[YEAR==y,]
X=c(X,sbase\$BIRTH/sum(sbase\$BIRTH,
na.rm=TRUE))
}
base=data.frame(YEAR,MONTH,DAY,
BIRTH,BIRTHDAYPROB=X[-1])

m1=min(base\$BIRTHDAYPROB,na.rm=TRUE)
m2=max(base\$BIRTHDAYPROB,na.rm=TRUE)
y=1980
colr=rev(heat.colors(100))
sbase=base[YEAR==y,]
plot(0:1,0:1,col="white",xlim=c(-1,12),
ylim=c(-31,1),axes=FALSE,xlab=
paste("Naissance en",y,sep=" "),ylab="")
for(x in 1:nrow(sbase)){
a=sbase\$MONTH[x];b=sbase\$DAY[x]
polygon(c(a-.9,a-.9,a-.1,a-.1),-c(b-.9,b-.1,
b-.1,b-.9),col=colr[(sbase\$BIRTHDAYPROB[x]-m1)/
(m2-m1)*100],border=NA)
}
text((1:12)-.5,.5,c("J","F","M","A","M","J","J",
"A","S","O","N","D"),cex=.7)
text(-.5,-(1:31)+.5,1:31,cex=.7)```

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

Tags: , , , , , ,