(This article was first published on **Freakonometrics » R-english**, and kindly contributed to R-bloggers)

On Twitter, I was asked whether any serious research papers had been published on coffee consumption and labour productivity. There are some papers on coffee breaks and productivity, e.g. Productivity Through Coffee Breaks, but I could not find anything on coffee consumption itself. Since I could not find any dataset on personal consumption (maybe I should start keeping track of my own consumption to run a study), I looked for data on national consumption instead (even if we know that the two are, clearly, not equivalent):

- last year, Sabine published on http://backreaction.blogspot.fr/ a dataset with coffee consumption per country (and per inhabitant),
- on http://en.wikipedia.org/ we can find a dataset with GDP per hour worked for some countries (which can be seen as a common measure of a country's productivity).
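Combining two per-country tables like these comes down to a join on the country name. Here is a minimal sketch with base R's `merge()`, using invented toy tables. The names `coffee` and `productivity`, the column names and all values are hypothetical and are not taken from either source.

```r
# Hypothetical toy tables; the values are invented for illustration only.
coffee <- data.frame(
  country = c("A", "B", "C", "D"),
  coffee_kg_per_capita = c(9.5, 6.1, 3.4, 1.2)
)
productivity <- data.frame(
  country = c("A", "B", "D", "E"),
  gdp_per_hour = c(55.0, 48.3, 30.7, 41.9)
)

# Inner join on the shared key: only countries present in both tables are kept.
merged <- merge(coffee, productivity, by = "country")

# Left join: every coffee row is kept, and countries without a
# productivity figure get NA in gdp_per_hour.
left <- merge(coffee, productivity, by = "country", all.x = TRUE)

# Dropping the NA rows gives back the same countries as the inner join.
complete <- left[!is.na(left$gdp_per_hour), ]

print(merged)
```

A left join followed by an explicit `is.na()` filter makes it easy to see which countries are lost for lack of a match, which is useful when country names are spelled differently in the two sources.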

If we merge those two datasets, we get

    # Read the merged dataset (semicolon-separated, decimal comma)
    base=read.table(
      "http://freakonometrics.free.fr/cafe.csv",
      header=TRUE,sep=";",dec=",")
    # Keep only the rows where GDP.PPP is available
    b=base[!is.na(base$GDP.PPP),]
    # Scatterplot of the two variables, labelled with the first column
    plot(b[,3],b[,4],xlab="Coffee Consumption",
      ylab="GDP per hour worked")
    text(b[,3],b[,4]+1.6,b[,1],cex=.6)
    # Spline regression, with the fitted curve drawn in red
    library(splines)
    X=b[,3]
    Y=b[,4]
    B=data.frame(X,Y)
    reg=glm(Y~bs(X),data=B)
    y=predict(reg,newdata=data.frame(
      X=seq(0,10,by=.1)))
    lines(seq(0,10,by=.1),y,col="red")
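To see what the `glm(Y~bs(X))` step does without downloading anything, here is a small sketch on simulated data. The data-generating formula, the sample size and the names `reg_lm`, `grid`, `fit` and `fit_lm` are assumptions made for illustration, not part of the original analysis. With its default arguments, `bs()` builds a cubic B-spline basis with no interior knots, and `glm()` with its default gaussian family is an ordinary least-squares fit, so it should agree with `lm()`.

```r
library(splines)

set.seed(1)
# Simulated data standing in for (consumption, productivity) pairs.
X <- runif(40, 0, 10)
Y <- 20 + 8 * sqrt(X) + rnorm(40, sd = 3)
B <- data.frame(X, Y)

# Default bs(): cubic B-spline basis, no interior knots (3 basis columns).
# Default glm() family is gaussian with identity link, i.e. least squares.
reg <- glm(Y ~ bs(X), data = B)
reg_lm <- lm(Y ~ bs(X), data = B)

# Predict on a grid inside the observed range of X, so that no point
# falls beyond the boundary knots of the spline basis.
grid <- data.frame(X = seq(min(X), max(X), length.out = 50))
fit <- predict(reg, newdata = grid)
fit_lm <- predict(reg_lm, newdata = grid)

# Both fits should coincide up to numerical tolerance.
same_fit <- isTRUE(all.equal(unname(fit), unname(fit_lm)))
print(same_fit)

plot(X, Y)
lines(grid$X, fit, col = "red")
```

Restricting the prediction grid to the observed range is a deliberate choice here: `bs()` warns when asked to evaluate the basis beyond its boundary knots, and the fitted curve is unreliable there anyway.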
