(This article was first published on **Freakonometrics - Tag - R-english**, and kindly contributed to R-bloggers)

Just after arriving in Montréal, at the beginning of September, I
discussed the statistics of my blog, and said that it was possible, or
even likely, that by New Year's Eve over a million pages would have been
viewed on my blog (from Google's counter, here). By the end of October (here) I was very optimistic, but by mid-December (here) the challenge looked likely to fail. And indeed, the million-page target was hit one week late, on January 8th.

    # Cumulative page views of the previous blog (tab-separated file)
    base=read.table("http://freakonometrics.blog.free.fr/public/data/million1.csv",sep="\t",header=TRUE)
    X1=cumsum(base$nombre)
    X0=X1
    # Cumulative page views of the second file, added to the first
    base=read.table("http://freakonometrics.blog.free.fr/public/data/million2.csv",sep="\t",header=TRUE)
    X2=cumsum(base$nombre)
    X=X1+X2
    # One date per observation, starting the day after 8 November 2008
    D0=as.Date("08/11/2008","%d/%m/%Y")
    D=D0+1:length(X1)
    plot(D,X1,xlim=c(as.Date("08/06/2010","%d/%m/%Y"),as.Date("08/02/2011","%d/%m/%Y")),ylim=c(800000,1050000))
    # Red lines: the one-million target and the January 1st deadline
    abline(h=1000000,col="red")
    abline(v=as.Date("01/01/2011","%d/%m/%Y"),col="red")
    points(D,X,col="blue")

Again, the **black** points are from the previous blog (http://blogperso.univ-rennes1.fr/arthur.charpentier/), which was transferred to the new one (http://freakonometrics.blog.free.fr) this Autumn. So I simply sum the stats of the two to get the blue points.
At each date, I fit an ARIMA model, use it to forecast the
total number of pages viewed on January 1st, and compute the
probability of reaching a million page views by that date (using a
Gaussian ARIMA model). Actually, I changed the challenge a little
here, and asked "what would have been the probability of reaching a million page views by January 1st, and by January 8th?"
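
As a sketch of the calculation, in notation that is mine rather than the original post's: write $X_t$ for the cumulative number of page views at day $t$, $T$ for the target date, and $\hat{X}_{T|t}$ and $\hat{\sigma}_{T|t}$ for the ARIMA point forecast and its standard error computed from the data up to day $t$. Under the Gaussian assumption, the probability of having reached a million by date $T$ is approximately

$$
\mathbb{P}\left(X_T \geq 10^6 \mid X_1,\dots,X_t\right) \approx 1-\Phi\left(\frac{10^6-\hat{X}_{T|t}}{\hat{\sigma}_{T|t}}\right),
$$

where $\Phi$ is the standard normal distribution function. This is the quantity that the expression `1-pnorm(1000000,forecast$pred[k],forecast$se[k])` evaluates in R.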

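    kt=which(D==as.Date("01/06/2010","%d/%m/%Y"))
    Xbase=X
    X=X1+X2
    T1=as.Date("01/01/2011","%d/%m/%Y")
    T2=as.Date("08/01/2011","%d/%m/%Y")
    # One probability per fitting date, from 1 June 2010 to the last observation
    # (the loop stops at the last observed day, so X and D are never indexed past their end)
    n=length(X)-kt+1
    P1=P2=rep(NA,n)
    for(h in 0:(n-1)){
      # Refit on the data available up to day kt+h
      model <- arima(X[1:(kt+h)],c(7,1,7),method="CSS")
      # Forecast 300 days ahead so that both target dates are always covered
      forecast <- predict(model,300)
      u=D[kt+h]+1:300
      # Probability of exceeding one million on each target date, if still ahead
      k=which(u==T1)
      if(length(k)==1) P1[h+1]=1-pnorm(1000000,forecast$pred[k],forecast$se[k])
      k=which(u==T2)
      if(length(k)==1) P2[h+1]=1-pnorm(1000000,forecast$pred[k],forecast$se[k])
    }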
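
For readers who cannot download the data files, the same idea can be tried on simulated data. The following is only a minimal sketch under my own assumptions: the daily counts are simulated from a Poisson distribution, the names `daily`, `views` and `prob_reach` are hypothetical, and the model is a small ARIMA(1,1,0) with a linear drift term (passed through `xreg`) rather than the ARIMA(7,1,7) used above.

```r
# Sketch only: simulated data and hypothetical names, not the blog's real counts.
set.seed(1)
daily <- rpois(400, lambda = 2500)   # simulated daily page views
views <- cumsum(daily)               # cumulative total, playing the role of X

# Probability that the cumulative total exceeds `target` h days ahead,
# from a Gaussian ARIMA(1,1,0) forecast with a linear drift (time as regressor).
prob_reach <- function(x, target, h) {
  n   <- length(x)
  tt  <- 1:n
  fit <- arima(x, order = c(1, 1, 0), xreg = tt)
  fc  <- predict(fit, n.ahead = h, newxreg = (n + 1):(n + h))
  1 - pnorm(target, mean = fc$pred[h], sd = fc$se[h])
}

# Hypothetical threshold: the current total plus 30 days at the average rate
target <- views[length(views)] + 30 * 2500
p <- prob_reach(views, target, h = 30)
print(p)
```

Note that the standard error returned by `predict` does not account for the uncertainty in the estimated coefficients, so probabilities computed this way tend to be somewhat overconfident.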
