(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)
50 days ago, I published a post, here, on forecasting techniques. I was wondering what could be the probability to have, by the end of this year, one million pages viewed (from Google Analytics) on this blog. Well, initially, it was on my blog at the Université de Rennes 1 (http://blogperso.univ-rennes1.fr/arthur.charpentier/), but since I transfered the blog, I had to revise my code. Initially, I had that kind of graphs,



Here is the updated graph, with the blog à Université de Rennes 1 (still in black) and the one here (in blue, where I add the two blogs together).


> base=read.table("http://perso.univ-rennes1.fr/arthur.charpentier/million2.csv",
+ sep=";",header=TRUE)
> X2=cumsum(base$nombre)
> X=X1+X2
> kt=which(D==as.Date("01/06/2010","%d/%m/%Y"))
> D0=as.Date("08/11/2008","%d/%m/%Y")
> D=D0+1:length(X1)
> P=rep(NA,(length(X)-kt)+1)
> for(h in 0:(length(X)-kt)){
+ model <- arima(X[1:(kt+h)],c(7 , # partie AR
+ 1, # partie I
+ 7),method="CSS") # partie MA
+ forecast <- predict(model,200)
+ u=max(D[1:kt+h])+1:300
+ k=which(u==as.Date("01/01/2011","%d/%m/%Y"))
+ (P[h+1]=1-pnorm(1000000,forecast$pred[k],forecast$se[k]))
+ }
It has been a bit tricky, since I wanted an automatic fit of the ARIMA process, meaning that I had to assess a priori the orders of the ARIMA process. And I had numerical problems, since we got non stationary AR part at least at one period of time considered.... So finally I used here the CSS method which uses conditional-sum-of-squares to find starting values in the optimization procedure.

* initially I was looking for the analogous of "élucubration" in French, meaning "divagation, absurd theory" (the proper translation might be "rantings" (here) , "ravings" (here) or "wild imagining" (everywhere else here or there)). When I asked Google for a possible translation (here), I got "lucubration" which means "composed by night; that which is produced by meditation in retirement". Well, it was not initially what I intended to say, but since I usually work on my blog during the night, when I got awake by one of the girls, I decided to keep this word.... At least, I learnt something today, appart for the code mentioned above....
To leave a comment for the author, please follow the link and comment on his blog: Freakonometrics - Tag - R-english.
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series,ecdf, trading) and more...

Zero Inflated Models and Generalized Linear Mixed Models with R.
Zuur, Saveliev, Ieno (2012).