**Freakonometrics » R-english**, and kindly contributed to R-bloggers)

As discussed in the MAT8181 course, there are – at least – two kinds of non-stationary time series: those with a trend, and those with a unit-root (they will be called *integrated*). Unit root tests cannot be used to assess whether a time series is stationary, or not. They can only detect integrated time series. And the same holds for *seasonal unit root*.

In a previous post, we’ve seen that it was difficult to model hourly temperature, since most test do not reject unit roots. Consider here the average *monthly* temperature, still in Montréal, QC.

> montreal=read.table("http://freakonometrics.free.fr/temp-montreal-monthly.txt") > M=as.matrix(montreal[,2:13]) > X=as.numeric(t(M)) > tsm=ts(X,start=1948,freq=12) > plot(tsm)

For those who don’t know Montréal, Winter and Summer are very different. We can visualize monthly differences using

> month=rep(1:12,length(tsm)/12) > plot(month,as.numeric(tsm)) > lines(1:12,apply(M,2,mean),col="red",type="b",pch=19)

or, if install the uroot package, removed from the CRAN repository, we can use

> library(uroot) > bbplot(tsm)

or

> bb3D(tsm) Loading required package: tcltk

It looks like our time series is cyclic, because of the yearly seasonal pattern. The autocorrelation function is here

> acf(tsm,lag=120)

Again, this cycle can be visualized using

> persp(1948:2013,1:12,M,theta=-50,col="yellow",shade=TRUE, + xlab="Year",ylab="Month",zlab="Temperature",ticktype="detailed")

Now, the question is *is there a seasonal unit root *? This would mean that our model should be

If we forget about the autoregressive and the moving average component, we can estimate

If there is a *seasonal unit root* then should be close to 1. Somehow.

> arima(tsm,order=c(0,0,0),seasonal=list(order=c(1,0,0),period=12)) Call: arima(x = tsm, order = c(0, 0, 0), seasonal = list(order = c(1, 0, 0), period = 12)) Coefficients: sar1 intercept 0.9702 6.4566 s.e. 0.0071 2.1515

It is not far away from 1. Actually, it cannot be too close to 1. If it was, then we would get an error message…

To illustrate some interesting models, let us consider also *quarterly* temperatures,

> N=cbind(apply(montreal[,2:4],1,sum),apply(montreal[,5:7],1,sum),apply(montreal[,8:10],1,sum),apply(montreal[,11:13],1,sum)) > X=as.numeric(t(N)) > tsq=ts(X,start=1948,freq=4) > persp(1948:2013,1:4,N,theta=-50,col="yellow",shade=TRUE, + xlab="Year",ylab="Quarter",zlab="Temperature",ticktype="detailed")

(again, the aim is just to be able to write down some equations, if necessary)

Why not consider a model on the *quarterly *temperature? Something like

i.e.

where is some matrix . This model can easily be estimated,

> library(vars) > df=data.frame(N) > names(df)=paste("y",1:4,sep="") > model=VAR(df) > model VAR Estimation Results: ======================= Estimated coefficients for equation y1: ======================================= Call: y1 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const y1.l1 y2.l1 y3.l1 y4.l1 const -0.13943065 0.21451118 0.08921237 0.30362065 -34.74793931 Estimated coefficients for equation y2: ======================================= Call: y2 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const y1.l1 y2.l1 y3.l1 y4.l1 const 0.02520938 0.05288958 -0.13277377 0.05134148 40.68955266 Estimated coefficients for equation y3: ======================================= Call: y3 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const y1.l1 y2.l1 y3.l1 y4.l1 const 0.07740824 -0.21142726 0.11180518 0.12963931 56.81087283 Estimated coefficients for equation y4: ======================================= Call: y4 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const y1.l1 y2.l1 y3.l1 y4.l1 const 0.18842863 -0.31964579 0.25099508 -0.04452577 5.73228873

and matrix is here

> A=rbind( + coefficients(model$varresult$y1)[1:4], + coefficients(model$varresult$y2)[1:4], + coefficients(model$varresult$y3)[1:4], + coefficients(model$varresult$y4)[1:4]) > A y1.l1 y2.l1 y3.l1 y4.l1 [1,] -0.13943065 0.21451118 0.08921237 0.30362065 [2,] 0.02520938 0.05288958 -0.13277377 0.05134148 [3,] 0.07740824 -0.21142726 0.11180518 0.12963931 [4,] 0.18842863 -0.31964579 0.25099508 -0.04452577

Since stationary of this multiple time series is closely related to eigenvalues of this matrix, let us look at them,

> eigen(A)$values [1] 0.35834830 -0.32824657 -0.14042175 0.09105836 > Mod(eigen(A)$values) [1] 0.35834830 0.32824657 0.14042175 0.09105836

So it looks like there is no stationarity issue, here. A restricted model is the *periodic autoregressive model*, so called model, discussed by Paap and Franses

where

and

Keep in mind that this is a model, since

This model can be estimated using a specific package (one can also look at the vignette, to get a better understanding of the syntax)

> library(partsm) > detcomp <- list(regular=c(0,0,0), seasonal=c(1,0), regvar=0) > model=fit.ar.par(wts=tsq, detcomp=detcomp, type="PAR", p=1) > PAR.MVrepr(model) ---- Multivariate representation of a PAR model. Phi0: 1.000 0.000 0.000 0 -0.242 1.000 0.000 0 0.000 -0.261 1.000 0 0.000 0.000 -0.492 1 Phi1: 0 0 0 0.314 0 0 0 0.000 0 0 0 0.000 0 0 0 0.000 Eigen values of Gamma = Phi0^{-1} %*% Phi1: 0.01 0 0 0 Time varing accumulation of shocks: 0.010 0.040 0.155 0.314 0.002 0.010 0.037 0.076 0.001 0.003 0.010 0.020 0.000 0.001 0.005 0.010

Here, the characteristic equation is

so there is a (seasonal) unit root if

Which is clearly not the case, here. It is possible to perform Canova-Hansen test,

> CH.test(tsq) ------ - ------ ---- Canova & Hansen test ------ - ------ ---- Null hypothesis: Stationarity. Alternative hypothesis: Unit root. Frequency of the tested cycles: pi/2 , pi , L-statistic: 1.122 Lag truncation parameter: 5 Critical values: 0.10 0.05 0.025 0.01 0.846 1.01 1.16 1.35

The idea is that polynomial has four root, in ,

since

If we get back to monthly data, has twelve roots,

each of them having different interpretations.

Here we can have 1 cycle per year (on 12 months), 2 cycles per year (on 6 months), 3 cycles per year (on 4 months), 4 cycles per year (on 3 months), even 6 cycles per year (on 2 months). This will depend on the argument of the root, with respectively

The output of the test is here

> CH.test(tsm) ------ - ------ ---- Canova & Hansen test ------ - ------ ---- Null hypothesis: Stationarity. Alternative hypothesis: Unit root. Frequency of the tested cycles: pi/6 , pi/3 , pi/2 , 2pi/3 , 5pi/6 , pi , L-statistic: 1.964 Lag truncation parameter: 20 Critical values: 0.10 0.05 0.025 0.01 2.49 2.75 2.99 3.27

And it looks like we reject the assumption of a *seasonal *unit root. I can even mention the following testing procedure

> library(forecast) > nsdiffs(tsm, test="ch") [1] 0

where the ouput: “1″ means that there is a seasonal unit root and “0″ that there is no seasonal unit root. Simple to read, isn’t it? If we consider the periodic autoregressive model on the monthly data, the output is

> model=fit.ar.par(wts=tsm, detcomp=detcomp, type="PAR", p=1) > model ---- PAR model of order 1 . y_t = alpha_{1,s}*y_{t-1} + alpha_{2,s}*y_{t-2} + ... + alpha_{p,s}*y_{t-p} + coeffs*detcomp + epsilon_t, for s=1,2,...,12 ---- Autoregressive coefficients. s=1 s=2 s=3 s=4 s=5 s=6 s=7 s=8 s=9 s=10 s=11 s=12 alpha_1s 0.15 0.05 0.07 0.33 0.11 0 0.3 0.38 0.31 0.19 0.15 0.37

So, whatever the test, we always reject the assumption that there is a seasonal unit root. Which does not mean that we can not have a strong cycle! Actually, the series is almost periodic. But there is no unit root! So all of this makes sense (I hardly believe that there might be unit root – seasonal, or not – in temperatures).

Just to make sure that we get it right, consider two times series, inspired from the previous one. The first one is a periodic sequence (with a very very small noise, just to avoid problem of non-definite matrices) and the second one is clearly integrated.

> Xp1=Xp2=as.numeric(t(M)) > for(t in 13:length(M)){ + Xp1[t]=Xp1[t-12] + Xp2[t]=Xp2[t-12]+rnorm(1,0,2) + } > Xp1=Xp1+rnorm(length(Xp1),0,.02) > tsp1=ts(Xp1,start=1948,freq=12) > tsp2=ts(Xp2,start=1948,freq=12) > par(mfrow=c(2,1)) > plot(tsp1) > plot(tsp2)

see also

> par(mfrow=c(1,2)) > bb3D(tsp1) > bb3D(tsp2)

If we quickly look at those series, I would say that the first one has no unit root – even if it is not stationary, but it is because the series is periodic – while there is (are ?) unit root(s) for the second one. If we look at Canova-Hansen test, we get

> CH.test(tsp1) ------ - ------ ---- Canova & Hansen test ------ - ------ ---- Null hypothesis: Stationarity. Alternative hypothesis: Unit root. Frequency of the tested cycles: pi/6 , pi/3 , pi/2 , 2pi/3 , 5pi/6 , pi , L-statistic: 2.234 Lag truncation parameter: 20 Critical values: 0.10 0.05 0.025 0.01 2.49 2.75 2.99 3.27 > CH.test(tsp2) ------ - ------ ---- Canova & Hansen test ------ - ------ ---- Null hypothesis: Stationarity. Alternative hypothesis: Unit root. Frequency of the tested cycles: pi/6 , pi/3 , pi/2 , 2pi/3 , 5pi/6 , pi , L-statistic: 5.448 Lag truncation parameter: 20 Critical values: 0.10 0.05 0.025 0.01 2.49 2.75 2.99 3.27

I know that this package has been removed, so maybe I should not use it. Consider instead

> nsdiffs(tsp1, 12,test="ch") [1] 0 > nsdiffs(tsp2, 12,test="ch") [1] 1

Here we have the same conclusion. The first one does not have a unit root, but the second one has. But be careful: with Osborn-Chui-Smith-Birchenhall test,

> nsdiffs(tsp1, 12,test="ocsb") [1] 1 > nsdiffs(tsp2, 12,test="ocsb") [1] 1

we have the feeling that there is also a unit root in our cyclic series…

So here, on a short-frequency, we do reject the assumption of a unit root – even a seasonal one – on our temperature series. We still have our high-frequency problem to deal with, some day (but I don’t think I’ll have enough time to introduce long range dependence this session, unfortunately).

**leave a comment**for the author, please follow the link and comment on their blog:

**Freakonometrics » R-english**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...