Seasonal Unit Roots

March 26, 2014
By

(This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers)

As discussed in the MAT8181 course, there are – at least – two kinds of non-stationary time series: those with a trend, and those with a unit-root (they will be called integrated). Unit root tests cannot be used to assess whether a time series is stationary, or not. They can only detect integrated time series. And the same holds for seasonal unit root.

In a previous post, we’ve seen that it was difficult to model hourly temperature, since most test do not reject unit roots. Consider here the average monthly temperature, still in Montréal, QC.

> montreal=read.table("http://freakonometrics.free.fr/temp-montreal-monthly.txt")
> M=as.matrix(montreal[,2:13])
> X=as.numeric(t(M))
> tsm=ts(X,start=1948,freq=12)
> plot(tsm)

For those who don’t know Montréal, Winter and Summer are very different. We can visualize monthly differences using

> month=rep(1:12,length(tsm)/12)
> plot(month,as.numeric(tsm))
> lines(1:12,apply(M,2,mean),col="red",type="b",pch=19)

or, if install the uroot package, removed from the CRAN repository, we can use

> library(uroot)
> bbplot(tsm)

or

> bb3D(tsm)
Loading required package: tcltk

It looks like our time series is cyclic, because of the yearly seasonal pattern. The autocorrelation function is here

> acf(tsm,lag=120)

Again, this cycle can be visualized using

> persp(1948:2013,1:12,M,theta=-50,col="yellow",shade=TRUE,
+ xlab="Year",ylab="Month",zlab="Temperature",ticktype="detailed")

Now, the question is is there a seasonal unit root ? This would mean that our model should be

If we forget about the autoregressive and the moving average component, we can estimate

If there is a seasonal unit root then  should be close to 1. Somehow.

> arima(tsm,order=c(0,0,0),seasonal=list(order=c(1,0,0),period=12))

Call:
arima(x = tsm, order = c(0, 0, 0), seasonal = list(order = c(1, 0, 0), period = 12))

Coefficients:
        sar1  intercept
      0.9702     6.4566
s.e.  0.0071     2.1515

It is not far away from 1. Actually, it cannot be too close to 1. If it was, then we would get an error message…

To illustrate some interesting models, let us consider also quarterly temperatures,

> N=cbind(apply(montreal[,2:4],1,sum),apply(montreal[,5:7],1,sum),apply(montreal[,8:10],1,sum),apply(montreal[,11:13],1,sum))
> X=as.numeric(t(N))
> tsq=ts(X,start=1948,freq=4)
> persp(1948:2013,1:4,N,theta=-50,col="yellow",shade=TRUE,
+ xlab="Year",ylab="Quarter",zlab="Temperature",ticktype="detailed")

(again, the aim is just to be able to write down some equations, if necessary)

Why not consider a  model on the quarterly temperature? Something like

i.e.

where  is some matrix . This model can easily be estimated,

> library(vars)
> df=data.frame(N)
> names(df)=paste("y",1:4,sep="")
> model=VAR(df)
> model

VAR Estimation Results:
======================= 

Estimated coefficients for equation y1: 
======================================= 
Call:
y1 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const 

       y1.l1        y2.l1        y3.l1        y4.l1        const 
 -0.13943065   0.21451118   0.08921237   0.30362065 -34.74793931 

Estimated coefficients for equation y2: 
======================================= 
Call:
y2 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const 

      y1.l1       y2.l1       y3.l1       y4.l1       const 
 0.02520938  0.05288958 -0.13277377  0.05134148 40.68955266 

Estimated coefficients for equation y3: 
======================================= 
Call:
y3 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const 

      y1.l1       y2.l1       y3.l1       y4.l1       const 
 0.07740824 -0.21142726  0.11180518  0.12963931 56.81087283 

Estimated coefficients for equation y4: 
======================================= 
Call:
y4 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const 

      y1.l1       y2.l1       y3.l1       y4.l1       const 
 0.18842863 -0.31964579  0.25099508 -0.04452577  5.73228873

and matrix  is here

> A=rbind(
+ coefficients(model$varresult$y1)[1:4],
+ coefficients(model$varresult$y2)[1:4],
+ coefficients(model$varresult$y3)[1:4],
+ coefficients(model$varresult$y4)[1:4])
> A
           y1.l1       y2.l1       y3.l1       y4.l1
[1,] -0.13943065  0.21451118  0.08921237  0.30362065
[2,]  0.02520938  0.05288958 -0.13277377  0.05134148
[3,]  0.07740824 -0.21142726  0.11180518  0.12963931
[4,]  0.18842863 -0.31964579  0.25099508 -0.04452577

Since stationary of this multiple time series is closely related to eigenvalues of this matrix, let us look at them,

> eigen(A)$values
[1]  0.35834830 -0.32824657 -0.14042175  0.09105836
> Mod(eigen(A)$values)
[1] 0.35834830 0.32824657 0.14042175 0.09105836

So it looks like there is no stationarity issue, here. A restricted model is the periodic autoregressive model, so called  model, discussed by Paap and Franses

where

and

Keep in mind that this is a  model, since

This model can be estimated using a specific package (one can also look at the vignette, to get a better understanding of the syntax)

> library(partsm)
> detcomp <- list(regular=c(0,0,0), seasonal=c(1,0), regvar=0)
> model=fit.ar.par(wts=tsq, detcomp=detcomp, type="PAR", p=1)
> PAR.MVrepr(model)
----
    Multivariate representation of a PAR model.

  Phi0:

  1.000  0.000  0.000 0
 -0.242  1.000  0.000 0
  0.000 -0.261  1.000 0
  0.000  0.000 -0.492 1

  Phi1:

 0 0 0 0.314
 0 0 0 0.000
 0 0 0 0.000
 0 0 0 0.000

  Eigen values of Gamma = Phi0^{-1} %*% Phi1:
0.01 0 0 0 

  Time varing accumulation of shocks:

 0.010 0.040 0.155 0.314
 0.002 0.010 0.037 0.076
 0.001 0.003 0.010 0.020
 0.000 0.001 0.005 0.010

Here, the characteristic equation is

so there is a (seasonal) unit root if

Which is clearly not the case, here. It is possible to perform Canova-Hansen test,

> CH.test(tsq)

  ------ - ------ ----
  Canova & Hansen test
  ------ - ------ ----

  Null hypothesis: Stationarity.
  Alternative hypothesis: Unit root.
  Frequency of the tested cycles: pi/2 , pi , 

  L-statistic: 1.122  
  Lag truncation parameter: 5 

  Critical values:

  0.10 0.05 0.025 0.01
 0.846 1.01  1.16 1.35

The idea is that polynomial  has four root, in ,

since

If we get back to monthly data,  has twelve roots,

each of them having different interpretations.

Here we can have 1 cycle per year (on 12 months), 2 cycles per year (on 6 months), 3 cycles per year (on 4 months), 4 cycles per year (on 3 months), even 6 cycles per year (on 2 months). This will depend on the argument of the root, with respectively

The output of the test is here

> CH.test(tsm)

  ------ - ------ ----
  Canova & Hansen test
  ------ - ------ ----

  Null hypothesis: Stationarity.
  Alternative hypothesis: Unit root.
  Frequency of the tested cycles: pi/6 , pi/3 , pi/2 , 2pi/3 , 5pi/6 , pi , 

  L-statistic: 1.964  
  Lag truncation parameter: 20 

  Critical values:

 0.10 0.05 0.025 0.01
 2.49 2.75  2.99 3.27

And it looks like we reject the assumption of a seasonal unit root. I can even mention the following testing procedure

> library(forecast)
> nsdiffs(tsm, test="ch")
[1] 0

where the ouput: “1″ means that there is a seasonal unit root and “0″ that there is no seasonal unit root. Simple to read, isn’t it? If we consider the periodic autoregressive model on the monthly data, the output is

> model=fit.ar.par(wts=tsm, detcomp=detcomp, type="PAR", p=1)
> model
----
  PAR model of order 1 .

  y_t = alpha_{1,s}*y_{t-1} + alpha_{2,s}*y_{t-2} + ... + alpha_{p,s}*y_{t-p} + coeffs*detcomp + epsilon_t,  for s=1,2,...,12
----
  Autoregressive coefficients. 

          s=1  s=2  s=3  s=4  s=5 s=6 s=7  s=8  s=9 s=10 s=11 s=12
alpha_1s 0.15 0.05 0.07 0.33 0.11   0 0.3 0.38 0.31 0.19 0.15 0.37

So, whatever the test, we always reject the assumption that there is a seasonal unit root. Which does not mean that we can not have a strong cycle! Actually, the series is almost periodic. But there is no unit root! So all of this makes sense (I hardly believe that there might be unit root – seasonal, or not – in temperatures).

Just to make sure that we get it right, consider two times series, inspired from the previous one. The first one is a periodic sequence (with a very very small noise, just to avoid problem of non-definite matrices) and the second one is clearly integrated.

> Xp1=Xp2=as.numeric(t(M))
> for(t in 13:length(M)){
+ Xp1[t]=Xp1[t-12]
+ Xp2[t]=Xp2[t-12]+rnorm(1,0,2)
+ }
> Xp1=Xp1+rnorm(length(Xp1),0,.02)
> tsp1=ts(Xp1,start=1948,freq=12)
> tsp2=ts(Xp2,start=1948,freq=12)
> par(mfrow=c(2,1))
> plot(tsp1)
> plot(tsp2)

see also

> par(mfrow=c(1,2))
> bb3D(tsp1)
> bb3D(tsp2)

If we quickly look at those series, I would say that the first one has no unit root – even if it is not stationary, but it is because the series is periodic – while there is (are ?) unit root(s) for the second one. If we look at Canova-Hansen test, we get

> CH.test(tsp1)

  ------ - ------ ----
  Canova & Hansen test
  ------ - ------ ----

  Null hypothesis: Stationarity.
  Alternative hypothesis: Unit root.
  Frequency of the tested cycles: pi/6 , pi/3 , pi/2 , 2pi/3 , 5pi/6 , pi , 

  L-statistic: 2.234
  Lag truncation parameter: 20 

  Critical values:

 0.10 0.05 0.025 0.01
 2.49 2.75  2.99 3.27

> CH.test(tsp2)

  ------ - ------ ----
  Canova & Hansen test
  ------ - ------ ----

  Null hypothesis: Stationarity.
  Alternative hypothesis: Unit root.
  Frequency of the tested cycles: pi/6 , pi/3 , pi/2 , 2pi/3 , 5pi/6 , pi , 

  L-statistic: 5.448  
  Lag truncation parameter: 20 

  Critical values:

 0.10 0.05 0.025 0.01
 2.49 2.75  2.99 3.27

I know that this package has been removed, so maybe I should not use it. Consider instead

> nsdiffs(tsp1, 12,test="ch")
[1] 0
> nsdiffs(tsp2, 12,test="ch")
[1] 1

Here we have the same conclusion. The first one does not have a unit root, but the second one has. But be careful: with Osborn-Chui-Smith-Birchenhall test,

> nsdiffs(tsp1, 12,test="ocsb")
[1] 1
> nsdiffs(tsp2, 12,test="ocsb")
[1] 1

we have the feeling that there is also a unit root in our cyclic series…

So here, on a short-frequency, we do reject the assumption of a unit root – even a seasonal one – on our temperature series. We still have our high-frequency problem to deal with, some day (but I don’t think I’ll have enough time to introduce long range dependence this session, unfortunately).

To leave a comment for the author, please follow the link and comment on his blog: Freakonometrics » R-english.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.