Seasonal Unit Roots

Posted on March 26, 2014 by arthur charpentier in R bloggers | 0 Comments

[This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

As discussed in the MAT8181 course, there are – at least – two kinds of non-stationary time series: those with a trend, and those with a unit-root (they will be called integrated). Unit root tests cannot be used to assess whether a time series is stationary, or not. They can only detect integrated time series. And the same holds for seasonal unit root.

In a previous post, we’ve seen that it was difficult to model hourly temperature, since most test do not reject unit roots. Consider here the average monthly temperature, still in Montréal, QC.

> montreal=read.table("http://freakonometrics.free.fr/temp-montreal-monthly.txt")
> M=as.matrix(montreal[,2:13])
> X=as.numeric(t(M))
> tsm=ts(X,start=1948,freq=12)
> plot(tsm)

For those who don’t know Montréal, Winter and Summer are very different. We can visualize monthly differences using

> month=rep(1:12,length(tsm)/12)
> plot(month,as.numeric(tsm))
> lines(1:12,apply(M,2,mean),col="red",type="b",pch=19)

or, if install the uroot package, removed from the CRAN repository, we can use

> library(uroot)
> bbplot(tsm)

> bb3D(tsm)
Loading required package: tcltk

It looks like our time series is cyclic, because of the yearly seasonal pattern. The autocorrelation function is here

> acf(tsm,lag=120)

Again, this cycle can be visualized using

> persp(1948:2013,1:12,M,theta=-50,col="yellow",shade=TRUE,
+ xlab="Year",ylab="Month",zlab="Temperature",ticktype="detailed")

Now, the question is is there a seasonal unit root ? This would mean that our model should be

$(1-L^{12})\Phi(L)[X_t-\mu]=\Theta(L)\varepsilon_t$

If we forget about the autoregressive and the moving average component, we can estimate

$(1-{\color{red}\alpha}L^{12})[X_t-\mu]=\varepsilon_t$

If there is a seasonal unit root then ${\color{red}\alpha}$ should be close to 1. Somehow.

> arima(tsm,order=c(0,0,0),seasonal=list(order=c(1,0,0),period=12))

Call:
arima(x = tsm, order = c(0, 0, 0), seasonal = list(order = c(1, 0, 0), period = 12))

Coefficients:
        sar1  intercept
      0.9702     6.4566
s.e.  0.0071     2.1515

It is not far away from 1. Actually, it cannot be too close to 1. If it was, then we would get an error message…

To illustrate some interesting models, let us consider also quarterly temperatures,

> N=cbind(apply(montreal[,2:4],1,sum),apply(montreal[,5:7],1,sum),apply(montreal[,8:10],1,sum),apply(montreal[,11:13],1,sum))
> X=as.numeric(t(N))
> tsq=ts(X,start=1948,freq=4)
> persp(1948:2013,1:4,N,theta=-50,col="yellow",shade=TRUE,
+ xlab="Year",ylab="Quarter",zlab="Temperature",ticktype="detailed")

(again, the aim is just to be able to write down some equations, if necessary)

Why not consider a $VAR(1)$ model on the quarterly temperature? Something like

$\left(\begin{matrix}X_t^{(Q1)}\\ X_t^{(Q2)\\ X_t^{(Q3)\\ X_t^{(Q4)\end{matrix}\right)= A \left(\begin{matrix}X_{t-1}^{(Q1)}\\ X_{t-1}^{(Q2)\\ X_{t-1}^{(Q3)\\ X_{t-1}^{(Q4)\end{matrix}\right)+\left(\begin{matrix}\varepsilon_{t-1}^{(Q1)}\\ \varepsilon_{t-1}^{(Q2)\\ \varepsilon_{t-1}^{(Q3)\\ \varepsilon_{t-1}^{(Q4)\end{matrix}\right)$

i.e.

$\boldsymbol{Y}_t=A \boldsymbol{Y}_{t-1}+\boldsymbol{\varepsilon}_t$

where $A$ is some matrix $4\times 4$ . This model can easily be estimated,

> library(vars)
> df=data.frame(N)
> names(df)=paste("y",1:4,sep="")
> model=VAR(df)
> model

VAR Estimation Results:
======================= 

Estimated coefficients for equation y1: 
======================================= 
Call:
y1 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const 

       y1.l1        y2.l1        y3.l1        y4.l1        const 
 -0.13943065   0.21451118   0.08921237   0.30362065 -34.74793931 

Estimated coefficients for equation y2: 
======================================= 
Call:
y2 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const 

      y1.l1       y2.l1       y3.l1       y4.l1       const 
 0.02520938  0.05288958 -0.13277377  0.05134148 40.68955266 

Estimated coefficients for equation y3: 
======================================= 
Call:
y3 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const 

      y1.l1       y2.l1       y3.l1       y4.l1       const 
 0.07740824 -0.21142726  0.11180518  0.12963931 56.81087283 

Estimated coefficients for equation y4: 
======================================= 
Call:
y4 = y1.l1 + y2.l1 + y3.l1 + y4.l1 + const 

      y1.l1       y2.l1       y3.l1       y4.l1       const 
 0.18842863 -0.31964579  0.25099508 -0.04452577  5.73228873

and matrix $A$ is here

> A=rbind(
+ coefficients(model$varresult$y1)[1:4],
+ coefficients(model$varresult$y2)[1:4],
+ coefficients(model$varresult$y3)[1:4],
+ coefficients(model$varresult$y4)[1:4])
> A
           y1.l1       y2.l1       y3.l1       y4.l1
[1,] -0.13943065  0.21451118  0.08921237  0.30362065
[2,]  0.02520938  0.05288958 -0.13277377  0.05134148
[3,]  0.07740824 -0.21142726  0.11180518  0.12963931
[4,]  0.18842863 -0.31964579  0.25099508 -0.04452577

Since stationary of this multiple time series is closely related to eigenvalues of this matrix, let us look at them,

> eigen(A)$values
[1]  0.35834830 -0.32824657 -0.14042175  0.09105836
> Mod(eigen(A)$values)
[1] 0.35834830 0.32824657 0.14042175 0.09105836

So it looks like there is no stationarity issue, here. A restricted model is the periodic autoregressive model, so called $PAR(1)$ model, discussed by Paap and Franses

$\boldsymbol{\Phi}_0 \boldsymbol{Y}_t=\boldsymbol{\Phi}_1 \boldsymbol{Y}_{t-1}+\boldsymbol{\varepsilon}_t$

where

$\boldsymbol{\Phi}_0=\left(\begin{array}{cccc}1&0&0&0\\ -\phi_{1,2}&1&0&0\\ 0&-\phi_{1,3}&1&0\\ 0&0&-\phi_{1,4}&1\end{array}\right)$

and

$\boldsymbol{\Phi}_1=\left(\begin{array}{cccc}0&0&0&\phi_{1,1}\\ 0&0&0&0\\ 0&0&0&0\\ 0&0&0&0\end{array}\right)$

Keep in mind that this is a $VAR(1)$ model, since

$\boldsymbol{Y}_t=\underbrace{\boldsymbol{\Phi}_0^{-1} \boldsymbol{\Phi}_1}_A \boldsymbol{Y}_{t-1}+\boldsymbol{\varepsilon}_t$

This model can be estimated using a specific package (one can also look at the vignette, to get a better understanding of the syntax)

> library(partsm)
> detcomp <- list(regular=c(0,0,0), seasonal=c(1,0), regvar=0)
> model=fit.ar.par(wts=tsq, detcomp=detcomp, type="PAR", p=1)
> PAR.MVrepr(model)
----
    Multivariate representation of a PAR model.

  Phi0:

  1.000  0.000  0.000 0
 -0.242  1.000  0.000 0
  0.000 -0.261  1.000 0
  0.000  0.000 -0.492 1

  Phi1:

 0 0 0 0.314
 0 0 0 0.000
 0 0 0 0.000
 0 0 0 0.000

  Eigen values of Gamma = Phi0^{-1} %*% Phi1:
0.01 0 0 0 

  Time varing accumulation of shocks:

 0.010 0.040 0.155 0.314
 0.002 0.010 0.037 0.076
 0.001 0.003 0.010 0.020
 0.000 0.001 0.005 0.010

Here, the characteristic equation is

$|\boldsymbol{\Phi}_0-\boldsymbol{\Phi}_1 z |=(1-\phi_{1,1}\phi_{1,2}\phi_{1,3}\phi_{1,4}\cdot z)$

so there is a (seasonal) unit root if

$\phi_{1,1}\phi_{1,2}\phi_{1,3}\phi_{1,4}=1$

Which is clearly not the case, here. It is possible to perform Canova-Hansen test,

> CH.test(tsq)

  ------ - ------ ----
  Canova & Hansen test
  ------ - ------ ----

  Null hypothesis: Stationarity.
  Alternative hypothesis: Unit root.
  Frequency of the tested cycles: pi/2 , pi , 

  L-statistic: 1.122  
  Lag truncation parameter: 5 

  Critical values:

  0.10 0.05 0.025 0.01
 0.846 1.01  1.16 1.35

The idea is that polynomial $(1-L^4)$ has four root, in $\mathbb{C}$ ,

$\{+1,+i,-1,-i\}$

since

$(1-L^4)=(1-L)(1+L)(1+L^2)=(1-L)(1+L)(1-iL)(1+iL)$

If we get back to monthly data, $(1-L^{12})$ has twelve roots,

$\left\{\pm 1, \ \pm i, \ \pm\frac{1\pm \sqrt{3}i}{2},\pm\frac{\sqrt{3}\pm i}{2}\right\}$

each of them having different interpretations.

Here we can have 1 cycle per year (on 12 months), 2 cycles per year (on 6 months), 3 cycles per year (on 4 months), 4 cycles per year (on 3 months), even 6 cycles per year (on 2 months). This will depend on the argument of the root, with respectively

$\left\{\frac{\pi}{6},\frac{\pi}{3},\frac{\pi}{2},\frac{2\pi}{3},\frac{5\pi}{6},\pi\right\}$

The output of the test is here

> CH.test(tsm)

  ------ - ------ ----
  Canova & Hansen test
  ------ - ------ ----

  Null hypothesis: Stationarity.
  Alternative hypothesis: Unit root.
  Frequency of the tested cycles: pi/6 , pi/3 , pi/2 , 2pi/3 , 5pi/6 , pi , 

  L-statistic: 1.964  
  Lag truncation parameter: 20 

  Critical values:

 0.10 0.05 0.025 0.01
 2.49 2.75  2.99 3.27

And it looks like we reject the assumption of a seasonal unit root. I can even mention the following testing procedure

> library(forecast)
> nsdiffs(tsm, test="ch")
[1] 0

where the ouput: “1″ means that there is a seasonal unit root and “0″ that there is no seasonal unit root. Simple to read, isn’t it? If we consider the periodic autoregressive model on the monthly data, the output is

> model=fit.ar.par(wts=tsm, detcomp=detcomp, type="PAR", p=1)
> model
----
  PAR model of order 1 .

  y_t = alpha_{1,s}*y_{t-1} + alpha_{2,s}*y_{t-2} + ... + alpha_{p,s}*y_{t-p} + coeffs*detcomp + epsilon_t,  for s=1,2,...,12
----
  Autoregressive coefficients. 

          s=1  s=2  s=3  s=4  s=5 s=6 s=7  s=8  s=9 s=10 s=11 s=12
alpha_1s 0.15 0.05 0.07 0.33 0.11   0 0.3 0.38 0.31 0.19 0.15 0.37

So, whatever the test, we always reject the assumption that there is a seasonal unit root. Which does not mean that we can not have a strong cycle! Actually, the series is almost periodic. But there is no unit root! So all of this makes sense (I hardly believe that there might be unit root – seasonal, or not – in temperatures).

Just to make sure that we get it right, consider two times series, inspired from the previous one. The first one is a periodic sequence (with a very very small noise, just to avoid problem of non-definite matrices) and the second one is clearly integrated.

> Xp1=Xp2=as.numeric(t(M))
> for(t in 13:length(M)){
+ Xp1[t]=Xp1[t-12]
+ Xp2[t]=Xp2[t-12]+rnorm(1,0,2)
+ }
> Xp1=Xp1+rnorm(length(Xp1),0,.02)
> tsp1=ts(Xp1,start=1948,freq=12)
> tsp2=ts(Xp2,start=1948,freq=12)
> par(mfrow=c(2,1))
> plot(tsp1)
> plot(tsp2)