# Inference and autoregressive processes

September 6, 2012
By

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

Consider a (stationary) autoregressive process, say of order 2,

for some white noise with variance . Here is a code to generate such a process,

> phi1=.5
> phi2=-.4
> sigma=1.5
> set.seed(1)
> n=240
> WN=rnorm(n,sd=sigma)
> Z=rep(NA,n)
> Z[1:2]=rnorm(2,0,1)
> for(t in 3:n){Z[t]=phi1*Z[t-1]+phi2*Z[t-2]+WN[t]}

Here, we have to estimate two sets of parameters: the autoregressive coefficients, and the variance of the innovation process . There are (at least) three techniques to estimate those parameters.

• using least square regression

A natural idea is to see here a regression model, and thus, if we consider a matrix formulation,

Here we can run (conditional) ordinary least squares estimation,

> base=data.frame(Y=Z[3:n],X1=Z[2:(n-1)],X2=Z[1:(n-2)])
> regression=lm(Y~0+X1+X2,data=base)
> summary(regression)

Call:
lm(formula = Y ~ 0 + X1 + X2, data = base)

Residuals:
Min      1Q  Median      3Q     Max
-4.3491 -0.8890 -0.0762  0.9601  3.6105

Coefficients:
Estimate Std. Error t value Pr(>|t|)
X1  0.45107    0.05924   7.615 6.34e-13 ***
X2 -0.41454    0.05924  -6.998 2.67e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.449 on 236 degrees of freedom
Multiple R-squared: 0.2561,	Adjusted R-squared: 0.2497
F-statistic: 40.61 on 2 and 236 DF,  p-value: 6.949e-16

> regression$coefficients X1 X2 0.4510703 -0.4145365 > summary(regression)$sigma
[1] 1.449276
• using Yule-Walker equations

As we've seen in class, we can easily get the following equations for the autocovariance functions,

which can also be written

So we just have to solve a simple linear system of equations. Note that if we divide by the variance, those equations can be written in terms of the autocorrelation functions

The code is the following

> rho1=cor(Z[1:(n-1)],Z[2:n])
> rho2=cor(Z[1:(n-2)],Z[3:n])
> A=matrix(c(1,rho1,rho1,1),2,2)
> b=matrix(c(rho1,rho2),2,1)
> (PHI=solve(A,b))
[,1]
[1,]  0.4517579
[2,] -0.4155920

Now, we need to extract the estimated innovation process, from this set of parameters (note that it could be possible to include the variance term in Yule-Walker equations, to get a three dimensional linear equation)

> estWN=base$Y-(PHI[1]*base$X1+PHI[2]*base$X2) > sd(estWN) [1] 1.445706 This estimator is probably not the best one (we can take into account that we've lost two degrees of freedom), but as a starting point, let us consider this one. • using (conditional) likelihood estimators Finally, we can assume some distribution for the innovation process. The standard model is a Gaussian model, i.e. In that case, the conditional log likelihood (conditional since we set the first two observations here) is > CondLogLik=function(A,TS){ + phi1=A[1]; phi2=A[2] + sigma=A[3] ; L=0 + for(t in 3:length(TS)){ + L=L+dnorm(TS[t],mean=phi1*TS[t-1]+ + phi2*TS[t-2],sd=sigma,log=TRUE)} + return(-L)} Now, we can run standard optimization procedures, > LogL=function(A) CondLogLik(A,TS=Z) > optim(c(0,0,1),LogL)$par
[1]  0.4509685 -0.4144938  1.4430930

$value [1] 425.0164$counts
88       NA

$convergence [1] 0$message
NULL
Here, our three estimators are rather close. Actually, if we generate 1,000 time series (of size 240), those are the Box-plots of our three estimators, for the first order autoregressive coefficient

for the second one,

and finally for the standard deviation of the innovation process

All those estimators behave nicely, and are rather close. Note that they all might be biased, but they are consistent (see Davidson and MacKinnon for instance, in their book, for more details).

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...