# Explaining the Almon Distributed Lag Model

[This article was first published on **Econometrics Beat: Dave Giles' Blog**, and kindly contributed to R-bloggers.]


That post drew quite a number of email requests for more information about the Almon estimator, and how it fits into the overall scheme of things. In addition, Almon’s approach to modelling distributed lags has been used very effectively more recently in the estimation of the so-called MIDAS model. The MIDAS model (developed by Eric Ghysels and his colleagues – *e.g*., see Ghysels *et al*., 2004) is designed to handle regression analysis using data with different observation frequencies. The acronym, “MIDAS”, stands for “Mixed-Data Sampling”. The MIDAS model can be implemented in R, for instance (*e.g*., see here), as well as in EViews. (I discussed this in this earlier post.)

For these reasons I thought I’d put together this follow-up post by way of an introduction to the Almon DL model, and some of the advantages and pitfalls associated with using it.

Let’s take a look.

Suppose that we want to estimate the coefficients of the following DL model:

y_{t} = β_{0}x_{t} + β_{1}x_{t-1} + β_{2}x_{t-2} + …. + β_{n}x_{t-n} + u_{t} ; t = 1, 2, …., T. (1)

We’ll assume that the error term, u_{t}, satisfies all of the usual assumptions – but that can be relaxed too.

If the maximum lag length in the model, n, is much less than T, then we could just apply OLS to estimate the regression coefficients. However, even if this is feasible, in the sense that there are positive degrees of freedom, this may not be the smartest way in which to proceed. For most economic time-series, x, the successive lags of the variable are likely to be highly correlated with each other. Inevitably, this will result in quite severe multicollinearity.
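To see the problem in miniature, here's a quick simulation in Python with numpy (a sketch with made-up parameters, not data from the post): for a persistent AR(1) series, adjacent lags are correlated at roughly the autoregressive coefficient, and even x_{t} and x_{t-3} remain strongly correlated.

```python
import numpy as np

# Simulate a persistent AR(1) series, x_t = 0.9 x_{t-1} + e_t, as a
# stand-in for a typical economic time-series (parameters are made up).
rng = np.random.default_rng(42)
T = 1000
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.9 * x[t - 1] + rng.standard_normal()

# Sample correlation between x_t and each of its first three lags:
for k in (1, 2, 3):
    r = np.corrcoef(x[k:], x[:-k])[0, 1]
    print(k, round(r, 2))
```

With all of these lags entering the regression together, the design matrix is close to being rank-deficient, which is exactly the multicollinearity problem described above.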

How can we deal with this?

In response, Shirley Almon (1965) suggested a pretty neat way of re-formulating the model prior to its estimation. She made use of Weierstrass’s Approximation Theorem, which tells us (roughly) that: “Every continuous function defined on a closed interval [a, b] can be uniformly approximated, arbitrarily closely, by a polynomial function of finite degree, P.”

The catch is that the theorem *doesn’t tell us* what the value of P will be. This presents a type of model-selection problem that we have to solve. The flip-side of this is that if we *select* a value for P, and get it wrong, then there will be model mis-specification issues that we have to face. In fact, we can re-cast these issues in terms of those associated with the incorrect imposition of linear restrictions on the parameters of our model.

(Almon actually used Lagrangian interpolation in her application of Weierstrass’s Theorem to this problem, but there’s a simpler (and numerically equivalent) way of describing her idea.)

Let’s look into this model in more detail.

Here’s equation (1) again:

y_{t} = β_{0}x_{t} + β_{1}x_{t-1} + β_{2}x_{t-2} + …. + β_{n}x_{t-n} + u_{t} ; t = 1, 2, …., T. (1)

Almon’s suggestion was to treat the lag coefficients, β_{i}, as unknown functions of “i”. That is, we’ll set β_{i} = g(i). Then we’ll approximate g(i) using a polynomial, f(i), of order P. Typically, P will take a small value, such as 2, 3, or 4.

That is, we’ll write:

β_{i} = a_{0} + a_{1}i + a_{2}i^{2} + …. + a_{P}i^{P} ; i = 0, 1, ….., n (2)

Substituting (2) into (1), we get:

y_{t} = a_{0} x_{t} + (a_{0} + a_{1} + a_{2} + …. + a_{P}) x_{t-1} + (a_{0} + 2a_{1} + 4a_{2} + …. + 2^{P}a_{P}) x_{t-2} + ……..

+ (a_{0} + na_{1} + n^{2}a_{2} + …. + n^{P}a_{P}) x _{t-n} + u_{t} ; t = 1, 2, …., T. (3)

Re-arranging the right-hand side of (3), and gathering up terms, we get:

y_{t} = a_{0} (x_{t} + x_{t-1 }+ x_{t-2} + ……+ x _{t-n}) + a_{1} (x_{t-1} + 2x_{t-2} + …. + nx_{t-n}) + a_{2} (x_{t-1}+ 4x_{t-2} + 9x_{t-3 }+……

+ n^{2 }x_{t-n}) + ……… + a_{P }(x_{t-1} + 2^{P}x_{t-2} + …. + n^{P}x_{t-n}) + u_{t} ; t = 1, 2, …., T. (4)

If we’ve decided on a maximum lag-length (n), and we have chosen a degree (P) for the approximating polynomial, f(.), then we can re-write (4) as:

y_{t} = a_{0} z_{0t} + a_{1} z_{1t} + a_{2} z_{2t} + ……… + a_{P }z_{Pt} + u_{t} ; t = 1, 2, …., T. (5)

where:

z_{0t} = (x_{t} + x_{t-1 }+ x_{t-2} + ……+ x _{t-n})

z_{1t} = (x_{t-1} + 2x_{t-2} + …. + nx_{t-n})

z_{2t} = (x_{t-1 }+ 4x_{t-2} + 9x_{t-3 }+…… + n^{2 }x_{t-n})

.

.

.

z_{Pt} = (x_{t-1} + 2^{P}x_{t-2} + …. + n^{P}x_{t-n}) .

For a given n and P, we can construct the z variables, then estimate a_{0}, a_{1},….., a_{P} by applying OLS to (5), and finally “recover” the estimates for the β_{i}‘s using (2):

β_{i}* = a_{0}* + a_{1}*i + a_{2}*i^{2} + …. + a_{P}*i^{P} ; i = 0, 1, ….., n (6)

where a * superscript denotes an OLS estimate.

Because the relationship between the β_{i}*’s and the a_{j}*’s in (6) is a linear one, it is trivial to “recover” the standard errors for the former estimates from the covariance matrix associated with the latter estimates.
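To make the mechanics concrete, here is a minimal sketch of the estimator in Python with numpy (the post mentions R and EViews implementations; everything below – the data, the parameter values, the function name `almon_ols` – is illustrative, not taken from any package). It builds the z-variables of (5), runs OLS, and recovers the β’s via (6):

```python
import numpy as np

def almon_ols(y, x, n, P):
    """Almon DL estimator for y_t = beta_0 x_t + ... + beta_n x_{t-n} + u_t.

    Builds the z-variables of eq. (5), estimates a_0, ..., a_P by OLS,
    then recovers beta_i = a_0 + a_1 i + ... + a_P i^P (eq. (6))."""
    T = len(y)
    # Lag matrix: row for t = n, ..., T-1; column i holds x_{t-i}.
    X = np.column_stack([x[n - i : T - i] for i in range(n + 1)])
    # J[i, p] = i**p, so that beta = J @ a  (eq. (2)).
    J = np.vander(np.arange(n + 1), P + 1, increasing=True).astype(float)
    Z = X @ J                       # z_{pt} = sum_i i**p * x_{t-i}
    a, *_ = np.linalg.lstsq(Z, y[n:], rcond=None)
    return a, J @ a                 # (a-estimates, recovered beta-estimates)

# Simulated check: the true lag weights lie exactly on a quadratic, so
# P = 2 is correctly specified and the estimates should be near the truth.
rng = np.random.default_rng(0)
T, n, P = 400, 6, 2
x = rng.standard_normal(T)
true_beta = np.array([1.0 + 0.8 * i - 0.15 * i**2 for i in range(n + 1)])
y = np.zeros(T)
for t in range(n, T):
    y[t] = true_beta @ x[t - np.arange(n + 1)] + 0.1 * rng.standard_normal()

a_hat, beta_hat = almon_ols(y, x, n, P)
```

The covariance matrix of the β-estimates then follows directly as J V J', where V is the OLS covariance matrix of the a-estimates – this is just the linearity noted above.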

**An Example**

Here’s the original model, again:

y_{t} = β_{0}x_{t} + β_{1}x_{t-1} + β_{2}x_{t-2} + …. + β_{n}x_{t-n} + u_{t} ; t = 1, 2, …., T. (7)

Suppose, for example, that we choose P = 2, so that the approximating polynomial is:

β_{i} = a_{0} + a_{1}i + a_{2}i^{2} ; i = 0, 1, ….., n (8)

Substituting (8) into (7), we get:

y_{t} = a_{0}x_{t} + (a_{0} + a_{1} + a_{2}) x_{t-1} + (a_{0} + 2a_{1} + 4a_{2}) x_{t-2} + …. + (a_{0} + na_{1} + n^{2}a_{2}) x_{t-n} + u_{t} ; t = 1, 2, …., T

Re-arranging and gathering up terms:

y_{t} = a_{0} (x_{t} + x_{t-1} + x_{t-2} + …. + x_{t-n}) + a_{1} (x_{t-1} + 2x_{t-2} + …. + nx_{t-n}) + a_{2} (x_{t-1} + 4x_{t-2} + …. + n^{2}x_{t-n}) + u_{t} ; t = 1, 2, …., T

That is:

y_{t} = a_{0} z_{0t} + a_{1} z_{1t} + a_{2} z_{2t} + u_{t} ; t = 1, 2, …., T (9)

where:

z_{0t} = (x_{t} + x_{t-1} + x_{t-2} + …. + x_{t-n})

z_{1t} = (x_{t-1} + 2x_{t-2} + …. + nx_{t-n})

z_{2t} = (x_{t-1} + 4x_{t-2} + …. + n^{2}x_{t-n}) ; t = 1, 2, …., T

We can now apply OLS to (9), and then “recover” the estimates of the β_{i}‘s using (8). Effectively, we now have (particular) restricted least squares estimates of the original coefficients in (7).
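The restricted least squares interpretation can be checked directly: the Almon fit can never have a smaller residual sum of squares than unrestricted OLS on all n+1 lags, because the z-variables span only a subspace of the lag regressors' column space. A small simulated check in Python/numpy (all numbers below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
T, n, P = 300, 6, 2
x = rng.standard_normal(T)
beta = rng.standard_normal(n + 1)      # arbitrary (made-up) true lag weights
y = np.zeros(T)
for t in range(n, T):
    y[t] = beta @ x[t - np.arange(n + 1)] + rng.standard_normal()

# Unrestricted OLS on all n + 1 lags:
X = np.column_stack([x[n - i : T - i] for i in range(n + 1)])
yy = y[n:]
b_ols, *_ = np.linalg.lstsq(X, yy, rcond=None)
ssr_ols = np.sum((yy - X @ b_ols) ** 2)

# Almon fit: restrict beta to lie on a degree-P polynomial, beta = J @ a.
J = np.vander(np.arange(n + 1), P + 1, increasing=True).astype(float)
a, *_ = np.linalg.lstsq(X @ J, yy, rcond=None)
ssr_almon = np.sum((yy - X @ (J @ a)) ** 2)

# The polynomial restriction can only raise (or leave equal) the SSR:
print(ssr_almon >= ssr_ols)  # True
```

Here the true weights were deliberately drawn at random, so the quadratic restriction is false, and the gap between the two SSRs reflects the resulting mis-specification.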

**An Extension**

Sometimes we have prior information about the “shape” that the distribution of the β_{i}‘s should follow. For instance, we may know that it makes sense for the lag weights to “die out” to zero when i = n+1. Or we may want the *slope* of the lag distribution to be zero when i = n. There are lots of such pieces of prior information that we may want to impose on the problem, and some of these are discussed by Smith and Giles (1976), together with graphs and details of the associated formulae.

Note that restrictions of this type *add to* those already in play as a result of choosing a value for P. They further extend the chance that we may be imposing false restrictions on the parameter space, and this would lead our OLS estimates to be both biased and inconsistent. So, extreme care should be taken, and there are some important model-selection issues to be taken into account here.

To illustrate, suppose that in our P = 2 example we want the slope of the lag distribution to be zero when i = n. Given that β_{i} = f(i) = a_{0} + a_{1}i + a_{2}i^{2}, it follows that f ‘(i) = a_{1} + 2a_{2}i, and we’re going to set

f ‘(n) = a_{1} + 2na_{2} = 0 ,

or:

a_{1} = – 2na_{2}.

This enables us to eliminate a_{1} from the problem (that’s a linear restriction that we’re imposing, right there).

Imposing this restriction on (9), our estimating equation becomes:

y_{t} = a_{0} z_{0t} + a_{2} (z_{2t} – 2n z_{1t}) + u_{t} ; t = 1, 2, …., T (10)

We apply OLS to (10), set a_{1}* = – 2na_{2}*, and then recover the estimates of the lag coefficients as before:

β_{i}* = a_{0}* + a_{1}*i + a_{2}*i^{2} ; i = 0, 1, …., n.
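As a sketch of this end-point restriction in Python/numpy (simulated data and made-up parameter values, as before), we regress y on z_{0t} and the single constructed regressor (z_{2t} – 2n z_{1t}), then back out a_{1}* from the restriction:

```python
import numpy as np

# End-point restricted Almon estimator of eq. (10): P = 2, with the slope
# of the lag polynomial forced to zero at i = n. All data and parameter
# values below are simulated/illustrative, not from the post.
rng = np.random.default_rng(1)
T, n = 400, 6
x = rng.standard_normal(T)

# True weights satisfy the restriction exactly: a1 = -2*n*a2.
a0_true, a2_true = 0.5, -0.05
a1_true = -2 * n * a2_true
i = np.arange(n + 1)
true_beta = a0_true + a1_true * i + a2_true * i**2

y = np.zeros(T)
for t in range(n, T):
    y[t] = true_beta @ x[t - i] + 0.1 * rng.standard_normal()

# z-variables of eq. (5) for P = 2: z_pt = sum_i i**p * x_{t-i}.
X = np.column_stack([x[n - k : T - k] for k in range(n + 1)])
z0, z1, z2 = X @ (i ** 0), X @ i, X @ (i ** 2)

# Eq. (10): regress y on z0 and the single regressor (z2 - 2n z1).
W = np.column_stack([z0, z2 - 2 * n * z1])
(a0_hat, a2_hat), *_ = np.linalg.lstsq(W, y[n:], rcond=None)
a1_hat = -2 * n * a2_hat            # recover a1 from the restriction
beta_hat = a0_hat + a1_hat * i + a2_hat * i**2
```

By construction, the recovered lag distribution has zero slope at i = n; whether that restriction is appropriate for real data is exactly the model-selection question raised above.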

- The Almon estimator provides a rather neat way of circumventing the multicollinearity problems that would arise if we simply estimated a DL model, with lots of lags, directly by OLS.
- It does this by approximating the “shape” of the distribution of the lag coefficients through time by a polynomial of order P.
- The value of P has to be chosen by the user, and this leads to a model-selection problem.
- The choice of P also affects the form of certain exact linear restrictions that are effectively being placed on the regression coefficients.
- This leads to the possibility that false restrictions are imposed, and this would lead to the resulting estimator being both biased and inconsistent.
- Additional restrictions can be placed on the lag distribution, based on our knowledge of the underlying economics of the relationship we’re estimating.
- Applying such restrictions should also be undertaken with care, again to avoid adversely affecting the properties of our estimator.

**References**

Almon, S., 1965. The distributed lag between capital appropriations and expenditures. *Econometrica*, 33, 178-196.

*Applied Economics*, 9, 185-201.

Smith, R.G. & D.E.A. Giles,1976. The Almon estimator: Methodology and users’ guide. Discussion Paper E76/3, Economic Department, Reserve Bank of New Zealand.

Weierstrass, K., 1885. Über die analytische Darstellbarkeit sogenannter willkürlicher Functionen einer reellen Veränderlichen. *Sitzungsberichte der Königlich Preußischen Akademie der Wissenschaften zu Berlin*, (II).
