**Econometrics Beat: Dave Giles' Blog**, and kindly contributed to R-bloggers)

What the title of this post is supposed to mean is: “Estimating a simple aggregate consumption function using Bayesian regression analysis”.

**recent post**I mentioned my long-standing interest in Bayesian Econometrics. When I teach this material I usually include a simple application that involves estimating a consumption function using U.S. time-series data. I used to have this coded up for the

**SHAZAM**package.

**EViews**isn’t appropriate as it doesn’t include a numerical integration routine.

You could use** BUGS**, or some other package, but it’s nice to see what is going on, step-by-step, when you’re encountering this stuff for the first time.

The other day, I thought, “It’s time to code this up in **R**“. So, here we are!

Here’s how I deal with the example. We have annual U.S. data on real private consumption expenditure and real disposable income, for the period 1950 to 1985. (See the **data page** for this blog.) We’re going to estimate a really simple consumption function. To reduce the number of parameters for which prior information has to be assigned, we’ll measure the data as deviations about sample means, thereby eliminating the intercept. So, the model is:

* C _{t}* =

*βY*+

_{t}*ε*;

_{t}*t*= 1, 2,……., 36

where *C* and *Y* denote consumption and income.

Let’s assume that the errors are independently and normally distributed.

*β*) must lie between zero and one in value. Otherwise the results make no economic sense. As far as

*σ*is concerned, all I know is that it must be positive and finite in value.

I’m going to put an improper (“diffuse”) marginal prior on *σ*, and a Beta marginal prior on *β*. The latter reflects the fact that this parameter must lie between zero and one. Assuming that the prior information about *σ* is independent of that about *β*, the joint prior for *β* and *σ* is:

*p*(

*β*,

*σ*) =

*p*(

*β*)

*p*(

*σ*)

* p*(*σ*) α (1 / *σ*) ; 0 < *σ < *∞

*p*(

*β*) α

*β*

^{(α – 1)}(1 –

*β*)

^{(γ – 1)}; 0 <

*β <*

*1 ;*

*α,*

*γ*> 0

where “α” denotes “is proportional to”.

The parameters, *α* and *γ*, of the marginal prior for *β* will be assigned values to reflect our prior information about the m.p.c., __before we see the current sample of data__.

Here’s how we’ll do that. Recall that the mean and variance of a Beta distribution are:

Mean = *m* = *α* / (*α* + *γ*)

Variance = *v *= (*α* *γ*) / [ (*α* + *γ*)^{2 }(*α* + *γ *+ 1) ] .

* α* = (*m* / *v*) (*m* (1 –* m*) –* v*)

* γ *= (1 – *m*) [*m* (1 – *m*) – *v*] /* v* .

Having chosen values for *m* and *v*, we have now fully specified the joint prior for *β *and *σ.*Given the normality of the errors, the likelihood function takes the form

*L*(

*β*,

*σ*| Data) =

*p*(Data |

*β*,

*σ*) α (1 /

*σ*)

^{n}exp[ -Σ (

*C*–

_{t}*βY*)

_{t}^{2 }] / (2

*σ*

^{2}) .

*β*and

*σ*is:

*p*(

*β*,

*σ*| Data) α

*p*(

*β*,

*σ*)

*L*(

*β*,

*σ*| Data)

α (1 /

*σ*)

^{(n + 1)}

*β*

^{(α – 1)}(1 –

*β*)

^{(γ – 1)}exp[ -Σ (

*C*–

_{t}*βY*)

_{t}^{2 }] / (2

*σ*

^{2}) .

*σ*, over the range (0 , ∞). This integration can be performed analytically, quite easily. The details can be found

**here**.

*marginal*posterior for

*β*is then given by:

*p*(

*β*| Data) α

*β*

^{(α – 1)}(1 –

*β*)

^{(γ – 1)}[ Σ (

*C*–

_{t}*βY*)

_{t}^{2 }]

^{(-n / 2)}.

*β*over the interval [0, 1], and then the reciprocal of the value of this integral is the normalizing constant.

*β*is readily done numerically, given the closed interval for the

*β*values.

*β*to

*m*=0.75 and

*v*= 0.0005 respectively. The latter values imply choices for the parameters of the prior for

*β*of

*α*= , and

*γ*=

This is what we get when we run the R code available in the** code page** for this blog:

*β*, we can compute the mean and variance of this p.d.f. as:

*β*| Data] = ∫

*β p*(

*β |*Data) d

*β*

*β*| Data] = ∫ (

*β*– E[

*β*| Data] )

^{2}

*p*(

*β |*Data) d

*β .*

*β.*

*β*| Data] = 0.8810, and Var[

*β*| Data] = 0.000219.

*β*| Data] is the Bayes estimator of

*β*. Other loss functions imply different Bayes estimators of

*β*.

*mode*is the Bayes estimator. It’s simple to check that the mode of p(

*β*| Data) occurs at

*β*= 0.8792. We get essentially the same estimator under either loss function

*in this particular example*.

*β*is concerned. This, in turn, is identical to the Bayes estimator we’d get if we used the following totally “diffuse” prior for

*β*and

*σ*:

* p*(*β*, *σ*) = *p*(*β*) p(*σ*),

*p*(

*β*) α constant ; -∞ <

*β <*

*∞*

*p*(

*σ*) α (1 /

*σ*) ; 0 <

*σ <*∞

We could consider this to be our “reference prior”. The associated results will enable us to see how much “influence” our chosen prior information had on the results given above.

The estimated m.p.c. is 0.89848, compared with the prior mean of 0.75 and the posterior mean of 0.8810. In this case of a single parameter for the mean function, the posterior mean lies between the prior mean and the mean of the likelihood function. (This doesn’t necessarily happen in every dimension when there is more than one regressor in the model.) The prior is “modified” by the data, *via* the likelihood function, to give us the posterior result.

Also notice that while the prior variance for *β *was 0.005, and the variance of the posterior distribution for *β*, is 0.00219. Adding the sample information to the prior information leads to a gain in estimation precision, as we’d expect.

*β*can also be constructed, but I won’t pursue that here.

Here are some results you get when you vary the prior information:

Hopefully, this example will be useful to some of you. Put on your Bayesian hat, and have fun!

[**Technical** ** Note:** The likelihood function that’s plotted above is actually the

*marginal*(

*or integrated*)

*likelihood function.*The full likelihood function is the j

*oint*data density, viewed as a function of the

*two parameters*,

*β*and

*σ.*So that I can plot the likelihood here, I’ve marginalized it by integrating out

*σ*. The integration to get the kernel of this function can be done analytically, using a similar approach to that used above to marginalize the posterior with respect to

*σ.*Then the normalizing constant is computed numerically.]

**leave a comment**for the author, please follow the link and comment on their blog:

**Econometrics Beat: Dave Giles' Blog**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...