The title of this post is shorthand for: “Estimating a simple aggregate consumption function using Bayesian regression analysis”.
You could use BUGS, or some other package, but it’s nice to see what is going on, step-by-step, when you’re encountering this stuff for the first time.
The other day, I thought, “It’s time to code this up in R”. So, here we are!
Here’s how I deal with the example. We have annual U.S. data on real private consumption expenditure and real disposable income, for the period 1950 to 1985. (See the data page for this blog.) We’re going to estimate a really simple consumption function. To reduce the number of parameters for which prior information has to be assigned, we’ll measure the data as deviations about sample means, thereby eliminating the intercept. So, the model is:
Ct = βYt + εt ;  t = 1, 2, …, 36
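where C and Y denote consumption and income, so that β is the marginal propensity to consume (m.p.c.).
Let’s assume that the errors are independently and normally distributed, with a zero mean and a constant standard deviation, σ.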
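To see why measuring the data as deviations about their sample means gets rid of the intercept, here is a rough Python sketch (not the R code from the code page). The six income and consumption figures are made up purely for illustration, and the function names are my own. It compares the least squares slope from a regression with an intercept against the slope from a regression through the origin on the demeaned data.

```python
def demean(series):
    """Return the series measured as deviations about its sample mean."""
    mean = sum(series) / len(series)
    return [value - mean for value in series]


def slope_with_intercept(x, y):
    """Least squares slope when the model includes an intercept."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def slope_through_origin(x, y):
    """Least squares slope when the model has no intercept."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Hypothetical numbers, NOT the U.S. data used in this post.
income = [100.0, 110.0, 125.0, 130.0, 150.0, 165.0]
consumption = [95.0, 102.0, 116.0, 118.0, 137.0, 149.0]

slope_levels = slope_with_intercept(income, consumption)
slope_deviations = slope_through_origin(demean(income), demean(consumption))

print(slope_levels, slope_deviations)
```

The two slopes agree up to rounding error, so nothing is lost for β by working with deviations, and there is one less parameter that needs a prior.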
I’m going to put an improper (“diffuse”) marginal prior on σ, and a Beta marginal prior on β. The latter reflects the fact that this parameter must lie between zero and one. Assuming that the prior information about σ is independent of that about β, the joint prior for β and σ is:
p(β, σ) = p(β) p(σ) ∝ (1 / σ) β^(α – 1) (1 – β)^(γ – 1) ;  0 < β < 1 ;  0 < σ < ∞
where “∝” denotes “is proportional to”.
The parameters, α and γ, of the marginal prior for β will be assigned values to reflect our prior information about the m.p.c., before we see the current sample of data.
Here’s how we’ll do that. Recall that the mean and variance of a Beta distribution are:
Mean = m = α / (α + γ)
Variance = v = (α γ) / [ (α + γ)² (α + γ + 1) ] .
Solving these two equations for α and γ gives:
α = (m / v) [m (1 – m) – v]
γ = (1 – m) [m (1 – m) – v] / v .
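As a quick check on these two formulas, here is a small Python sketch (the function name `beta_params` is my own, not something from the code page). It turns a prior mean and variance into the Beta parameters, and then confirms that they reproduce the mean and variance we started with. The values m = 0.75 and v = 0.005 are the prior mean and variance quoted in the results further down.

```python
def beta_params(m, v):
    """Beta parameters (alpha, gamma) implied by a prior mean m and variance v."""
    if not 0.0 < m < 1.0:
        raise ValueError("the prior mean must lie strictly between 0 and 1")
    if not 0.0 < v < m * (1.0 - m):
        raise ValueError("the prior variance must lie between 0 and m(1 - m)")
    common = m * (1.0 - m) - v
    alpha = (m / v) * common
    gamma = (1.0 - m) * common / v
    return alpha, gamma


alpha, gamma = beta_params(0.75, 0.005)

# Plug the parameters back into the Beta mean and variance formulas.
implied_mean = alpha / (alpha + gamma)
implied_var = (alpha * gamma) / ((alpha + gamma) ** 2 * (alpha + gamma + 1.0))

print(alpha, gamma, implied_mean, implied_var)
```

With m = 0.75 and v = 0.005 this gives α = 27.375 and γ = 9.125. The variance check inside the function matters: a Beta distribution with mean m cannot have a variance as large as m(1 – m), and the formulas return negative parameters if you ask for one.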
Having chosen values for m and v, we have now fully specified the joint prior for β and σ.
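Given the normality of the errors, the likelihood function takes the form
L(β, σ | Data) ∝ (1 / σ)^n exp[ –Σ (Ct – βYt)² / (2σ²) ] .
Then, by Bayes’ Theorem, the joint posterior density for β and σ is proportional to the product of the joint prior and the likelihood:
p(β, σ | Data) ∝ (1 / σ)^(n + 1) β^(α – 1) (1 – β)^(γ – 1) exp[ –Σ (Ct – βYt)² / (2σ²) ] .
Integrating σ out of this joint posterior analytically gives the kernel of the marginal posterior for β:
p(β | Data) ∝ β^(α – 1) (1 – β)^(γ – 1) [ Σ (Ct – βYt)² ]^(–n / 2) ;  0 < β < 1
and this can then be normalized numerically.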
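Here is a rough Python sketch of how a posterior like this can be handled numerically. It is not the R code from the code page, and the 36 observations are generated artificially (a true slope of 0.9 plus a sine-wave “error”), so the numbers it prints are not the results reported in this post. It uses the fact that integrating σ out of the joint posterior leaves a kernel for β proportional to β^(α – 1) (1 – β)^(γ – 1) [Σ (Ct – βYt)²]^(–n / 2), and then normalizes that kernel on a grid over (0, 1).

```python
import math


def beta_params(m, v):
    """Beta parameters implied by a prior mean m and variance v."""
    common = m * (1.0 - m) - v
    return (m / v) * common, (1.0 - m) * common / v


def log_kernel(beta, y, c, alpha, gamma):
    """Log of the marginal posterior kernel for beta (sigma integrated out)."""
    n = len(y)
    ssr = sum((ci - beta * yi) ** 2 for yi, ci in zip(y, c))
    return ((alpha - 1.0) * math.log(beta)
            + (gamma - 1.0) * math.log(1.0 - beta)
            - 0.5 * n * math.log(ssr))


def posterior_moments(y, c, alpha, gamma, points=2000):
    """Posterior mean and variance of beta, normalizing on a midpoint grid."""
    step = 1.0 / points
    grid = [(i + 0.5) * step for i in range(points)]  # avoids beta = 0 and 1
    logs = [log_kernel(b, y, c, alpha, gamma) for b in grid]
    peak = max(logs)  # subtract the peak before exponentiating, for stability
    weights = [math.exp(value - peak) for value in logs]
    total = sum(weights)
    mean = sum(b * w for b, w in zip(grid, weights)) / total
    var = sum((b - mean) ** 2 * w for b, w in zip(grid, weights)) / total
    return mean, var


# Artificial data in deviation form, NOT the U.S. series used in this post.
periods = range(1, 37)
y = [10.0 * (t - 18.5) for t in periods]
raw_c = [0.9 * yt + 20.0 * math.sin(t) for t, yt in zip(periods, y)]
c_bar = sum(raw_c) / len(raw_c)
c = [ct - c_bar for ct in raw_c]

prior_mean, prior_var = 0.75, 0.005
alpha, gamma = beta_params(prior_mean, prior_var)

b_ols = sum(yi * ci for yi, ci in zip(y, c)) / sum(yi * yi for yi in y)
post_mean, post_var = posterior_moments(y, c, alpha, gamma)

print("least squares estimate:", b_ols)
print("posterior mean and variance:", post_mean, post_var)
```

Working with the log of the kernel, and subtracting its maximum before exponentiating, keeps the [Σ (Ct – βYt)²]^(–n / 2) term from underflowing. A finer grid, or a proper quadrature routine, would be the obvious refinement.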
This is what we get when we run the R code available in the code page for this blog:
p(β, σ) = p(β) p(σ),
We could consider this to be our “reference prior”. The associated results will enable us to see how much “influence” our chosen prior information had on the results given above.
The estimated m.p.c. is 0.89848, compared with the prior mean of 0.75 and the posterior mean of 0.8810. In this case of a single parameter for the mean function, the posterior mean lies between the prior mean and the mean of the likelihood function. (This doesn’t necessarily happen in every dimension when there is more than one regressor in the model.) The prior is “modified” by the data, via the likelihood function, to give us the posterior result.
Also notice that the prior variance for β was 0.005, while the variance of the posterior distribution for β is 0.00219. Adding the sample information to the prior information leads to a gain in estimation precision, as we’d expect.
Here are some results you get when you vary the prior information:
Hopefully, this example will be useful to some of you. Put on your Bayesian hat, and have fun!
[Technical Note: The likelihood function that’s plotted above is actually the marginal (or integrated) likelihood function. The full likelihood function is the joint data density, viewed as a function of the two parameters, β and σ. So that I can plot the likelihood here, I’ve marginalized it by integrating out σ. The integration to get the kernel of this function can be done analytically, using a similar approach to that used above to marginalize the posterior with respect to σ. Then the normalizing constant is computed numerically.]
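For anyone who wants to see the analytic step mentioned in the note, here is a sketch of it. I am assuming that the integration over σ is taken with respect to dσ on (0, ∞), which the note does not spell out, and writing S(β) = Σ (Ct – βYt)² as my own shorthand. The substitution u = S(β) / (2σ²) turns the integral into a Gamma function:

```latex
\begin{aligned}
L(\beta \mid \text{Data})
  &\propto \int_0^\infty \sigma^{-n}
     \exp\!\left[-\frac{S(\beta)}{2\sigma^2}\right] d\sigma \\
  &= \frac{1}{2}\,\Gamma\!\left(\frac{n-1}{2}\right)
     \left[\frac{2}{S(\beta)}\right]^{(n-1)/2} \\
  &\propto S(\beta)^{-(n-1)/2},
  \qquad S(\beta) = \sum_{t=1}^{n} (C_t - \beta Y_t)^2 .
\end{aligned}
```

Under that assumption the kernel depends on β only through the sum of squared residuals, so it peaks at the least squares estimate of β.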