Allocation Models With Bounded Dependent Variables

Posted on July 5, 2013 by Dave Giles in R bloggers

[This article was first published on Econometrics Beat: Dave Giles' Blog, and kindly contributed to R-bloggers.]

My post yesterday, on Allocation Models, drew a comment to the effect that in such models the dependent variables take values that must be non-negative fractions. Well, as I responded, that’s true sometimes (e.g., in the case of market shares), but not in other cases, such as the Engel curve example that I mentioned in the post.

The anonymous comment was rather terse, but I’m presuming that the point that was intended is that if the y variables have to be positive fractions, we wouldn’t want to use OLS. Ideally, that’s so. Of course, we could use OLS and then check that all of the within-sample predicted values are between zero and one. Better still, we could use a more suitable estimator – one that takes the restriction on the data values into account.

One obvious solution is to assume that the errors, and hence the y values, follow a Beta distribution, and then estimate the equations by MLE. As I noted in my response to the comment, the “adding up” restrictions that are needed on the parameters will be satisfied automatically, just as they are under OLS estimation.

Here’s a demonstration of this.

First, consider a random variable, Y, which follows a Beta distribution, with shape parameters p and q, so that its density is:

f(y | p , q) = Γ(p + q) / [Γ(p) Γ(q)] y^{p – 1} (1 – y)^{q – 1} ; p, q > 0 ; 0 < y < 1

Now re-parameterize the distribution, using

μ = p / (p + q) ; where 0 < μ < 1
φ = (p + q) ; where φ > 0 .

The density of Y is now:

f(y | μ, φ) = Γ(φ) / [Γ(μφ) Γ(φ(1 – μ))] y^{μφ – 1} (1 – y)^{(1 – μ)φ – 1} ,

and E[Y] = μ ; var[Y] = μ(1 – μ) / (1 + φ).
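As a quick numerical sanity check of this re-parameterization, here is a minimal Python sketch (standard library only). The values p = 2 and q = 6, and all function names, are illustrative assumptions and are not taken from the EViews or R code for this post. It checks that the two forms of the density coincide, and that the mean and variance obtained by numerical integration match the standard Beta(p, q) results.

```python
import math

def beta_pdf_pq(y, p, q):
    """Beta density in the original (p, q) parameterization."""
    log_norm = math.lgamma(p + q) - math.lgamma(p) - math.lgamma(q)
    return math.exp(log_norm + (p - 1.0) * math.log(y) + (q - 1.0) * math.log(1.0 - y))

def beta_pdf_mu_phi(y, mu, phi):
    """Beta density in the (mu, phi) parameterization."""
    log_norm = math.lgamma(phi) - math.lgamma(mu * phi) - math.lgamma((1.0 - mu) * phi)
    return math.exp(log_norm
                    + (mu * phi - 1.0) * math.log(y)
                    + ((1.0 - mu) * phi - 1.0) * math.log(1.0 - y))

# Illustrative shape parameters (an assumption, chosen only for this check).
p, q = 2.0, 6.0
mu, phi = p / (p + q), p + q

# 1. The two forms of the density should agree everywhere on (0, 1).
grid = [k / 1000.0 for k in range(1, 1000)]
max_gap = max(abs(beta_pdf_pq(y, p, q) - beta_pdf_mu_phi(y, mu, phi)) for y in grid)

# 2. Mean and variance by the midpoint rule on (0, 1).
m = 20000
mids = [(k + 0.5) / m for k in range(m)]
mean_numeric = sum(y * beta_pdf_mu_phi(y, mu, phi) for y in mids) / m
var_numeric = sum((y - mean_numeric) ** 2 * beta_pdf_mu_phi(y, mu, phi) for y in mids) / m

# Standard closed-form variance of a Beta(p, q) random variable.
var_exact = p * q / ((p + q) ** 2 * (p + q + 1.0))

print("max density gap:", max_gap)
print("mean (numeric), mu:", mean_numeric, mu)
print("variance (numeric), variance (closed form):", var_numeric, var_exact)
```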

Then, following Ferrari and Cribari-Neto (2004), we can introduce regressors to explain the mean of Y. After all, this is what happens in a linear regression model, and it’s also what we do in, say, a Poisson regression model. The mean will then vary from observation to observation.

Specifically, let g(μ_i) = x_i'β , where β is a (k × 1) vector of parameters, and x_i' is a row vector giving the ith observation on each of the regressors. Various link functions, g( . ), can be used. A particularly convenient one is the logit link, g(μ) = log[μ / (1 – μ)], which gives:

μ_i = exp(x_i'β) / [1 + exp(x_i'β)] ; i = 1, 2, … , n.

The contribution of the ith observation to the log-likelihood function can be shown to be:

l_i(μ_i , φ) = logΓ(φ) – logΓ(μ_iφ) – logΓ[(1 – μ_i)φ] + (μ_iφ – 1) log(y_i) + [(1 – μ_i)φ – 1] log(1 – y_i) .

It’s then straightforward to obtain the MLEs of φ and the elements of β by numerical methods.
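To make "numerical methods" concrete, here is a minimal Python sketch (standard library only) that codes l_i with the logit link and maximizes the log-likelihood for a model with an intercept and one regressor. Several things here are assumptions made purely for illustration: the simulated data (not the data set used in this post), the "true" parameter values, the function names, and the small hand-written Nelder-Mead simplex search. The latter is a stand-in chosen to avoid dependencies; it is not claimed to be the algorithm that EViews or betareg uses.

```python
import math
import random

def inv_logit(eta):
    """Inverse of the logit link: mu = exp(eta) / (1 + exp(eta))."""
    return 1.0 / (1.0 + math.exp(-eta))

def loglik_i(y, mu, phi):
    """Log-likelihood contribution of a single observation."""
    return (math.lgamma(phi) - math.lgamma(mu * phi) - math.lgamma((1.0 - mu) * phi)
            + (mu * phi - 1.0) * math.log(y)
            + ((1.0 - mu) * phi - 1.0) * math.log(1.0 - y))

def neg_loglik(theta, x, y):
    """Negative log-likelihood; theta = (alpha, beta, log(phi)), which keeps phi > 0."""
    alpha, beta, log_phi = theta
    try:
        phi = math.exp(log_phi)
        value = -sum(loglik_i(yi, inv_logit(alpha + beta * xi), phi)
                     for xi, yi in zip(x, y))
    except (ValueError, OverflowError):
        return math.inf  # outside the numerically usable region
    return value if math.isfinite(value) else math.inf

def nelder_mead(f, start, step=0.5, tol=1e-10, max_iter=2000):
    """Basic Nelder-Mead simplex minimizer (illustrative, not production code)."""
    n = len(start)
    simplex = [list(start)]
    for j in range(n):
        point = list(start)
        point[j] += step
        simplex.append(point)
    values = [f(pt) for pt in simplex]
    for _ in range(max_iter):
        order = sorted(range(n + 1), key=lambda k: values[k])
        simplex = [simplex[k] for k in order]
        values = [values[k] for k in order]
        if abs(values[-1] - values[0]) < tol:
            break
        worst = simplex[-1]
        centroid = [sum(pt[j] for pt in simplex[:-1]) / n for j in range(n)]
        reflect = [c + (c - w) for c, w in zip(centroid, worst)]
        f_reflect = f(reflect)
        if f_reflect < values[0]:
            expand = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            f_expand = f(expand)
            if f_expand < f_reflect:
                simplex[-1], values[-1] = expand, f_expand
            else:
                simplex[-1], values[-1] = reflect, f_reflect
        elif f_reflect < values[-2]:
            simplex[-1], values[-1] = reflect, f_reflect
        else:
            contract = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            f_contract = f(contract)
            if f_contract < values[-1]:
                simplex[-1], values[-1] = contract, f_contract
            else:
                # Shrink every point halfway towards the current best point.
                best = simplex[0]
                simplex = [best] + [[b + 0.5 * (c - b) for b, c in zip(best, pt)]
                                    for pt in simplex[1:]]
                values = [values[0]] + [f(pt) for pt in simplex[1:]]
    best_index = min(range(n + 1), key=lambda k: values[k])
    return simplex[best_index], values[best_index]

# Artificial data (assumed values, for illustration only).
random.seed(2013)
n = 400
true_alpha, true_beta, true_phi = 0.5, -1.0, 20.0
x = [random.uniform(0.0, 2.0) for _ in range(n)]
y = []
for xi in x:
    mu_i = inv_logit(true_alpha + true_beta * xi)
    y.append(random.betavariate(mu_i * true_phi, (1.0 - mu_i) * true_phi))

theta_hat, nll_hat = nelder_mead(lambda t: neg_loglik(t, x, y), [0.0, 0.0, 0.0])
alpha_hat, beta_hat, phi_hat = theta_hat[0], theta_hat[1], math.exp(theta_hat[2])
print("alpha, beta, phi estimates:", alpha_hat, beta_hat, phi_hat)
```

Optimizing over log(φ) rather than φ is a design choice that keeps the search unconstrained while guaranteeing φ > 0.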

Let’s start off where we did in the previous post, with a simple example involving a 2-equation allocation model. The regressors will just be an intercept and a single variable, x, but this simplification doesn’t affect anything.

I’ll assume that y₁ and y₂ are “share” variables, so 0 ≤ y_ji ≤ 1 ; for j = 1, 2 ; and i = 1, 2, …, n. (We could change the weak inequalities to strict inequalities without affecting anything, because the y’s are going to be continuous random variables.)

Also, (y_1i + y_2i) = 1 ; for all i. Notice that, as we have an intercept in each equation, at each point in the sample the two dependent variables sum to the value of one of the regressors (the intercept, which always equals one). Now, let’s see what happens when we apply Beta regression to this simple allocation model.

Without going through the theory, let’s consider an empirical application of MLE in the context that we’re considering here. The EViews workfile and the R code that I’ve used are both on the code page for this blog, and the (artificial) data are available on the data page.

First, using EViews.

I’ve called the intercept coefficients α₁ and α₂ for the two equations; and I’ve called the coefficients of the x regressor β₁ and β₂ in the two equations. φ₁ and φ₂ are the scale parameters.

I’ve created LOGL objects for each equation. The first one looks like this:

The second one has exactly the same style.

Here are MLE results for the first equation:

and the second equation:

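Notice that the estimates of the two intercept coefficients sum to zero, and so do the estimates of the two slope coefficients. This is correct. Remember that we used the logit link function: if μ_1i = exp(z_i) / [1 + exp(z_i)], with z_i = α₁ + β₁x_i, then μ_2i = 1 – μ_1i = exp(–z_i) / [1 + exp(–z_i)], so the coefficients in the second equation are the negatives of those in the first. Equivalently, the two linear predictors sum to zero, and exp(0) = 1.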

Also, notice that the estimates of the two scale parameters are the same in each equation. This corresponds to the singular covariance matrix that we saw in the earlier post. There are two equations, but only one scale parameter can be estimated freely.
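One way to see why both results must hold is that, when y₂ = 1 – y₁, the log-likelihood for the second equation evaluated at (–α, –β, φ) is identical to the log-likelihood for the first equation evaluated at (α, β, φ). The two maximizations are therefore mirror images of each other. The following Python sketch (standard library only) checks this identity numerically at randomly drawn parameter values. The simulated shares, the parameter ranges and the function names are all assumptions made for illustration; they are not the data or code used above.

```python
import math
import random

def inv_logit(eta):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))

def loglik(alpha, beta, phi, x, y):
    """Total Beta-regression log-likelihood with a logit link."""
    total = 0.0
    for xi, yi in zip(x, y):
        mu = inv_logit(alpha + beta * xi)
        total += (math.lgamma(phi) - math.lgamma(mu * phi)
                  - math.lgamma((1.0 - mu) * phi)
                  + (mu * phi - 1.0) * math.log(yi)
                  + ((1.0 - mu) * phi - 1.0) * math.log(1.0 - yi))
    return total

# Artificial shares: any values strictly inside (0, 1) will do for this check.
random.seed(1)
x = [random.uniform(0.0, 2.0) for _ in range(200)]
y1 = [random.betavariate(3.0, 5.0) for _ in x]
y2 = [1.0 - v for v in y1]  # the two shares sum to one at every observation

# Compare the two log-likelihoods at many arbitrary parameter values.
worst_gap = 0.0
for _ in range(100):
    a = random.uniform(-2.0, 2.0)
    b = random.uniform(-2.0, 2.0)
    phi = random.uniform(0.5, 50.0)
    gap = abs(loglik(a, b, phi, x, y1) - loglik(-a, -b, phi, x, y2))
    worst_gap = max(worst_gap, gap)

print("largest difference between the two log-likelihoods:", worst_gap)
```

Any difference reported is floating-point rounding only, so whatever (α, β, φ) maximizes the first log-likelihood, (–α, –β, φ) maximizes the second.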

What if we didn’t use the logit link function, but simply specified the means as μ_1i = α₁ + β₁x_i, and μ_2i = α₂ + β₂x_i? In this case, the results we get are:

and

In this case, the intercept coefficients sum to one, the slopes sum to zero, and once again the scale parameter estimates are identical across the equations.
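The same kind of argument applies with the identity link: when y₂ = 1 – y₁, the second equation's log-likelihood at (1 – α, –β, φ) equals the first equation's log-likelihood at (α, β, φ), and the two mean functions then sum to one at every observation. Here is a minimal Python sketch of that check (standard library only). The simulated shares, the parameter ranges (chosen so that the means stay inside the unit interval) and the function names are illustrative assumptions, not the data or code used in this post.

```python
import math
import random

def loglik_identity(alpha, beta, phi, x, y):
    """Total Beta-regression log-likelihood with mean mu_i = alpha + beta * x_i."""
    total = 0.0
    for xi, yi in zip(x, y):
        mu = alpha + beta * xi  # identity link; must lie strictly inside (0, 1)
        total += (math.lgamma(phi) - math.lgamma(mu * phi)
                  - math.lgamma((1.0 - mu) * phi)
                  + (mu * phi - 1.0) * math.log(yi)
                  + ((1.0 - mu) * phi - 1.0) * math.log(1.0 - yi))
    return total

# Artificial shares: any values strictly inside (0, 1) will do for this check.
random.seed(7)
x = [random.uniform(0.0, 2.0) for _ in range(200)]
y1 = [random.betavariate(3.0, 5.0) for _ in x]
y2 = [1.0 - v for v in y1]

worst_gap = 0.0        # largest difference between the two log-likelihoods
worst_sum_error = 0.0  # largest departure of the summed means from one
for _ in range(100):
    # Ranges chosen so that 0.2 < alpha + beta * x < 0.9 for x in [0, 2].
    a = random.uniform(0.2, 0.4)
    b = random.uniform(0.05, 0.25)
    phi = random.uniform(0.5, 50.0)
    gap = abs(loglik_identity(a, b, phi, x, y1)
              - loglik_identity(1.0 - a, -b, phi, x, y2))
    worst_gap = max(worst_gap, gap)
    for xi in x:
        mean_sum = (a + b * xi) + ((1.0 - a) - b * xi)
        worst_sum_error = max(worst_sum_error, abs(mean_sum - 1.0))

print("largest log-likelihood difference:", worst_gap)
print("largest error in the summed means:", worst_sum_error)
```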

The predicted mean functions sum to one across the two equations, regardless of the link function we use:

Now, let’s repeat the exercise using R. Specifically we’re going to use the betareg package (Cribari-Neto and Zeileis, 2010). The R code is here. Here are the results, using the logit link function:

and

So, there we have it! You don’t have to use OLS to get the “adding up” results mentioned in the previous post. You can use Beta regression and MLE to allow for the fact that the dependent variables may be “shares”, and the results still hold.

References

Ferrari, S. L. P. and F. Cribari-Neto, 2004. Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31, 799-815.

Cribari-Neto, F. and A. Zeileis, 2010. Beta regression in R. Journal of Statistical Software, 34(2).
