(This article was first published on **Econometrics Beat: Dave Giles' Blog**, and kindly contributed to R-bloggers)

My previous post, on **Allocation Models**, drew a comment to the effect that in such models the dependent variables take values that must be non-negative fractions. Well, as I responded, that’s true *sometimes* (e.g., in the case of market shares), but not in other cases, such as the Engel curve example that I mentioned in the post.

When the dependent variables do have to be fractions, we *could* use OLS and then check that all of the within-sample predicted values are between zero and one. Better still, we could use a more suitable estimator, one that takes the restriction on the data values into account.

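With the Beta regression model (Ferrari and Cribari-Neto, 2004), estimated by maximum likelihood, the “adding up” results from the previous post are satisfied *automatically*, just as they are under OLS estimation.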

First, consider a random variable, Y, which follows a Beta distribution, with shape parameters p and q, so that its density is:

f(y | p , q) = Γ(p + q) / [Γ(p) Γ(q)] y^{p – 1} (1 – y)^{q – 1} ; p, q > 0 ; 0 < y < 1

Now re-parameterize the distribution, using

μ = p / (p + q) ; where 0 < μ < 1

φ = (p + q) ; where φ > 0 .

The density of Y is now:

f(y | μ, φ) = Γ(φ) / [Γ(μφ) Γ(φ(1 – μ))] y^{μφ – 1} (1 – y)^{(1 – μ)φ – 1} ,

and E[Y] = μ ; var.[Y] = μ(1 – μ) / (1 + φ). So, for a given mean, a larger value of φ implies a smaller variance.
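As a quick check on the re-parameterization, here is a minimal R sketch that maps between (p, q) and (μ, φ) and compares simulated moments with E[Y] = μ and var.[Y] = μ(1 – μ) / (1 + φ). The helper names `to_mu_phi` and `to_p_q`, and the values μ = 0.3 and φ = 10, are purely illustrative and are not taken from the post.

```r
# Convert between the (p, q) and (mu, phi) parameterizations of the
# Beta distribution. Helper names and numbers are hypothetical.
to_mu_phi <- function(p, q) list(mu = p / (p + q), phi = p + q)
to_p_q    <- function(mu, phi) list(p = mu * phi, q = (1 - mu) * phi)

pq <- to_p_q(mu = 0.3, phi = 10)   # corresponds to p = 3, q = 7

set.seed(123)
y <- rbeta(1e6, shape1 = pq$p, shape2 = pq$q)

sim_mean <- mean(y)                        # should be close to mu
sim_var  <- var(y)                         # should be close to theo_var
theo_var <- 0.3 * (1 - 0.3) / (1 + 10)     # mu * (1 - mu) / (1 + phi)

c(sim_mean = sim_mean, sim_var = sim_var, theo_var = theo_var)
```

Going back the other way, `to_mu_phi(3, 7)` recovers μ = 0.3 and φ = 10.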

This re-parameterization is useful because it lets us model the *mean* of Y. After all, this is what happens in a linear regression model, and it’s also what we do in, say, a Poisson regression model. The mean will then vary from observation to observation.

Specifically, we set g(μ_{i}) = x_{i}'β , where β is a (k x 1) vector of parameters, and x_{i}' is a row vector giving the ith observation on each of the regressors. Various link functions, g( . ), can be used. A particularly convenient one is the logit link, for which:

μ_{i} = exp(x_{i}'β) / [1 + exp(x_{i}'β)] ; i = 1, 2, … , n.
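To make the logit link concrete, the following R sketch computes μ_{i} from x_{i}'β and checks it against base R's `plogis()` (the inverse logit) and `qlogis()` (the logit). The coefficient vector and regressor values are hypothetical numbers chosen only for illustration.

```r
# Logit link: g(mu) = log(mu / (1 - mu)). Its inverse maps the linear
# predictor x'beta into the open interval (0, 1).
beta <- c(-0.5, 1.2)        # hypothetical (intercept, slope)
x    <- c(-3, 0, 0.5, 4)    # hypothetical regressor values
X    <- cbind(1, x)         # design matrix with an intercept column

eta <- drop(X %*% beta)             # linear predictor x_i'beta
mu  <- exp(eta) / (1 + exp(eta))    # inverse logit, as in the text

all.equal(mu, plogis(eta))   # plogis() is the same inverse logit
all.equal(qlogis(mu), eta)   # qlogis() recovers the linear predictor
range(mu)                    # always strictly between 0 and 1
```

Whatever values β and x take, the implied means stay inside the unit interval, which is what makes this link convenient for share data.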

The contribution of the ith observation to the log-likelihood function is:

*l*_{i}(μ_{i}, φ) = logΓ(φ) – logΓ(μ_{i}φ) – logΓ[(1 – μ_{i})φ] + (μ_{i}φ – 1) log(y_{i}) + [(1 – μ_{i})φ – 1] log(1 – y_{i}) .
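The log-likelihood contribution can be coded term by term and cross-checked against R's built-in Beta density, using p = μφ and q = (1 – μ)φ. This is only a sketch: the function name `beta_loglik_i` and the data values are made up for the example.

```r
# Log-likelihood contribution of each observation under the (mu, phi)
# parameterization, written term by term.
beta_loglik_i <- function(y, mu, phi) {
  lgamma(phi) - lgamma(mu * phi) - lgamma((1 - mu) * phi) +
    (mu * phi - 1) * log(y) + ((1 - mu) * phi - 1) * log(1 - y)
}

y   <- c(0.12, 0.48, 0.90)   # hypothetical observations
mu  <- c(0.20, 0.50, 0.80)   # hypothetical means
phi <- 15                    # hypothetical value of phi

ours <- beta_loglik_i(y, mu, phi)
ref  <- dbeta(y, shape1 = mu * phi, shape2 = (1 - mu) * phi, log = TRUE)

all.equal(ours, ref)   # the two calculations agree
```

Summing these contributions over i gives the function that is maximized with respect to β and φ.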

Suppose that y_{1} and y_{2} are “share” variables, so 0 ≤ y_{ji} ≤ 1 ; for j = 1, 2 ; and i = 1, 2, …, n. (We could change the weak inequalities to strong inequalities without affecting anything, because the y’s are going to be continuous random variables.)

Also, (y_{1i} + y_{2i}) = 1 ; for all i. Notice that as we have an intercept in each equation, at each point in the sample the two dependent variables sum to one of the regressors (the constant, which equals one). Now, let’s see what happens when we apply Beta regression to this simple allocation model.
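Before turning to the actual application, here is a rough sketch of the idea on simulated data (not the dataset used in this post). Each equation is estimated separately by maximum likelihood with a logit link, using base R's `optim()`. The sample size, the "true" parameter values, and the function names `negll` and `fit` are all assumptions made for the illustration.

```r
# Two-share allocation model on simulated data, estimated equation by
# equation by ML with a logit link. All numbers are hypothetical.
set.seed(42)
n  <- 500
x  <- runif(n, 1, 5)
mu <- plogis(0.5 - 0.4 * x)              # assumed true mean of y1
y1 <- rbeta(n, mu * 20, (1 - mu) * 20)   # assumed phi = 20
y2 <- 1 - y1                             # the two shares sum to one

# Average negative log-likelihood (same maximizer as the sum);
# theta = (alpha, beta, log(phi)).
negll <- function(theta, y, x) {
  m   <- plogis(theta[1] + theta[2] * x)
  phi <- exp(theta[3])
  -mean(dbeta(y, m * phi, (1 - m) * phi, log = TRUE))
}

fit <- function(y) {
  optim(c(0, 0, 0), negll, y = y, x = x, method = "BFGS",
        control = list(reltol = 1e-12, maxit = 1000))$par
}

est1 <- fit(y1)
est2 <- fit(y2)
rbind(eq1 = est1, eq2 = est2)   # intercepts and slopes mirror in sign

fitted_sum <- plogis(est1[1] + est1[2] * x) + plogis(est2[1] + est2[2] * x)
range(fitted_sum)               # numerically equal to one
```

The reason this works is that if Y follows a Beta distribution with parameters (μ, φ), then (1 – Y) follows a Beta distribution with parameters (1 – μ, φ). With the logit link, 1 – μ_{i} corresponds to replacing x_{i}'β with –x_{i}'β, so the second equation's estimates are the negatives of the first equation's, the two estimates of φ coincide, and the fitted means sum to one. Fitting the same two equations with `betareg(y1 ~ x, link = "logit")` and `betareg(y2 ~ x, link = "logit")` from the *betareg* package should give essentially the same point estimates.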

Without going through the theory, let’s consider an empirical application of MLE in the context that we’re considering here. The EViews workfile and the R code that I’ve used are both on the **code page** for this blog, and the (artificial) data are available on the **data page**.

First, using EViews.

I’ve called the intercepts α_{1} and α_{2} for the two equations, and I’ve called the coefficients of the x regressor β_{1} and β_{2} in the two equations. φ_{1} and φ_{2} are the scale parameters.

I’ve created LOGL objects for each equation. The first one looks like this:

The second one has exactly the same style.

Here are MLE results for the first equation:

What if the means are specified as linear functions of x, with μ_{1i} = α_{1} + β_{1}x_{i}, and μ_{2i} = α_{2} + β_{2}x_{i}? In this case, the results we get are:
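As a rough illustration of the linear-mean case on simulated data (again, not the post's dataset), the sketch below maximizes the Beta log-likelihood with μ_{i} = α + βx_{i} using base R's `optim()`. The parameter values, the OLS starting values, the large penalty value returned when a mean leaves (0, 1), and the names `negll_id` and `fit_id` are all assumptions of this example.

```r
# Beta ML with a linear (identity-link) mean, equation by equation,
# on simulated data. All numbers are hypothetical.
set.seed(7)
n  <- 500
x  <- runif(n, 0, 1)
mu <- 0.3 + 0.4 * x                      # assumed true linear mean
y1 <- rbeta(n, mu * 30, (1 - mu) * 30)   # assumed phi = 30
y2 <- 1 - y1

# Average negative log-likelihood; theta = (alpha, beta, log(phi)).
negll_id <- function(theta, y, x) {
  m <- theta[1] + theta[2] * x
  if (any(m <= 0 | m >= 1)) return(1e10)   # keep means inside (0, 1)
  phi <- exp(theta[3])
  -mean(dbeta(y, m * phi, (1 - m) * phi, log = TRUE))
}

fit_id <- function(y) {
  start <- c(unname(coef(lm(y ~ x))), log(10))   # OLS starting values
  optim(start, negll_id, y = y, x = x, method = "BFGS")$par
}

e1 <- fit_id(y1)
e2 <- fit_id(y2)

e1[1] + e2[1]   # intercepts sum to (approximately) one
e1[2] + e2[2]   # slopes sum to (approximately) zero
```

Here the adding-up result takes the same form as under OLS: because 1 – (α + βx_{i}) = (1 – α) – βx_{i}, the estimated intercepts sum to one and the estimated slopes sum to zero.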

Next, in R, using the *betareg* package (Cribari-Neto and Zeileis, 2010). The R code is **here**. Here are the results, using the logit link function:

and

So, there we have it! You don’t have to use OLS to get the “adding up” results mentioned in the previous post. You can use Beta regression and MLE to allow for the fact that the dependent variables may be “shares”, and the results still hold.

**References**

**Ferrari, S. L. P. and F. Cribari-Neto**, 2004. Beta regression for modelling rates and proportions. *Journal of Applied Statistics*, 31, 799-815.

**Cribari-Neto, F. and A. Zeileis**, 2010. Beta regression in R. *Journal of Statistical Software*, 34(2).
