[This article was first published on Statistical Modeling, Causal Inference, and Social Science, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

When predicting 0/1 data we can use logit (or probit or robit or some other robust model such as invlogit (0.01 + 0.98*X*beta)). Logit is simple enough and we can use bayesglm to regularize and avoid the problem of separation.

What if there are more than 2 categories? If they’re ordered (1, 2, 3, etc), we can do ordered logit (and use bayespolr() to avoid separation). If the categories are unordered (vanilla, chocolate, strawberry), there are unordered multinomial logit and probit models out there.

But it’s not so easy to fit these multinomial model in a multilevel setting (with coefficients that vary by group), especially if the computation is embedded in an iterative routine such as mi where you have real time constraints at each step.

So this got me wondering whether we could kluge it with logits. Here’s the basic idea (in the ordered and unordered forms):

– If you have a variable that goes 1, 2, 3, etc., set up a series of logits: 1 vs. 2,3,…; 2 vs. 3,…; and so forth. Fit each one with bayesglm (or whatever). The usual ordered logit is a special case of this model in which the coefficients (except for the constant term) are the same for each model. (At least, I’m guessing that’s what’s happening; if not, there are some equivalently-dimensioned constraints.) The simple alternative has no constraints. intermediate versions could link the models with soft constraints, some prior distribution on the coefficients. (This would need some hyperparameters but these could be estimated too if the whole model is being fit in some iterative context.)

– If you have a vanilla-chocolate-strawberry variable, do the same thing; just order the categories first, either in some reasonable way based on substantive information or else using some automatic rule such as putting the categories in decreasing order of frequency in the data. In any case, you’d first predict the probability of being in category 1, then the probability of being in 2 (given that you’re not in 1), then the probability of 3 (given not 1 or 2), and so forth.

Depending on your control over the problem, you could choose how to model the variables. For example, in a political survey with some missing ethnicity responses, you might model that variable as ordered: white/other/hispanic/black. In other contexts you might go unordered.

I recognized that my patchwork of logits is a bit of a hack, but I like its flexibility, as well as the computational simplicity of building it out of simple logits. Maybe there’s some literature on this (perhaps explaining why it’s a bad idea)?? I’d appreciate any comments or suggestions.