Robust logistic regression

June 7, 2013

(This article was first published on Statistical Modeling, Causal Inference, and Social Science » R, and kindly contributed to R-bloggers)

Corey Yanofsky writes:

In your work, you’ve robustificated logistic regression by having the logit function saturate at, e.g., 0.01 and 0.99, instead of 0 and 1. Do you have any thoughts on a sensible setting for the saturation values? My intuition suggests that it has something to do with the proportion of outliers expected in the data (assuming a reasonable model fit).

It would be desirable to have them fit in the model, but my intuition is that integrability of the posterior distribution might become an issue.

My reply: It should be no problem to put these saturation values in the model; I bet it would work fine in Stan if you give them uniform(0, 0.1) priors or something like that. Or you could just fit the robit model.
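To make the idea concrete, here is a minimal sketch in plain Python (not Stan) of what "putting the saturation values in the model" could look like. The names `eps0` and `eps1`, the use of two separate saturation parameters, and the flat prior on the coefficients are all assumptions made for illustration; they are not taken from any particular implementation.

```python
import math

def inv_logit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def saturated_prob(z, eps0, eps1):
    """Pr(y = 1) with the logistic curve squeezed into [eps0, 1 - eps1]."""
    return eps0 + (1.0 - eps0 - eps1) * inv_logit(z)

def log_posterior(beta, eps0, eps1, X, y, upper=0.1):
    """Unnormalized log posterior for the saturated logistic model.

    Assumptions (hypothetical, for illustration only):
      - flat prior on the coefficients beta
      - independent uniform(0, upper) priors on eps0 and eps1
    """
    # Outside the support of the uniform priors the density is zero.
    if not (0.0 < eps0 < upper and 0.0 < eps1 < upper):
        return -math.inf
    total = 0.0
    for row, yi in zip(X, y):
        z = sum(b * x for b, x in zip(beta, row))
        p = saturated_prob(z, eps0, eps1)
        # p is strictly inside (0, 1), so both logs are finite.
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total
```

Because the fitted probability can never drop below `eps0` or rise above `1 - eps1`, a single wildly misclassified point contributes at most about `-log(eps0)` or `-log(eps1)` to the negative log likelihood, which is where the robustness comes from. With the saturation values bounded to (0, 0.1), the likelihood is bounded in those parameters, so they do not create an integrability problem on their own.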

And this reminds me . . . I’ve been told that when Stan’s on its optimization setting, it fits generalized linear models just about as fast as regular glm or bayesglm in R. This suggests to me that we should have some precompiled regression models in Stan; then we could run all those regressions that way and feel free to use whatever priors we want.

