
Bias reduction in Poisson and Tobit regression

[This article was first published on Achim Zeileis, and kindly contributed to R-bloggers.]

While it is well-known that data separation can cause infinite estimates in binary regression models, the same is also true for other models with a point mass at the boundary (typically at zero), such as Poisson and Tobit regression. It is shown why this happens and how it can be remedied with bias-reduced estimation, along with implementations in R.

Citation

Köll S, Kosmidis I, Kleiber C, Zeileis A (2021). “Bias Reduction as a Remedy to the Consequences of Infinite Estimates in Poisson and Tobit Regression”, arXiv:2101.07141, arXiv.org E-Print Archive. https://arXiv.org/abs/2101.07141

Abstract

Data separation is a well-studied phenomenon that can cause problems in the estimation and inference from binary response models. Complete or quasi-complete separation occurs when there is a combination of regressors in the model whose value can perfectly predict one or both outcomes. In such cases, and such cases only, the maximum likelihood estimates and the corresponding standard errors are infinite. It is less widely known that the same can happen in further microeconometric models. One of the few works in the area is Santos Silva and Tenreyro (2010), who note that the finiteness of the maximum likelihood estimates in Poisson regression depends on the data configuration and propose a strategy to detect and overcome the consequences of data separation. However, their approach can lead to notable bias in the parameter estimates when the regressors are correlated. We illustrate how bias-reducing adjustments to the maximum likelihood score equations can overcome the consequences of separation in Poisson and Tobit regression models.

Software

R package brglm2 from CRAN: https://CRAN.R-project.org/package=brglm2

R package brtobit from R-Forge: https://R-Forge.R-project.org/R/?group_id=2305

Illustration

The simplest, but arguably often encountered, occurrence of data separation in practice is when there is a binary regressor such that the response y = 0 (or another boundary value) whenever the regressor is 1. Because P(y = 0) is monotonically decreasing in the linear predictor of the model, the coefficient of the binary regressor will diverge to minus infinity in order to push P(y = 0) in this subgroup as close to 1 as possible.
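To make this concrete for the Poisson case, consider a minimal numerical sketch (not part of the original post): with a log link the zero probability is P(y = 0) = exp(-exp(eta)), which approaches 1 as the linear predictor eta decreases towards minus infinity.

beta3 <- c(0, -2, -5, -10, -20)
eta <- 1 + beta3                       # linear predictor for x2 = 0, x3 = 1
round(dpois(0, lambda = exp(eta)), 4)  # P(y = 0) approaches 1 as beta3 decreases
## [1] 0.0660 0.6922 0.9819 0.9999 1.0000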

To illustrate this phenomenon in R for both Poisson and Tobit regression, we employ a simple data-generating process: In addition to the intercept, we generate a continuous regressor x2, uniformly distributed on [-1, 1], and a binary regressor x3. The latter comes from a Bernoulli distribution with probability 0.25 if x2 is positive and with probability 0.75 otherwise. Thus, x2 and x3 are negatively correlated.

The linear predictor employed for both Poisson and Tobit is: 1 + x2 - 10 x3, where the extreme coefficient of -10 ensures that there is almost certainly data separation. In the full paper linked above we also consider less extreme scenarios where separation may or may not occur. The Poisson response is then drawn from a Poisson distribution using a log link between mean and linear predictor. The Tobit response is drawn from a normal distribution with identity link and constant variance of 2, censored at zero. Here, we draw two samples of 100 observations each, one from each model:

dgp <- function(n = 100, coef = c(1, 1, -10, 2), prob = 0.25, dist = "poisson") {
  # continuous regressor, uniform on [-1, 1]
  x2 <- runif(n, -1, 1)
  # binary regressor, negatively correlated with x2:
  # P(x3 = 1) is 0.25 if x2 > 0 and 0.75 otherwise
  x3 <- rbinom(n, size = 1, prob = ifelse(x2 > 0, prob, 1 - prob))
  # response: Poisson with log link or normal with identity link and variance coef[4]
  y <- switch(match.arg(tolower(dist), c("poisson", "tobit")),
    "poisson" = rpois(n, exp(coef[1] + coef[2] * x2 + coef[3] * x3)),
    "tobit" = rnorm(n, mean = coef[1] + coef[2] * x2 + coef[3] * x3, sd = sqrt(coef[4]))
  )
  # censor at zero (a no-op for the Poisson draws, which are already nonnegative)
  y[y <= 0] <- 0
  data.frame(y, x2, x3)
}

set.seed(2020-10-29)
d1 <- dgp(dist = "poisson")
set.seed(2020-10-29)
d2 <- dgp(dist = "tobit")
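As a quick sanity check (not part of the original analysis), the negative correlation of x2 and x3 can be verified directly; by construction it is around -0.43 in expectation:

cor(d1$x2, d1$x3)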

Both of these data sets exhibit quasi-complete separation of y with respect to x3, i.e., y is always 0 if x3 is 1.

xtabs(~ x3 + factor(y == 0), data = d1)
##    factor(y == 0)
## x3  FALSE TRUE
##   0    47    8
##   1     0   45
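The analogous cross-tabulation for the Tobit data confirms the same pattern (d1 and d2 share the same regressors because the same seed was used for both):

xtabs(~ x3 + factor(y == 0), data = d2)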

We then compare four different modeling approaches in this situation:

- ML: Maximum likelihood estimation of the full model with both regressors x2 and x3.
- BR: Bias-reduced estimation of the same full model.
- ML/sub: ML estimation of the model without x3, using only the non-separated subset of observations (x3 = 0).
- ML/SST: ML estimation of the model without x3 on the full data set, the strategy suggested by Santos Silva and Tenreyro (2010).

For Poisson regression, all of these models can be fitted with the standard glm() function in R. To obtain the BR estimate, method = "brglmFit" from the brglm2 package (by Ioannis Kosmidis) can be plugged in.

install.packages("brglm2")
library("brglm2")

m12_ml <- glm(y ~ x2 + x3, data = d1, family = poisson)  # ML, full model
m12_br <- update(m12_ml, method = "brglmFit")            # BR, full model
m1_all <- glm(y ~ x2, data = d1, family = poisson)       # ML/SST: x3 omitted, all observations
m1_sub <- update(m1_all, subset = x3 == 0)               # ML/sub: x3 omitted, x3 == 0 subset only
m1 <- list("ML" = m12_ml, "BR" = m12_br, "ML/sub" = m1_sub, "ML/SST" = m1_all)

This yields the following results (shown with the wonderful modelsummary package):

library("modelsummary")
msummary(m1)
              ML           BR          ML/sub      ML/SST
(Intercept)   0.951        0.958       0.951       0.350
              (0.100)      (0.099)     (0.100)     (0.096)
x2            1.011        1.006       1.011       1.662
              (0.158)      (0.157)     (0.158)     (0.144)
x3            -20.907      -5.174
              (2242.463)   (1.416)
Num.Obs.      100          100         55          100
Log.Lik.      -107.364     -107.869    -107.364    -169.028

The following remarks can be made:

- ML, BR, and ML/sub all yield estimates close to the true values of 1 for both the intercept and x2.
- The ML estimate of the x3 coefficient has diverged towards minus infinity: the reported value of -20.907 is just an artifact of the stopping rule of the iterative fitting algorithm, and the huge standard error of 2242.463 renders Wald inference about x3 useless.
- The BR estimate of the x3 coefficient is finite, with a standard error of reasonable magnitude.
- ML/SST, which drops x3 but keeps all observations, is clearly biased for both the intercept and x2 because x2 and x3 are correlated.
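That the reported ML estimate of x3 is merely an artifact of the stopping rule can be checked directly (a quick verification, not part of the original analysis): tightening the convergence tolerance of glm() makes the estimate drift even further towards minus infinity while the fitted values barely change.

m12_xt <- update(m12_ml, control = glm.control(epsilon = 1e-14, maxit = 1000))
coef(m12_xt)["x3"]   # even more extreme than -20.907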

Moreover, more extensive simulation experiments in the paper show that the BR estimates are always finite and result in Wald-type intervals with better coverage probabilities.
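To give a flavor of such an experiment, here is a minimal simulation sketch based on the dgp() function from above (a simplified stand-in, not the paper's actual study design): it compares the empirical coverage of nominal 95% Wald intervals for the x2 coefficient under ML and BR estimation.

cover <- function(m, true = 1) {
  s <- coef(summary(m))["x2", c("Estimate", "Std. Error")]
  abs(s[1] - true) <= qnorm(0.975) * s[2]
}
res <- replicate(500, {
  d <- dgp(dist = "poisson")
  m_ml <- glm(y ~ x2 + x3, data = d, family = poisson)
  c(ML = cover(m_ml), BR = cover(update(m_ml, method = "brglmFit")))
})
rowMeans(res)   # empirical coverage, nominally 0.95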

Analogous results can be obtained for Tobit regression with our brtobit package, currently available from R-Forge. It provides both ML and BR estimation for homoscedastic Tobit models. (Some tools are re-used from our crch package, which implements various estimation techniques, albeit not BR, for Tobit models with conditional heteroscedasticity.) Below we fit the same four models as in the Poisson case above.

install.packages("brtobit", repos = "http://R-Forge.R-project.org")
library("brtobit")

m22_ml <- brtobit(y ~ x2 + x3, data = d2, type = "ML", fsmaxit = 28)  # ML, full model (scoring stopped early, see below)
m22_br <- brtobit(y ~ x2 + x3, data = d2, type = "BR")                # BR, full model
m2_all <- brtobit(y ~ x2, data = d2, type = "ML")                     # ML/SST: x3 omitted, all observations
m2_sub <- update(m2_all, subset = x3 == 0)                            # ML/sub: x3 omitted, x3 == 0 subset only
m2 <- list("ML" = m22_ml, "BR" = m22_br, "ML/sub" = m2_sub, "ML/SST" = m2_all)

Because brtobit does not yet provide a direct interface for modelsummary (via broom), we go through the coeftest() results from the lmtest package as an intermediate step. These can then be rendered by modelsummary:

library("lmtest")
m2 <- lapply(m2, coeftest)
msummary(m2)
              ML            BR          ML/sub      ML/SST
(Intercept)   1.135         1.142       1.135       -0.125
              (0.208)       (0.210)     (0.208)     (0.251)
x2            0.719         0.705       0.719       2.074
              (0.364)       (0.359)     (0.364)     (0.404)
x3            -11.238       -4.218
              (60452.270)   (0.891)
(Variance)    1.912         1.970       1.912       3.440
              (0.422)       (0.434)     (0.422)     (0.795)
Num.Obs.      100           100         55          100
Log.Lik.      -87.633       -88.101     -87.633     -118.935

The results show exactly the same pattern as for the Poisson regression above: ML, BR, and ML/sub yield results close to the true coefficients for the intercept, x2, and the variance, while the ML/SST estimates are far from the true values. For x3, only the BR estimate is finite while the ML estimate diverges towards minus infinity. In fact, the estimates would have diverged even further if we had not stopped the Fisher scoring early (via fsmaxit = 28 instead of the default 100).
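Again, this can be verified directly (a quick check, not part of the original analysis), assuming update() re-fits the model with the modified argument just as for the subset model above:

coef(update(m22_ml, fsmaxit = 100))["x3"]   # drifts even further towards minus infinity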

Overall, this clearly indicates that bias-reduced (BR) estimation is a convenient way to avoid infinite estimates and standard errors in these models and to enable standard inference even when data separation occurs. In contrast, the common recommendation to omit the regressor associated with the separation should either be avoided entirely or be applied to the non-separated subset of observations only; otherwise it can give misleading results when regressors are correlated.
