An Example of a Calibrated Model that is not Fully Calibrated


In our last note we mentioned the possibility of “fully calibrated models.” This note gives an example of a probability model that is calibrated in the traditional sense, but not fully calibrated in a finer-grained sense.

First, let’s attach the wrapr package and generate our example data in R.

library(wrapr)
d <- build_frame(
   "x1"  , "x2", "y"   |
     1   , 1   , TRUE  |
     1   , 0   , FALSE |
     1   , 0   , TRUE  |
     1   , 1   , FALSE |
     0   , 0   , TRUE  |
     0   , 1   , TRUE  |
     0   , 1   , FALSE |
     0   , 0   , FALSE |
     0   , 0   , TRUE  )
# cat(wrapr::draw_frame(d))

knitr::kable(d)
 x1  x2  y
  1   1  TRUE
  1   0  FALSE
  1   0  TRUE
  1   1  FALSE
  0   0  TRUE
  0   1  TRUE
  0   1  FALSE
  0   0  FALSE
  0   0  TRUE

Now, we fit our logistic regression model.

model <- glm(y ~ x1 + x2, 
             data = d, 
             family = binomial())
summary(model)
## 
## Call:
## glm(formula = y ~ x1 + x2, family = binomial(), data = d)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.4213  -1.2572   0.9517   1.0996   1.2572  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   0.5572     1.0784   0.517    0.605
## x1           -0.3715     1.3644  -0.272    0.785
## x2           -0.3715     1.3644  -0.272    0.785
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 12.365  on 8  degrees of freedom
## Residual deviance: 12.201  on 6  degrees of freedom
## AIC: 18.201
## 
## Number of Fisher Scoring iterations: 4

We land our model predictions as a new column.

d$prediction <- predict(model, 
                        newdata = d, 
                        type = 'response')
knitr::kable(d)
 x1  x2  y      prediction
  1   1  TRUE    0.4537010
  1   0  FALSE   0.5462990
  1   0  TRUE    0.5462990
  1   1  FALSE   0.4537010
  0   0  TRUE    0.6358007
  0   1  TRUE    0.5462990
  0   1  FALSE   0.5462990
  0   0  FALSE   0.6358007
  0   0  TRUE    0.6358007

We can see this model is calibrated, or unbiased, in the sense that E[prediction] = E[outcome].

colMeans(d[, qc(y, prediction)]) %.>%
  knitr::kable(.)
                     x
y            0.5555556
prediction   0.5555556

And it is even calibrated in the stronger sense we expect from logistic regression: E[prediction * x] = E[outcome * x] for every explanatory variable x. These identities follow from the logistic regression score equations: at the maximum-likelihood solution the sum over rows of (outcome - prediction) * x is exactly zero for every model column x, including the all-ones intercept column (which gives the E[prediction] = E[outcome] identity above).

for(v in qc(x1, x2)) {
  print(paste0(
    v, ' diff: ', 
    mean(d[[v]] * d$prediction) - mean(d[[v]] * d$y)))
}
## [1] "x1 diff: 2.77555756156289e-17"
## [1] "x2 diff: 5.55111512312578e-17"

However, we can see this model is not “fully calibrated” in an additional sense requiring that E[outcome | prediction] = prediction for all observed values of prediction.

cal <- aggregate(y ~ prediction, data = d, FUN = mean)

knitr::kable(cal)
prediction          y
 0.4537010  0.5000000
 0.5462990  0.5000000
 0.6358007  0.6666667

We can re-calibrate such a model by mapping each distinct prediction to the observed mean outcome at that prediction (in practice you would want to fit this mapping on out-of-sample data, for example with isotonic regression, or use cross-frame methods to avoid nested model bias; a small isotonic sketch follows the table below).

# build a named lookup vector: prediction value -> observed mean outcome
cal_map <- cal$prediction := cal$y
d$calibrated <- cal_map[as.character(d$prediction)]

knitr::kable(d)
 x1  x2  y      prediction  calibrated
  1   1  TRUE    0.4537010   0.5000000
  1   0  FALSE   0.5462990   0.5000000
  1   0  TRUE    0.5462990   0.5000000
  1   1  FALSE   0.4537010   0.5000000
  0   0  TRUE    0.6358007   0.6666667
  0   1  TRUE    0.5462990   0.5000000
  0   1  FALSE   0.5462990   0.5000000
  0   0  FALSE   0.6358007   0.6666667
  0   0  TRUE    0.6358007   0.6666667
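
As an aside, here is a minimal sketch of what the isotonic regression re-calibration mentioned above could look like, using stats::isoreg(). For illustration it re-fits on the training data; in practice you would fit this map on held-out data, and convert the fit to a step function (for example via as.stepfun()) to score new rows. The iso_fit and iso_calibrated names are ours, not part of the original example.

# minimal sketch: isotonic re-calibration of the predictions (illustration only)
ord <- order(d$prediction)     # sort rows by prediction so isoreg()'s fitted values align simply
iso_fit <- isoreg(x = d$prediction[ord], y = as.numeric(d$y[ord]))
d$iso_calibrated <- numeric(nrow(d))
d$iso_calibrated[ord] <- iso_fit$yf   # yf: monotone fitted values in sorted order, mapped back to original rows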

These new calibrated predictions are also calibrated in the standard sense.

colMeans(d[, qc(y, prediction, calibrated)]) %.>%
  knitr::kable(.)
                     x
y            0.5555556
prediction   0.5555556
calibrated   0.5555556

And, at least in this case, it still obeys the explanatory roll-up conditions E[calibrated * x] = E[outcome * x].

for(v in qc(x1, x2)) {
  print(paste0(
    v, ' diff: ', 
    mean(d[[v]] * d$calibrated) - mean(d[[v]] * d$y)))
}
## [1] "x1 diff: 0"
## [1] "x2 diff: 0"

The new calibrated predictions even have lower deviance than the original predictions (note the first value below matches the residual deviance of 12.201 reported by summary(model) above).

# total binomial deviance: -2 times the log-likelihood of the predictions
deviance <- function(prediction, truth) {
  sum(-2 * (truth * log(prediction) + 
              (1 - truth) * log(1 - prediction)))
}
deviance(prediction = d$prediction, truth = d$y)
## [1] 12.20102
deviance(prediction = d$calibrated, truth = d$y)
## [1] 12.13685

The reason the original logistic regression model could not make the calibrated predictions is that the calibrated predictions are not a linear function of the explanatory variables in link (logit) space.
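
We can check this directly. If the calibrated predictions were of the form sigmoid(b0 + b1 * x1 + b2 * x2), then in logit space the effect of changing x2 could not depend on x1, so the second difference over x1 and x2 would have to be zero. A quick check (the logit() and p() helpers below are our own, not part of the original model code) shows it is not:

# helper: logit (link) transform
logit <- function(p) { log(p / (1 - p)) }
# helper: look up the calibrated prediction for a given (x1, x2) cell
p <- function(x1_val, x2_val) {
  unique(d$calibrated[d$x1 == x1_val & d$x2 == x2_val])
}
# second difference in link space; zero for any model linear in x1 and x2
logit(p(1, 1)) - logit(p(1, 0)) - logit(p(0, 1)) + logit(p(0, 0))

This comes out to log(2) (about 0.693), not zero, so no choice of intercept and coefficients on x1 and x2 can reproduce the calibrated values in logit space, which is why the logistic regression could not have returned them.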
