
Modellers/analysts developing credit scores generally use something known as the gains table (or a KS table) to measure and quantify the performance of such models. We’ll explore how to build such a table in this post.

The idea is to first discretise the population under consideration (say the testing or validation set) into groups based on the model’s output (probability/log odds/scores). Typically, this is done such that each group represents 10% of the total population (i.e. deciles). Then, summary statistics are generated for each group, the cumulative distributions of events and non-events are analysed, and the model’s performance is quantified.

We’ll use dplyr in this post. Doing it this way is nice since the code is easier to read, and it would need minimal changes if you later switch to sparklyr (say, in a big data environment where you need to run this directly on a Hadoop table).

## Packages

Let’s get package installation out of the way first.

# Pacman is a package management tool
install.packages("pacman")

library(pacman)
## Warning: package 'pacman' was built under R version 4.1.1

# p_load automatically installs packages if needed
p_load(dplyr, magrittr, knitr, scales)

## Sample dataset

Here’s some sample data to play around with. The data set is a small sample of the Lending Club dataset available on Kaggle.

sample <- read.csv("credit_sample.csv")

dim(sample)
## [1] 10000   153

class(sample)
## [1] "data.frame"


## Defining a target

First, we need to create a target (outcome) to model for. Since this is a credit risk use case, we are looking to create a target which identifies borrowers who defaulted on (or missed) their payments consecutively.
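
One way to do this with the Lending Club data is to flag loans whose status indicates a charge-off or a serious delinquency, split the data, and fit a quick model so that we have a prediction column to bin later. The status values and predictors below are purely illustrative and assume the standard Lending Club column names:

# Flag accounts that were charged off or went 31+ days past due
# (assumes the standard Lending Club loan_status field)
bad_status <- c("Charged Off", "Default", "Late (31-120 days)")

sample$bad_flag <- ifelse(sample$loan_status %in% bad_status, 1, 0)

# Simple 70/30 train/test split
set.seed(1)
idx <- sample.int(nrow(sample), floor(0.7 * nrow(sample)))

train <- sample[idx, ]
test <- sample[-idx, ]

# A basic logistic regression just to get a model output (log odds) to bin
# (predictor choice here is illustrative only)
mdl <- glm(bad_flag ~ loan_amnt + annual_inc + dti,
           family = "binomial",
           data = train)

# Predicted log odds on the test set
test$pred <- predict(mdl, newdata = test)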



## Creating the Gains Table

The table has a few important components:

• Bins in decreasing/increasing order of model output (probability/log odds/scores)
• Population percentages contained in each bin
• Observed event rates in each bin
• Cumulative distributions of events and non-events

# Discretise predictions based on quantiles
q <- quantile(test$pred, probs = seq(0, 1, length.out = 11))

# Add bins to test dataset
test$bins <- cut(test$pred, breaks = q, include.lowest = T, right = T, ordered_result = T)

levels(test$bins)
##  [1] "[-5.19,-3.4]"  "(-3.4,-2.96]"  "(-2.96,-2.68]" "(-2.68,-2.45]"
##  [5] "(-2.45,-2.28]" "(-2.28,-2.1]"  "(-2.1,-1.89]"  "(-1.89,-1.66]"
##  [9] "(-1.66,-1.28]" "(-1.28,0.458]"


Using the bins we created above, we can now start putting the table together:

# Start with the test dataset and summarise
gains_table <- test %>%
  group_by(bins) %>%
  summarise(total = n(),
            events = sum(bad_flag == 1),
            non_events = sum(bad_flag == 0))

At this point the table should look something like this:

kable(gains_table)

| bins          | total | events | non_events |
|:--------------|------:|-------:|-----------:|
| [-5.19,-3.4]  |   300 |      6 |        294 |
| (-3.4,-2.96]  |   300 |     13 |        287 |
| (-2.96,-2.68] |   300 |     10 |        290 |
| (-2.68,-2.45] |   300 |     19 |        281 |
| (-2.45,-2.28] |   300 |     26 |        274 |
| (-2.28,-2.1]  |   300 |     35 |        265 |
| (-2.1,-1.89]  |   300 |     45 |        255 |
| (-1.89,-1.66] |   300 |     52 |        248 |
| (-1.66,-1.28] |   300 |     67 |        233 |
| (-1.28,0.458] |   300 |     80 |        220 |

Next, we’ll add the event rate columns. Let’s also make the table presentable - I’ll use the percent() function in the scales package to show numbers as percentages.
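
As an aside, the second argument to percent() controls the rounding accuracy - an accuracy of 0.1 rounds to one decimal place. A quick illustration:

percent(c(0.0213, 0.4352), accuracy = 0.1)
## [1] "2.1%"  "43.5%"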

gains_table %<>%
  mutate(event_rate = percent(events / total, 0.1, 100))

kable(gains_table)

| bins          | total | events | non_events | event_rate |
|:--------------|------:|-------:|-----------:|-----------:|
| [-5.19,-3.4]  |   300 |      6 |        294 |       2.0% |
| (-3.4,-2.96]  |   300 |     13 |        287 |       4.3% |
| (-2.96,-2.68] |   300 |     10 |        290 |       3.3% |
| (-2.68,-2.45] |   300 |     19 |        281 |       6.3% |
| (-2.45,-2.28] |   300 |     26 |        274 |       8.7% |
| (-2.28,-2.1]  |   300 |     35 |        265 |      11.7% |
| (-2.1,-1.89]  |   300 |     45 |        255 |      15.0% |
| (-1.89,-1.66] |   300 |     52 |        248 |      17.3% |
| (-1.66,-1.28] |   300 |     67 |        233 |      22.3% |
| (-1.28,0.458] |   300 |     80 |        220 |      26.7% |

To this we’ll add some columns quantifying how events and non-events are distributed across the bins.

gains_table %<>%
  mutate(pop_pct = percent(total/sum(total), 0.1, 100),

         # Not formatting these as percentages just yet
         c.events_pct = cumsum(events) / sum(events),
         c.non_events_pct = cumsum(non_events) / sum(non_events))

kable(gains_table)

| bins          | total | events | non_events | event_rate | pop_pct | c.events_pct | c.non_events_pct |
|:--------------|------:|-------:|-----------:|-----------:|--------:|-------------:|-----------------:|
| [-5.19,-3.4]  |   300 |      6 |        294 |       2.0% |   10.0% |    0.0169972 |        0.1110691 |
| (-3.4,-2.96]  |   300 |     13 |        287 |       4.3% |   10.0% |    0.0538244 |        0.2194938 |
| (-2.96,-2.68] |   300 |     10 |        290 |       3.3% |   10.0% |    0.0821530 |        0.3290518 |
| (-2.68,-2.45] |   300 |     19 |        281 |       6.3% |   10.0% |    0.1359773 |        0.4352097 |
| (-2.45,-2.28] |   300 |     26 |        274 |       8.7% |   10.0% |    0.2096317 |        0.5387231 |
| (-2.28,-2.1]  |   300 |     35 |        265 |      11.7% |   10.0% |    0.3087819 |        0.6388364 |
| (-2.1,-1.89]  |   300 |     45 |        255 |      15.0% |   10.0% |    0.4362606 |        0.7351719 |
| (-1.89,-1.66] |   300 |     52 |        248 |      17.3% |   10.0% |    0.5835694 |        0.8288629 |
| (-1.66,-1.28] |   300 |     67 |        233 |      22.3% |   10.0% |    0.7733711 |        0.9168870 |
| (-1.28,0.458] |   300 |     80 |        220 |      26.7% |   10.0% |    1.0000000 |        1.0000000 |

Almost done - we just need a few more columns, namely:

• A column computing the difference between the two cumulative distribution columns we created previously. The maximum value of this column will become the primary performance metric known as the KS statistic.
• Two additional columns computing event capture rates and cumulative event rates.

gains_table %<>%
  mutate(ks = round(abs(c.events_pct - c.non_events_pct), 2),
         cap_rate = percent(cumsum(events)/sum(events), 1, 100),
         c_event_rate = percent(cumsum(events)/cumsum(total), 0.1, 100),

         # Format pending columns
         c.events_pct = percent(c.events_pct, 0.1, 100),
         c.non_events_pct = percent(c.non_events_pct, 0.1, 100))

kable(gains_table)

| bins          | total | events | non_events | event_rate | pop_pct | c.events_pct | c.non_events_pct |   ks | cap_rate | c_event_rate |
|:--------------|------:|-------:|-----------:|-----------:|--------:|-------------:|-----------------:|-----:|---------:|-------------:|
| [-5.19,-3.4]  |   300 |      6 |        294 |       2.0% |   10.0% |         1.7% |            11.1% | 0.09 |       2% |         2.0% |
| (-3.4,-2.96]  |   300 |     13 |        287 |       4.3% |   10.0% |         5.4% |            21.9% | 0.17 |       5% |         3.2% |
| (-2.96,-2.68] |   300 |     10 |        290 |       3.3% |   10.0% |         8.2% |            32.9% | 0.25 |       8% |         3.2% |
| (-2.68,-2.45] |   300 |     19 |        281 |       6.3% |   10.0% |        13.6% |            43.5% | 0.30 |      14% |         4.0% |
| (-2.45,-2.28] |   300 |     26 |        274 |       8.7% |   10.0% |        21.0% |            53.9% | 0.33 |      21% |         4.9% |
| (-2.28,-2.1]  |   300 |     35 |        265 |      11.7% |   10.0% |        30.9% |            63.9% | 0.33 |      31% |         6.1% |
| (-2.1,-1.89]  |   300 |     45 |        255 |      15.0% |   10.0% |        43.6% |            73.5% | 0.30 |      44% |         7.3% |
| (-1.89,-1.66] |   300 |     52 |        248 |      17.3% |   10.0% |        58.4% |            82.9% | 0.25 |      58% |         8.6% |
| (-1.66,-1.28] |   300 |     67 |        233 |      22.3% |   10.0% |        77.3% |            91.7% | 0.14 |      77% |        10.1% |
| (-1.28,0.458] |   300 |     80 |        220 |      26.7% |   10.0% |       100.0% |           100.0% | 0.00 |     100% |        11.8% |
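
The KS statistic itself is simply the maximum value of the ks column:

max(gains_table$ks)
## [1] 0.33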

## Creating a function

Finally, we can encapsulate all of the above code in a single function. Note that we actually do not need the full test/train dataset, just the actual classes and predicted outcomes (log odds/probability/score).

gains_table <- function(act, pred, increasing = T, nBins = 10){

  q <- quantile(pred, probs = seq(0, 1, length.out = nBins + 1))
  bins <- cut(pred, breaks = q, include.lowest = T, right = T, ordered_result = T)

  df <- data.frame(act, pred, bins)

  df %>%
    group_by(bins) %>%
    summarise(total = n(),
              events = sum(act == 1),
              non_events = sum(act == 0)) %>%
    mutate(event_rate = percent(events / total, 0.1, 100)) %>%

    # This odd looking format is to ensure that the if-else
    # condition is part of the dplyr chain
    {if(increasing == TRUE){
      arrange(., bins)
    }else{
      arrange(., desc(bins))
    }} %>%

    mutate(pop_pct = percent(total/sum(total), 0.1, 100),
           c.events_pct = cumsum(events) / sum(events),
           c.non_events_pct = cumsum(non_events) / sum(non_events),
           ks = round(abs(c.events_pct - c.non_events_pct), 2),
           cap_rate = percent(cumsum(events)/sum(events), 1, 100),
           c_event_rate = percent(cumsum(events)/cumsum(total), 0.1, 100),
           c.events_pct = percent(c.events_pct, 0.1, 100),
           c.non_events_pct = percent(c.non_events_pct, 0.1, 100))
}


It is worth noting here that since the capture rate is computed from top to bottom, it is important that the table is arranged appropriately. That is, when modelling for bads, the table should be arranged in descending order of the model output (i.e. higher event rates at the top) and vice versa.

Also, if you are planning on using this with sparklyr, consider looking into the ft_quantile_discretizer() function. It would replace cut() here.
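
For instance, the binning step would look roughly like this on a Spark DataFrame (a sketch, assuming a tbl_spark called test_tbl that already holds the prediction column):

test_tbl <- test_tbl %>%
  ft_quantile_discretizer(input_col = "pred",
                          output_col = "bins",
                          num_buckets = 10)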

# Test the function
tab <- gains_table(test$bad_flag, test$pred, F, 10)
kable(tab)

| bins          | total | events | non_events | event_rate | pop_pct | c.events_pct | c.non_events_pct |   ks | cap_rate | c_event_rate |
|:--------------|------:|-------:|-----------:|-----------:|--------:|-------------:|-----------------:|-----:|---------:|-------------:|
| (-1.28,0.458] |   300 |     80 |        220 |      26.7% |   10.0% |        22.7% |             8.3% | 0.14 |      23% |        26.7% |
| (-1.66,-1.28] |   300 |     67 |        233 |      22.3% |   10.0% |        41.6% |            17.1% | 0.25 |      42% |        24.5% |
| (-1.89,-1.66] |   300 |     52 |        248 |      17.3% |   10.0% |        56.4% |            26.5% | 0.30 |      56% |        22.1% |
| (-2.1,-1.89]  |   300 |     45 |        255 |      15.0% |   10.0% |        69.1% |            36.1% | 0.33 |      69% |        20.3% |
| (-2.28,-2.1]  |   300 |     35 |        265 |      11.7% |   10.0% |        79.0% |            46.1% | 0.33 |      79% |        18.6% |
| (-2.45,-2.28] |   300 |     26 |        274 |       8.7% |   10.0% |        86.4% |            56.5% | 0.30 |      86% |        16.9% |
| (-2.68,-2.45] |   300 |     19 |        281 |       6.3% |   10.0% |        91.8% |            67.1% | 0.25 |      92% |        15.4% |
| (-2.96,-2.68] |   300 |     10 |        290 |       3.3% |   10.0% |        94.6% |            78.1% | 0.17 |      95% |        13.9% |
| (-3.4,-2.96]  |   300 |     13 |        287 |       4.3% |   10.0% |        98.3% |            88.9% | 0.09 |      98% |        12.9% |
| [-5.19,-3.4]  |   300 |      6 |        294 |       2.0% |   10.0% |       100.0% |           100.0% | 0.00 |     100% |        11.8% |

## Interpretation

Some notes on how to interpret such a table:

• Since scoring models aim to risk-rank borrowers, the first thing to look for is whether the event rates increase (or decrease) consistently across the bins (a quick check is sketched after this list). If they don’t, one cannot confidently conclude that borrower A is better (or worse) than borrower B when using the actual model.
• If bin sizes are not consistent (in this case ~10% each), it would imply that the model is assigning the same output/score to a lot of borrowers (clumping). This could pose issues later on (say, when deciding cutoffs). The ideal solution is to add variables that can help differentiate between good and bad borrowers.
• While the ideal cutoff would be the bin where the KS statistic is at its maximum, additional aspects like capture rates and approval rates should be taken into account.
• Typically, analysts would look for a model which achieves the maximum value of the KS statistic within the first two or three deciles. That way, when creating underwriting policies, you would only need to reject 20%-30% of the applicant pool.
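
As a quick check of the first point on the table we generated above (not part of the gains table itself, just a sanity check on the raw counts):

# Since the table was built with increasing = F, event rates should
# fall steadily as we move down the bins
rates <- with(tab, events / total)
all(diff(rates) <= 0)
## [1] FALSE

The FALSE here picks up the small reversal between the two lowest-risk bins (3.3% followed by 4.3%) - exactly the kind of rank-ordering break worth investigating before putting a model to use.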