
Evaluating Binary Classification Models Using Gains Tables

[This article was first published on R'tichoke, and kindly contributed to R-bloggers.]
< section id="introduction" class="level2">

Introduction

In credit risk modeling and other binary classification applications, analysts use gains tables (also known as KS tables) as a fundamental tool for measuring and quantifying model performance. This tutorial walks through the construction and interpretation of gains tables using R.

Theoretical Foundation: Understanding Gains Tables

A gains table systematically discretizes the population (typically a validation or test dataset) into groups based on the model’s output predictions (probability scores, log odds, or risk scores). Each group conventionally represents 10% of the total population (deciles), though alternative binning strategies may be employed. The output presents summary statistics for each group and analyzes the cumulative distributions of events (defaults) and non-events to quantify the model’s discriminatory performance.
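For intuition, here is a minimal sketch of the binning idea (illustrative only; the tutorial below builds its bins with quantile() and cut()):

# Assign 1,000 hypothetical scores to 10 equal-sized bins with dplyr::ntile()
library(dplyr)
scores <- runif(1000)      # hypothetical model scores
bins <- ntile(scores, 10)  # decile membership: 1 = lowest scores, 10 = highest
table(bins)                # roughly 100 observations per bin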

< section id="package-dependencies" class="level2">

Package Dependencies

# Load required packages
library(dplyr)
library(magrittr)
library(knitr)
library(scales)
< section id="dataset-preparation" class="level2">

Dataset Preparation

This tutorial utilizes a sample from the Lending Club dataset, which contains comprehensive loan information and associated outcomes suitable for credit risk modeling applications.

# Load the sample data
sample <- read.csv("https://bit.ly/42ypcnJ")

# Check dimensions
dim(sample)
[1] 10000   153
< section id="target-definition" class="level2">

Target Definition

The first step is to create a binary target variable for modeling. In this credit risk application, we flag borrowers who defaulted on their loan obligations.

# Check unique loan statuses
unique(sample$loan_status)
[1] "Fully Paid"                                         
[2] "Current"                                            
[3] "Charged Off"                                        
[4] "Late (31-120 days)"                                 
[5] "Late (16-30 days)"                                  
[6] "In Grace Period"                                    
[7] "Does not meet the credit policy. Status:Fully Paid" 
[8] "Does not meet the credit policy. Status:Charged Off"
# Define "bad" loans as those that are charged off
codes <- c("Charged Off", "Does not meet the credit policy. Status:Charged Off")

# Create a binary flag for defaults
sample %<>% mutate(bad_flag = ifelse(loan_status %in% codes, 1, 0))

# Check overall event rates
sample %>% 
  summarise(events = sum(bad_flag == 1), 
            non_events = sum(bad_flag == 0)) %>% 
  mutate(event_rate = events / (events + non_events))
  events non_events event_rate
1   1162       8838     0.1162
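Since the flag is coded 0/1, the same event rate can be read off directly as the mean of the flag:

# Equivalent one-liner: the event rate is the mean of a 0/1 flag
mean(sample$bad_flag)
[1] 0.1162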
< section id="model-development" class="level2">

Model Development

Subsequently, we develop a logistic regression model to generate predictions that will serve as the foundation for gains table construction.

# Replace NA values with a default value
sample[is.na(sample)] <- -1

# Clean the data
sample %<>% 
  # Remove cases where home ownership and payment plan are not reported
  filter(!home_ownership %in% c("", "NONE"),
         pymnt_plan != "") %>% 
  # Convert categorical variables to factors
  mutate(home_ownership = factor(home_ownership), 
         pymnt_plan = factor(pymnt_plan))

# Train-test split (70-30); the split is random, so the exact numbers
# shown below will vary from run to run unless a seed is set beforehand
idx <- sample(1:nrow(sample), size = 0.7 * nrow(sample), replace = FALSE)
train <- sample[idx,]
test <- sample[-idx,]
# Build a logistic regression model
mdl <- glm(
  formula = bad_flag ~ 
    loan_amnt + term + mths_since_last_delinq + total_pymnt + 
    home_ownership + acc_now_delinq + 
    inq_last_6mths + delinq_amnt + 
    mths_since_last_record + mths_since_recent_revol_delinq + 
    mths_since_last_major_derog + mths_since_recent_inq + 
    mths_since_recent_bc + num_accts_ever_120_pd,
  family = "binomial", 
  data = train
)

# Generate predictions on the test set
test$pred <- predict(mdl, newdata = test)
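Note that predict() without a type argument returns scores on the link scale (log odds). Since quantile-based binning depends only on rank order, predicted probabilities would produce identical bins:

# Equivalent input for binning: predicted probabilities share the same
# rank order as the log odds generated above
test$prob <- predict(mdl, newdata = test, type = "response")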
< section id="gains-table-construction" class="level2">

Gains Table Construction

The following section demonstrates the step-by-step construction of a comprehensive gains table through systematic binning and statistical analysis.

< section id="population-discretization-into-bins" class="level3">

Population Discretization into Bins

# Create deciles based on model predictions
q <- quantile(test$pred, probs = seq(0, 1, length.out = 11))

# Add bins to test dataset
test$bins <- cut(test$pred, breaks = q, include.lowest = TRUE, 
                right = TRUE, ordered_result = TRUE)

# Check the bin levels (note they're in increasing order)
levels(test$bins)
 [1] "[-5.11,-3.34]" "(-3.34,-2.89]" "(-2.89,-2.64]" "(-2.64,-2.41]"
 [5] "(-2.41,-2.23]" "(-2.23,-2.03]" "(-2.03,-1.83]" "(-1.83,-1.6]" 
 [9] "(-1.6,-1.26]"  "(-1.26,2.51]" 
< section id="basic-statistical-measures-by-segment" class="level3">

Basic Statistical Measures by Segment

# Create initial gains table with counts
gains_table <- test %>% 
  group_by(bins) %>% 
  summarise(total = n(), 
            events = sum(bad_flag == 1), 
            non_events = sum(bad_flag == 0))

# Add event rate column
gains_table %<>%
  mutate(event_rate = percent(events / total, 0.1, 100))

# Display the table
kable(gains_table)
| bins          | total | events | non_events | event_rate |
|---------------|-------|--------|------------|------------|
| [-5.11,-3.34] | 300   | 3      | 297        | 1.0%       |
| (-3.34,-2.89] | 300   | 8      | 292        | 2.7%       |
| (-2.89,-2.64] | 300   | 10     | 290        | 3.3%       |
| (-2.64,-2.41] | 300   | 14     | 286        | 4.7%       |
| (-2.41,-2.23] | 300   | 26     | 274        | 8.7%       |
| (-2.23,-2.03] | 300   | 38     | 262        | 12.7%      |
| (-2.03,-1.83] | 300   | 46     | 254        | 15.3%      |
| (-1.83,-1.6]  | 300   | 49     | 251        | 16.3%      |
| (-1.6,-1.26]  | 300   | 58     | 242        | 19.3%      |
| (-1.26,2.51]  | 300   | 74     | 226        | 24.7%      |
< section id="cumulative-distribution" class="level3">

Cumulative Distribution

# Add population percentage and cumulative distributions
gains_table %<>%
  mutate(pop_pct = percent(total/sum(total), 0.1, 100), 
         
         # Calculate cumulative percentages
         c.events_pct = cumsum(events) / sum(events),
         c.non_events_pct = cumsum(non_events) / sum(non_events))

# Display the updated table
kable(gains_table)
| bins          | total | events | non_events | event_rate | pop_pct | c.events_pct | c.non_events_pct |
|---------------|-------|--------|------------|------------|---------|--------------|------------------|
| [-5.11,-3.34] | 300   | 3      | 297        | 1.0%       | 10.0%   | 0.0092025    | 0.1110696        |
| (-3.34,-2.89] | 300   | 8      | 292        | 2.7%       | 10.0%   | 0.0337423    | 0.2202693        |
| (-2.89,-2.64] | 300   | 10     | 290        | 3.3%       | 10.0%   | 0.0644172    | 0.3287210        |
| (-2.64,-2.41] | 300   | 14     | 286        | 4.7%       | 10.0%   | 0.1073620    | 0.4356769        |
| (-2.41,-2.23] | 300   | 26     | 274        | 8.7%       | 10.0%   | 0.1871166    | 0.5381451        |
| (-2.23,-2.03] | 300   | 38     | 262        | 12.7%      | 10.0%   | 0.3036810    | 0.6361257        |
| (-2.03,-1.83] | 300   | 46     | 254        | 15.3%      | 10.0%   | 0.4447853    | 0.7311144        |
| (-1.83,-1.6]  | 300   | 49     | 251        | 16.3%      | 10.0%   | 0.5950920    | 0.8249813        |
| (-1.6,-1.26]  | 300   | 58     | 242        | 19.3%      | 10.0%   | 0.7730061    | 0.9154824        |
| (-1.26,2.51]  | 300   | 74     | 226        | 24.7%      | 10.0%   | 1.0000000    | 1.0000000        |
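As an aside (a sketch that goes beyond the original construction), while the two cumulative columns are still numeric they also yield a trapezoidal approximation of the Gini coefficient:

# Approximate Gini from the cumulative distributions (trapezoidal rule);
# must run before these columns are formatted as percentage strings below
x <- c(0, gains_table$c.non_events_pct)
y <- c(0, gains_table$c.events_pct)
auc <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
gini <- abs(2 * auc - 1)  # abs() makes the result independent of sort order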
< section id="performance-metrics" class="level3">

Performance Metrics

# Add KS statistic, capture rate, and cumulative event rate
gains_table %<>%
  mutate(
    # KS statistic (difference between cumulative distributions)
    ks = round(abs(c.events_pct - c.non_events_pct), 2), 
    
    # Capture rate (percentage of total events captured)
    cap_rate = percent(cumsum(events)/sum(events), 1, 100), 
    
    # Cumulative event rate
    c_event_rate = percent(cumsum(events)/cumsum(total), 0.1, 100), 
    
    # Format percentage columns
    c.events_pct = percent(c.events_pct, 0.1, 100),
    c.non_events_pct = percent(c.non_events_pct, 0.1, 100))

# Display the final table
kable(gains_table)
| bins          | total | events | non_events | event_rate | pop_pct | c.events_pct | c.non_events_pct | ks   | cap_rate | c_event_rate |
|---------------|-------|--------|------------|------------|---------|--------------|------------------|------|----------|--------------|
| [-5.11,-3.34] | 300   | 3      | 297        | 1.0%       | 10.0%   | 0.9%         | 11.1%            | 0.10 | 1%       | 1.0%         |
| (-3.34,-2.89] | 300   | 8      | 292        | 2.7%       | 10.0%   | 3.4%         | 22.0%            | 0.19 | 3%       | 1.8%         |
| (-2.89,-2.64] | 300   | 10     | 290        | 3.3%       | 10.0%   | 6.4%         | 32.9%            | 0.26 | 6%       | 2.3%         |
| (-2.64,-2.41] | 300   | 14     | 286        | 4.7%       | 10.0%   | 10.7%        | 43.6%            | 0.33 | 11%      | 2.9%         |
| (-2.41,-2.23] | 300   | 26     | 274        | 8.7%       | 10.0%   | 18.7%        | 53.8%            | 0.35 | 19%      | 4.1%         |
| (-2.23,-2.03] | 300   | 38     | 262        | 12.7%      | 10.0%   | 30.4%        | 63.6%            | 0.33 | 30%      | 5.5%         |
| (-2.03,-1.83] | 300   | 46     | 254        | 15.3%      | 10.0%   | 44.5%        | 73.1%            | 0.29 | 44%      | 6.9%         |
| (-1.83,-1.6]  | 300   | 49     | 251        | 16.3%      | 10.0%   | 59.5%        | 82.5%            | 0.23 | 60%      | 8.1%         |
| (-1.6,-1.26]  | 300   | 58     | 242        | 19.3%      | 10.0%   | 77.3%        | 91.5%            | 0.14 | 77%      | 9.3%         |
| (-1.26,2.51]  | 300   | 74     | 226        | 24.7%      | 10.0%   | 100.0%       | 100.0%           | 0.00 | 100%     | 10.9%        |
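The model's headline KS statistic is simply the maximum of the ks column, which the table above shows peaking in the fifth bin:

# The KS statistic reported for the model is the maximum across bins
max(gains_table$ks)
[1] 0.35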
< section id="reusable-function" class="level2">

Reusable Function

The following implementation wraps the gains table construction in a single reusable function suitable for evaluating any binary classification model:

gains_table <- function(act, pred, increasing = TRUE, nBins = 10) {
  
  # Create bins based on predictions
  q <- quantile(pred, probs = seq(0, 1, length.out = nBins + 1))
  bins <- cut(pred, breaks = q, include.lowest = TRUE, right = TRUE, ordered_result = TRUE)
  
  df <- data.frame(act, pred, bins)
  
  df %>% 
    # Group by bins and calculate statistics
    group_by(bins) %>% 
    summarise(total = n(), 
              events = sum(act == 1), 
              non_events = sum(act == 0)) %>% 
    mutate(event_rate = percent(events / total, 0.1, 100)) %>% 
    
    # Sort the table based on the 'increasing' parameter
    {if (increasing) {
      arrange(., bins)
    } else {
      arrange(., desc(bins))
    }} %>% 
    
    # Add all performance metrics
    mutate(pop_pct = percent(total/sum(total), 0.1, 100), 
           c.events_pct = cumsum(events) / sum(events),
           c.non_events_pct = cumsum(non_events) / sum(non_events), 
           ks = round(abs(c.events_pct - c.non_events_pct), 2), 
           cap_rate = percent(cumsum(events)/sum(events), 1, 100), 
           c_event_rate = percent(cumsum(events)/cumsum(total), 0.1, 100), 
           c.events_pct = percent(c.events_pct, 0.1, 100),
           c.non_events_pct = percent(c.non_events_pct, 0.1, 100))
}
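One practical caveat: when many observations share the same score, quantile() can return duplicated break points and cut() will stop with an error. A minimal guard (at the cost of slightly uneven bins) is to de-duplicate the breaks, so inside the function the first two lines could become:

# Hedged variant of the binning step for heavily tied scores: dropping
# duplicated breaks avoids the cut() error, though bins may then hold
# unequal shares of the population
q <- unique(quantile(pred, probs = seq(0, 1, length.out = nBins + 1)))
bins <- cut(pred, breaks = q, include.lowest = TRUE, right = TRUE, ordered_result = TRUE)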
< section id="function-implementation" class="level3">

Function Implementation

# Generate a gains table with bins in descending order
tab <- gains_table(test$bad_flag, test$pred, FALSE, 10)
kable(tab)
| bins          | total | events | non_events | event_rate | pop_pct | c.events_pct | c.non_events_pct | ks   | cap_rate | c_event_rate |
|---------------|-------|--------|------------|------------|---------|--------------|------------------|------|----------|--------------|
| (-1.26,2.51]  | 300   | 74     | 226        | 24.7%      | 10.0%   | 22.7%        | 8.5%             | 0.14 | 23%      | 24.7%        |
| (-1.6,-1.26]  | 300   | 58     | 242        | 19.3%      | 10.0%   | 40.5%        | 17.5%            | 0.23 | 40%      | 22.0%        |
| (-1.83,-1.6]  | 300   | 49     | 251        | 16.3%      | 10.0%   | 55.5%        | 26.9%            | 0.29 | 56%      | 20.1%        |
| (-2.03,-1.83] | 300   | 46     | 254        | 15.3%      | 10.0%   | 69.6%        | 36.4%            | 0.33 | 70%      | 18.9%        |
| (-2.23,-2.03] | 300   | 38     | 262        | 12.7%      | 10.0%   | 81.3%        | 46.2%            | 0.35 | 81%      | 17.7%        |
| (-2.41,-2.23] | 300   | 26     | 274        | 8.7%       | 10.0%   | 89.3%        | 56.4%            | 0.33 | 89%      | 16.2%        |
| (-2.64,-2.41] | 300   | 14     | 286        | 4.7%       | 10.0%   | 93.6%        | 67.1%            | 0.26 | 94%      | 14.5%        |
| (-2.89,-2.64] | 300   | 10     | 290        | 3.3%       | 10.0%   | 96.6%        | 78.0%            | 0.19 | 97%      | 13.1%        |
| (-3.34,-2.89] | 300   | 8      | 292        | 2.7%       | 10.0%   | 99.1%        | 88.9%            | 0.10 | 99%      | 12.0%        |
| [-5.11,-3.34] | 300   | 3      | 297        | 1.0%       | 10.0%   | 100.0%       | 100.0%           | 0.00 | 100%     | 10.9%        |
< section id="interpretation" class="level2">

Interpretation

A properly constructed gains table provides multiple critical insights into model performance characteristics:

  1. Monotonicity Assessment: Event rates should increase (or decrease) consistently across bins, confirming that the model rank-orders risk effectively (see the sketch after this list).

  2. Population Distribution: Consistent bin sizes (ideally ~10% each) indicate appropriate score distribution. Inconsistent sizes suggest score clustering, which may complicate threshold determination.

  3. Kolmogorov-Smirnov (KS) Statistic: The maximum KS value represents the model’s discriminatory power. Higher values (approaching 1.0) indicate superior separation between positive and negative cases.

  4. Capture Rate: Demonstrates the percentage of total events captured at each threshold, essential for operational decision-making.

  5. Cumulative Event Rate: Indicates the event rate among all cases up to each bin, facilitating approval threshold establishment.
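The first point can be verified programmatically. A minimal sketch, using the function defined above with increasing = TRUE:

# Monotonicity check: do raw event rates rise steadily across bins?
tab_inc <- gains_table(test$bad_flag, test$pred, increasing = TRUE, nBins = 10)
rates <- tab_inc$events / tab_inc$total
all(diff(rates) >= 0)  # TRUE if the model rank-orders risk cleanly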

< section id="applications-in-credit-risk-analytics" class="level2">

Applications in Credit Risk Analytics

Gains tables serve multiple critical functions in credit risk management environments:

  1. Threshold Optimization: Identification of appropriate score thresholds for automated approval or rejection decisions.

  2. Tiered Strategy Development: Construction of multi-tier decision strategies (approve, manual review, decline) based on quantified risk levels.

  3. Model Performance Monitoring: Longitudinal tracking of model performance through comparison of actual versus expected distributions.

  4. Comparative Model Evaluation: Systematic comparison of alternative models through KS statistics and capture rate analysis.
