To lend funds or not to lend is the question?

[This article was first published on Stories Data Speak, and kindly contributed to R-bloggers].

The following analysis is based on a publicly available dataset hosted on Kaggle. The full code is available on my GitHub.

EXPLORATORY DATA ANALYSIS

Initial plots

Fig-1: Average annual income of applicants from WV and NM

Fig-2: Top 3 states with highest loan defaults

DATA SAMPLING

MODEL BUILDING

The model summary statistics follow below.

Imbalanced data classification

              precision    recall  f1-score   support

           0       1.00      0.74      0.85        68
           1       0.95      1.00      0.98       358

    accuracy                           0.96       426
   macro avg       0.98      0.87      0.91       426
weighted avg       0.96      0.96      0.96       426
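
To make the figures in this report easier to read, here is a minimal sketch of how per-class precision, recall and F1 follow from a confusion matrix. The counts used (50, 18, 0, 358) are an assumption: they are inferred from the rounded values in the report, not taken from the original code. The helper names `class_metrics` and `report` are hypothetical.

```python
def class_metrics(tp, fp, fn):
    """Return (precision, recall, f1) for one class from its counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def report(confusion):
    """Per-class metrics, accuracy and macro averages from a square
    confusion matrix (rows = true class, columns = predicted class)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    rows = {}
    for c in range(n_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                                # missed members of class c
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp   # wrongly labelled as c
        rows[c] = class_metrics(tp, fp, fn)
    accuracy = sum(confusion[c][c] for c in range(n_classes)) / total
    macro = tuple(sum(rows[c][i] for c in rows) / n_classes for i in range(3))
    return rows, accuracy, macro


# ASSUMPTION: counts reconstructed from the rounded report, not from the source code.
confusion = [[50, 18],    # true 0: 50 predicted as 0, 18 predicted as 1
             [0, 358]]    # true 1: 0 predicted as 0, 358 predicted as 1

rows, accuracy, macro = report(confusion)
for label, (p, r, f) in rows.items():
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
print(f"accuracy={accuracy:.2f} macro recall={macro[1]:.2f}")
```

Under these assumed counts, the 0.96 accuracy sits alongside 18 of the 68 minority-class rows being misclassified, which is why the per-class recall is worth reading next to the headline accuracy.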

Resampled data shape:  (2856, 6975)
Balanced target
0    1428
1    1428
Name: target, dtype: int64

Balanced data using SMOTE

              precision    recall  f1-score   support

           0       0.98      0.90      0.94        68
           1       0.98      1.00      0.99       358

    accuracy                           0.98       426
   macro avg       0.98      0.95      0.96       426
weighted avg       0.98      0.98      0.98       426
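
The post does not show how the SMOTE step was run, so the following is only a simplified sketch of the idea behind it: each synthetic minority row is placed at a random point on the line segment between an existing minority row and one of its nearest minority neighbours. It is not the imbalanced-learn implementation, and the function name `smote_sketch`, the toy points and the parameter defaults are all assumptions made for illustration.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority points
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 4 minority rows against 10 majority rows.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority_count = 10

new_points = smote_sketch(minority, majority_count - len(minority), k=3)
balanced_minority = minority + new_points
print(len(balanced_minority), majority_count)  # both classes now have 10 rows
```

Oversampling of this kind is normally applied to the training split only, so that the test rows (426 in the reports above) remain real observations.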

End notes

To develop a strategy for risk-averse customers, the following points may be considered:

To leave a comment for the author, please follow the link and comment on their blog: Stories Data Speak.
