An Utility Function For Monotonic Binning

December 2, 2018
By

(This article was first published on S+/R – Yet Another Blog in Statistical Computing, and kindly contributed to R-bloggers)

In all monotonic algorithms that I posted before, I heavily relied on the smbinning::smbinning.custom() function contributed by Herman Jopia as the utility function generating the binning output and therefore feel deeply indebted to his excellent work. However, the availability of smbinning::smbinning.custom() function shouldn’t become my excuse for being lazy. Over the weekend, I drafted a function, e.g. manual_bin(), serving the similar purpose.

Although it is not as flexible and elegant as Herman’s work, the manual_bin() function does have certain advantages of handling miss values and therefore improves the calculation of WoE and Information Value for missing values.
1. For the missing-value category, if there are both good and bad records, then this category will be considered a standalone bin.
2. For the missing-value category, if there are only either good or bad records but not both, then this category will be merged into the bin with lowest or highest bad rate. Therefore, WoE and IV for the missing value won’t be shown as “NaN” again.

In addition, the output of manual_bin() function also includes a set of rules that might be potentially applied to R dataframe in order to generate WoE transformations, on which I will show in the future.

To leave a comment for the author, please follow the link and comment on their blog: S+/R – Yet Another Blog in Statistical Computing.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)