Blog Archives

A Summary of My Home-Brew Binning Algorithms for Scorecard Development

March 10, 2019
By

Thus far, I have published four different monotonic binning algorithms for the scorecard development and think that it might be a right timing to do a quick summary. R functions for these binning algorithms are also available on https://github.com/statcompute/MonotonicBinning. The first one was posted back in 2017 (https://statcompute.wordpress.com/2017/01/22/monotonic-binning-with-smbinning-package) based on my SAS macro (https://statcompute.wordpress.com/2012/06/10/a-sas-macro-implementing-monotonic-woe-transformation-in-scorecard-development) that

Read more »

Bayesian Optimization for Hyper-Parameter

February 24, 2019
By

In past several weeks, I spent a tremendous amount of time on reading literature about automatic parameter tuning in the context of Machine Learning (ML), most of which can be classified into two major categories, e.g. search and optimization. Searching mechanisms, such as grid search, random search, and Sobol sequence, can be somewhat computationally expensive.

Read more »

Gradient-Free Optimization for GLMNET Parameters

February 23, 2019
By

In the post https://statcompute.wordpress.com/2017/09/03/variable-selection-with-elastic-net, it was shown how to optimize hyper-parameters, namely alpha and gamma, of the glmnet by using the built-in cv.glmnet() function. However, following a similar logic of hyper-parameter optimization shown in the post https://statcompute.wordpress.com/2019/02/10/direct-optimization-of-hyper-parameter, we can directly optimize alpha and gamma parameters of the glmnet by using gradient-free optimizations, such as Nelder–Mead

Read more »

Direct Optimization of Hyper-Parameter

February 10, 2019
By

In the previous post (https://statcompute.wordpress.com/2019/02/03/sobol-sequence-vs-uniform-random-in-hyper-parameter-optimization), it is shown how to identify the optimal hyper-parameter in a General Regression Neural Network by using the Sobol sequence and the uniform random generator respectively through the N-fold cross validation. While the Sobol sequence yields a slightly better performance, outcomes from both approaches are very similar, as shown below

Read more »

Sobol Sequence vs. Uniform Random in Hyper-Parameter Optimization

February 3, 2019
By
Sobol Sequence vs. Uniform Random in Hyper-Parameter Optimization

Tuning hyper-parameters might be the most tedious yet crucial in various machine learning algorithms, such as neural networks, svm, or boosting. The configuration of hyper-parameters not only impacts the computational efficiency of a learning algorithm but also determines its prediction accuracy. Thus far, manual tuning and grid searching are still the most prevailing strategies. In

Read more »

Co-integration and Mean Reverting Portfolio

January 5, 2019
By

In the previous post https://statcompute.wordpress.com/2018/07/29/co-integration-and-pairs-trading, it was shown how to identify two co-integrated stocks in the pair trade. In the example below, I will show how to form a mean reverting portfolio with three or more stocks, e.g. stocks with co-integration, and also how to find the linear combination that is stationary for these stocks.

Read more »

Statistical Assessments of AUC

December 25, 2018
By

In the scorecard development, the area under ROC curve, also known as AUC, has been widely used to measure the performance of a risk scorecard. Given everything else equal, the scorecard with a higher AUC is considered more predictive than the one with a lower AUC. However, little attention has been paid to the statistical

Read more »

Phillips-Ouliaris Test For Cointegration

December 16, 2018
By

In a project of developing PPNR balance projection models, I tried to use the Phillips-Ouliaris (PO) test to investigate the cointegration between the historical balance and a set of macro-economic variables and noticed that implementation routines of PO test in various R packages, e.g. urca and tseries, would give different results. After reading through the

Read more »

An Utility Function For Monotonic Binning

December 2, 2018
By

In all monotonic algorithms that I posted before, I heavily relied on the smbinning::smbinning.custom() function contributed by Herman Jopia as the utility function generating the binning output and therefore feel deeply indebted to his excellent work. However, the availability of smbinning::smbinning.custom() function shouldn’t become my excuse for being lazy. Over the weekend, I drafted a

Read more »

Improving Binning by Bootstrap Bumping

November 25, 2018
By

In the post (https://statcompute.wordpress.com/2018/11/23/more-robust-monotonic-binning-based-on-isotonic-regression), a more robust version of monotonic binning based on the isotonic regression was introduced. Nonetheless, due to the loss of granularity, the predictability has been somewhat compromised, which is a typical dilemma in the data science. On one hand, we don’t want to use a learning algorithm that is too greedy

Read more »

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)