Articles by statcompute

YAP: Yet Another Probabilistic Neural Network

December 25, 2019 | statcompute

By the end of 2019, I finally managed to wrap up my third R package YAP (https://github.com/statcompute/yap) that implements the Probabilistic Neural Network (Specht, 1990) for N-category pattern recognition with N ≥ 3. Similar to GRNN, PNN shares the same benefits of instantaneous training, simple structure, and global convergence. Below ... [Read more...]
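
For readers curious about the mechanics, here is a minimal from-scratch sketch of Specht's PNN idea, not the YAP API: each class gets a Gaussian-kernel density estimate, and a new point is assigned to the class with the highest average kernel value. The sigma value and the iris example are illustrative.

```r
# Minimal PNN sketch (not the YAP API): per-class Gaussian-kernel
# densities; classify to the class with the highest average kernel value.
pnn_predict <- function(X_train, y_train, X_new, sigma = 0.5) {
  X_train <- as.matrix(X_train); X_new <- as.matrix(X_new)
  classes <- sort(unique(y_train))
  scores <- sapply(classes, function(k) {
    Xk <- X_train[y_train == k, , drop = FALSE]
    apply(X_new, 1, function(x) {
      d2 <- rowSums(sweep(Xk, 2, x)^2)   # squared distances to class k
      mean(exp(-d2 / (2 * sigma^2)))     # average kernel activation
    })
  })
  classes[max.col(scores)]
}

# e.g. the 3-category iris data (N = 3)
pred <- pnn_predict(iris[, 1:4], iris$Species, iris[, 1:4], sigma = 0.5)
mean(pred == iris$Species)
```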

GRNN with Small Samples

November 29, 2019 | statcompute

After a bank launches a new product or acquires a new portfolio, the risk modeling team is often faced with the challenge of estimating the corresponding performance, e.g. risk or loss, with a limited number of data points conditional on business drivers or macro-economic indicators. For ... [Read more...]

GRNN vs. GAM

October 26, 2019 | statcompute

In practice, GRNN is very similar to GAM (Generalized Additive Models) in the sense that they both share the flexibility of approximating non-linear functions. In the example below, both GRNN and GAM were applied to the Kyphosis data that has been widely used in GAM examples and revealed very ...
[Read more...]
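
As a sketch of the GAM half of such a comparison, the kyphosis data shipped with the rpart package can be fit with mgcv; the smooth terms and the basis dimension below are illustrative choices, not necessarily those in the post.

```r
# GAM on the kyphosis data (from the rpart package); k = 3 for Number
# because it only takes a handful of distinct values.
library(rpart)   # provides the kyphosis data frame
library(mgcv)    # provides gam()

gam1 <- gam(Kyphosis ~ s(Age) + s(Number, k = 3) + s(Start),
            family = binomial, data = kyphosis)
summary(gam1)
```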

Permutation Feature Importance (PFI) of GRNN

October 19, 2019 | statcompute

In the post https://statcompute.wordpress.com/2019/10/13/assess-variable-importance-in-grnn, it was shown how to assess the variable importance of a GRNN by the decrease in GoF statistics, e.g. AUC, after averaging or dropping the variable of interest. The permutation feature importance evaluates the variable importance in a similar manner by ...
[Read more...]
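
A generic sketch of the permutation scheme, with pred_fun standing in as a hypothetical scoring function for any fitted model (the actual implementation in the post may differ): shuffle one column at a time and record the drop in AUC.

```r
# Generic permutation feature importance: permute each predictor in turn
# and measure the decrease in AUC; y is assumed to be a 0/1 vector and
# pred_fun a placeholder that scores a data frame.
pfi <- function(df, y, pred_fun, seed = 1) {
  auc <- function(p, y) {            # Mann-Whitney formulation of AUC
    r <- rank(p); n1 <- sum(y == 1); n0 <- sum(y == 0)
    (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
  }
  base_auc <- auc(pred_fun(df), y)
  set.seed(seed)
  sapply(names(df), function(v) {
    df2 <- df
    df2[[v]] <- sample(df2[[v]])       # permute one predictor
    base_auc - auc(pred_fun(df2), y)   # importance = AUC decrease
  })
}
```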

Partial Dependence Plot (PDP) of GRNN

October 19, 2019 | statcompute

The function grnn.margin() (https://github.com/statcompute/yager/blob/master/code/grnn.margin.R) was my first attempt to explore the relationship between each predictor and the response in a General Regression Neural Network, which is usually considered a black-box model. The idea is described below: first, train a ...
[Read more...]
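
A generic partial-dependence sketch in the spirit of the post, again with pred_fun as a hypothetical scorer for any fitted model: vary one predictor over a grid while averaging predictions over the observed values of all the others.

```r
# Generic PDP: pin the variable of interest at each grid value and
# average the model predictions over the empirical distribution of the
# remaining predictors; pred_fun is a placeholder scoring function.
pdp <- function(df, var, pred_fun, n_grid = 20) {
  grid <- quantile(df[[var]], probs = seq(0.05, 0.95, length.out = n_grid))
  pd <- sapply(grid, function(g) {
    df2 <- df
    df2[[var]] <- g        # pin the variable of interest
    mean(pred_fun(df2))    # average over all other predictors
  })
  data.frame(x = grid, pdp = pd)
}
```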

Merge MLP And CNN in Keras

October 14, 2019 | statcompute

In the post (https://statcompute.wordpress.com/2017/01/08/an-example-of-merge-layer-in-keras), it was shown how to build a merge-layer DNN by using the Keras Sequential model. In the example below, I tried to build a merge-layer DNN from scratch with the Keras functional API in both R and Python. In particular, the merge-layer DNN is ...
[Read more...]
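
A hedged sketch of a two-branch merge-layer network with the Keras functional API in R; the layer sizes and input shapes here are made up for illustration, not taken from the post.

```r
# Two input branches joined by a concatenate (merge) layer, then a
# sigmoid output head; built with the Keras functional API in R.
library(keras)

in1 <- layer_input(shape = 10)                 # branch 1 inputs
in2 <- layer_input(shape = 10)                 # branch 2 inputs
b1  <- in1 %>% layer_dense(units = 16, activation = "relu")
b2  <- in2 %>% layer_dense(units = 16, activation = "relu")
out <- layer_concatenate(list(b1, b2)) %>%     # the merge layer
       layer_dense(units = 1, activation = "sigmoid")

mdl <- keras_model(inputs = list(in1, in2), outputs = out)
mdl %>% compile(loss = "binary_crossentropy", optimizer = "adam")
```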

Assess Variable Importance In GRNN

October 13, 2019 | statcompute

Technically speaking, there is no need to evaluate the variable importance and to perform the variable selection in the training of a GRNN. It is also a consensus that a neural network is a black-box model and that it is not an easy task to assess the variable importance in ...
[Read more...]

Hyper-Parameter Optimization of General Regression Neural Networks

October 12, 2019 | statcompute

A major advantage of General Regression Neural Networks (GRNN) over other types of neural networks is that there is only a single hyper-parameter, namely the smoothing parameter sigma. In the previous post (https://statcompute.wordpress.com/2019/07/06/latin-hypercube-sampling-in-hyper-parameter-optimization), I’ve shown how to use the random search strategy to find a close-to-optimal value ...
[Read more...]
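
A minimal random-search sketch for the single smoothing parameter; grnn_fit_predict() is a hypothetical stand-in for whatever GRNN scorer is used, x_train / y_train / x_valid / y_valid are assumed to exist, and the search range for sigma is illustrative.

```r
# Random search over sigma: draw candidates, score each on a validation
# set, keep the best; grnn_fit_predict() is a placeholder function.
set.seed(2019)
sigmas <- runif(100, min = 0.1, max = 2)   # candidate sigma values
cv_err <- sapply(sigmas, function(s) {
  p <- grnn_fit_predict(x_train, y_train, x_valid, sigma = s)
  mean((y_valid - p)^2)                    # validation MSE
})
best_sigma <- sigmas[which.min(cv_err)]
```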

Develop Performance Benchmark with GRNN

September 14, 2019 | statcompute

It has been mentioned in https://github.com/statcompute/GRnnet that GRNN is an ideal approach for developing performance benchmarks for a variety of risk models. People might wonder what the purpose of performance benchmarks is and why we would even need one at all. Sometimes, a model developer ... [Read more...]
[Read more...]

Dummy Is As Dummy Does

August 24, 2019 | statcompute

In the 1975 edition of “Applied multiple regression/correlation analysis for the behavioral sciences” by Jacob Cohen, an interesting approach to handling missing values in numeric variables was proposed with the purpose of improving the traditional single-value imputation, as described below: – First of all, impute missing values by the value of ... [Read more...]
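
A sketch of the dummy-variable adjustment described above; the overall mean is used here as a placeholder for the imputed value, since the teaser truncates before naming it.

```r
# Fill the missing values of a numeric column (mean used as a
# placeholder imputation) and keep a 0/1 dummy flagging imputed rows.
impute_with_dummy <- function(df, var) {
  miss <- is.na(df[[var]])
  df[[paste0(var, "_miss")]] <- as.integer(miss)     # missingness dummy
  df[[var]][miss] <- mean(df[[var]], na.rm = TRUE)   # single-value impute
  df
}
```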

Improve GRNN by Weighting

July 21, 2019 | statcompute

In the post (https://statcompute.wordpress.com/2019/07/14/yet-another-r-package-for-general-regression-neural-network), several advantages of the General Regression Neural Network (GRNN) have been discussed. However, as pointed out by Specht, a major weakness of GRNN is the high computational cost required for a GRNN to generate predicted values based on a new input matrix due ... [Read more...]

Yet Another R Package for General Regression Neural Network

July 14, 2019 | statcompute

Compared with other types of neural networks, the General Regression Neural Network (Specht, 1991) is advantageous in several aspects. Being a universal approximator, GRNN has only one tuning parameter to control the overall generalization. The network structure of GRNN is surprisingly simple, with only one hidden layer and the number of ... [Read more...]
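
A from-scratch sketch of the GRNN estimator, not the package's API: it is a Nadaraya-Watson kernel regression, yhat(x) = Σᵢ yᵢKᵢ / ΣᵢKᵢ with Kᵢ = exp(-||x − xᵢ||² / (2σ²)), where each training sample plays the role of one hidden-layer neuron.

```r
# GRNN prediction (Specht, 1991) as a Gaussian-kernel-weighted average
# of the training outcomes; sigma is the single smoothing parameter.
grnn_predict <- function(X_train, y_train, X_new, sigma = 1) {
  X_train <- as.matrix(X_train); X_new <- as.matrix(X_new)
  apply(X_new, 1, function(x) {
    k <- exp(-rowSums(sweep(X_train, 2, x)^2) / (2 * sigma^2))
    sum(k * y_train) / sum(k)   # kernel-weighted average of the outcomes
  })
}
```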

Monotonic Binning Driven by Decision Tree

July 8, 2019 | statcompute

After the development of the MOB package (https://github.com/statcompute/MonotonicBinning), I was asked by a couple of users about the possibility of using the decision tree to drive the monotonic binning. Although I am not aware of any R package implementing the decision tree with the monotonic constraint, I did ... [Read more...]
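
A hedged sketch of the general idea, not the MOB implementation: let rpart() propose the cut points for a single variable, then inspect whether the bad rates move monotonically across the resulting bins. The complexity parameter is illustrative.

```r
# Tree-driven binning sketch: x numeric, y a 0/1 outcome; rpart's split
# points (the "index" column of fit$splits) define the bin boundaries.
library(rpart)

tree_bin <- function(x, y, cp = 0.01) {
  fit  <- rpart(y ~ x, data = data.frame(x = x, y = y),
                control = rpart.control(cp = cp))
  cuts <- sort(unique(fit$splits[, "index"]))      # tree split points
  bins <- cut(x, breaks = c(-Inf, cuts, Inf))
  aggregate(y, by = list(bin = bins), FUN = mean)  # bad rate per bin
}
```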

Chunk Averaging of GLM

July 7, 2019 | statcompute

Chunk Average (CA) is an interesting concept proposed by Matloff in chapter 13 of his book “Parallel Computing for Data Science”. The basic idea is to partition the entire model estimation sample into chunks and then to estimate a glm for each chunk. Under the i.i.d. assumption, the ... [Read more...]
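
A minimal sketch of the idea: fit the same glm on disjoint chunks and average the coefficient vectors. The chunk count, the unweighted average, and the binomial family are illustrative choices here.

```r
# Chunk averaging: split the rows into n_chunks disjoint chunks, fit a
# glm per chunk, then average the coefficients across chunks.
chunk_glm <- function(df, formula, n_chunks = 4) {
  idx   <- split(seq_len(nrow(df)), rep_len(seq_len(n_chunks), nrow(df)))
  coefs <- sapply(idx, function(i) {
    coef(glm(formula, family = binomial, data = df[i, ]))
  })
  rowMeans(coefs)   # simple (unweighted) chunk average
}
```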

Latin Hypercube Sampling in Hyper-Parameter Optimization

July 6, 2019 | statcompute

In my previous post https://statcompute.wordpress.com/2019/02/03/sobol-sequence-vs-uniform-random-in-hyper-parameter-optimization/, I’ve shown the difference between uniform pseudo-random and quasi-random number generators in the hyper-parameter optimization of machine learning. Latin Hypercube Sampling (LHS) is another interesting way to generate near-random sequences with a very simple idea. Let’...
[Read more...]
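
A minimal sketch with the lhs package: each of the n draws falls in its own stratum of every dimension, which is what makes the design space-filling. The hyper-parameter names and ranges below are illustrative.

```r
# Latin Hypercube draws in [0, 1]^2, rescaled to two hypothetical
# hyper-parameter ranges (one linear, one on the log scale).
library(lhs)

n <- 10
u <- randomLHS(n, 2)                  # one point per stratum per dimension
sigma  <- 0.1 + u[, 1] * (2.0 - 0.1)  # linear rescale to [0.1, 2.0]
lambda <- 10 ^ (-4 + u[, 2] * 4)      # log rescale to [1e-4, 1]
```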

Parallel R: Socket or Fork

June 30, 2019 | statcompute

In the R parallel package, there are two implementations of parallelism, namely fork and socket, each with pros and cons. For the fork, each parallel thread is a complete duplication of the master process with the shared environment, including objects or variables defined prior to the kickoff of parallel threads. ... [Read more...]
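
A side-by-side sketch of the two backends: the fork cluster sees objects defined in the master process, while the socket cluster starts fresh R sessions and needs objects exported explicitly. (FORK is not available on Windows.)

```r
# Fork vs. socket clusters in the parallel package.
library(parallel)

big_obj <- rnorm(1e6)

cl_fork <- makeCluster(2, type = "FORK")    # shares the master environment
r1 <- parSapply(cl_fork, 1:2, function(i) mean(big_obj))
stopCluster(cl_fork)

cl_sock <- makeCluster(2, type = "PSOCK")   # fresh R sessions
clusterExport(cl_sock, "big_obj")           # must ship objects over
r2 <- parSapply(cl_sock, 1:2, function(i) mean(big_obj))
stopCluster(cl_sock)
```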

WoE Transformation for Loss Given Default Models

May 26, 2019 | statcompute

In the intro section of my MOB package (https://github.com/statcompute/MonotonicBinning#introduction), reasons and benefits of using WoE transformations in the context of logistic regressions with binary outcomes have been discussed. What’s more, a similar idea can be easily generalized to other statistical models in the credit ... [Read more...]

Faster Way to Slice Dataframe by Row

May 12, 2019 | statcompute

When we’d like to slice a dataframe by row, we can employ the split() function or the iter() function in the iterators package. By leveraging the power of parallelism, I wrote a utility function slice() to slice the dataframe faster. In the example shown below, slice() is 3 times ... [Read more...]
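
A sketch of the single-process baseline next to a parallel variant built on mclapply(); the actual slice() in the post may differ in its details, and the fork-based mclapply() route is not available on Windows.

```r
# Row-wise slicing: base-R split() vs. a forked-worker equivalent.
library(parallel)

df <- data.frame(x = rnorm(1e4), y = rnorm(1e4))

s1 <- split(df, seq_len(nrow(df)))   # base R, single process

s2 <- mclapply(seq_len(nrow(df)),    # parallel, forked workers
               function(i) df[i, ], mc.cores = 2)
```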

Granular Weighted Binning by Generalized Boosted Model

May 7, 2019 | statcompute

In the post https://statcompute.wordpress.com/2019/04/27/more-general-weighted-binning, I’ve shown how to do the weighted binning with the function wqtl_bin() by the iterative partitioning. However, the outcome from wqtl_bin() can sometimes be too coarse. The function wgbm_bin() (https://github.com/statcompute/MonotonicBinning/blob/master/code/wgbm_...
[Read more...]
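
A hedged sketch of the underlying idea, not the wgbm_bin() implementation: a one-variable gbm with a monotone constraint produces a granular, direction-consistent fitted score, and the x values where that score jumps serve as candidate cut points. The function name and all tuning values here are illustrative.

```r
# Monotone gbm on a single predictor; x is numeric, y is a 0/1 outcome.
library(gbm)

gbm_cuts <- function(x, y, n.trees = 100) {
  fit <- gbm(y ~ x, data = data.frame(x = x, y = y),
             distribution = "bernoulli", n.trees = n.trees,
             interaction.depth = 1, shrinkage = 0.1,
             var.monotone = 1)          # force a monotone increasing fit
  ox <- order(x)
  s  <- predict(fit, data.frame(x = x[ox]), n.trees = n.trees)
  x[ox][c(diff(s) != 0, FALSE)]         # x values where the score jumps
}
```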