Blog Archives

Monotonic Binning with Smbinning Package

January 22, 2017
By
Monotonic Binning with Smbinning Package

The R package smbinning (http://www.scoringmodeling.com/rpackage/smbinning) provides a very user-friendly interface for the WoE (Weight of Evidence) binning algorithm employed in the scorecard development. However, there are several improvement opportunities in my view: 1. First of all, the underlying algorithm in the smbinning() function utilizes the recursive partitioning, which does not necessarily guarantee the monotonicity. 2.

Read more »

Estimate Regression with (Type-I) Pareto Response

December 11, 2016
By
Estimate Regression with (Type-I) Pareto Response

The Type-I Pareto distribution has a probability function shown as below f(y; a, k) = k * (a ^ k) / (y ^ (k + 1)) In the formulation, the scale parameter 0 < a < y and the shape parameter k > 1 . The positive lower bound of Type-I Pareto distribution is particularly

Read more »

More about Flexible Frequency Models

November 27, 2016
By
More about Flexible Frequency Models

Modeling the frequency is one of the most important aspects in operational risk models. In the previous post (https://statcompute.wordpress.com/2016/05/13/more-flexible-approaches-to-model-frequency), the importance of flexible modeling approaches for both under-dispersion and over-dispersion has been discussed. In addition to the quasi-poisson regression, three flexible frequency modeling techniques, including generalized poisson, double poisson, and Conway-Maxwell poisson, with their implementations

Read more »

Fastest Way to Add New Variables to A Large Data.Frame

October 30, 2016
By
Fastest Way to Add New Variables to A Large Data.Frame

Read more »

Risk Models with Generalized PLS

June 12, 2016
By
Risk Models with Generalized PLS

While developing risk models with hundreds of potential variables, we often run into the situation that risk characteristics or macro-economic indicators are highly correlated, namely multicollinearity. In such cases, we might have to drop variables with high VIFs or employ “variable shrinkage” methods, e.g. lasso or ridge, to suppress variables with colinearity. Feature extraction approaches

Read more »

More Flexible Approaches to Model Frequency

May 12, 2016
By
More Flexible Approaches to Model Frequency

(The post below is motivated by my friend Matt Flynn https://www.linkedin.com/in/matthew-flynn-1b443b11) In the context of operational loss forecast models, the standard Poisson regression is the most popular way to model frequency measures. Conceptually speaking, there is a restrictive assumption for the standard Poisson regression, namely Equi-Dispersion, which requires the equality between the conditional mean and

Read more »

Improve SVM Tuning through Parallelism

March 19, 2016
By
Improve SVM Tuning through Parallelism

As pointed out in the chapter 10 of “The Elements of Statistical Learning”, ANN and SVM (support vector machines) share similar pros and cons, e.g. lack of interpretability and good predictive power. However, in contrast to ANN usually suffering from local minima solutions, SVM is always able to converge globally. In addition, SVM is less

Read more »

Where Bagging Might Work Better Than Boosting

January 2, 2016
By
Where Bagging Might Work Better Than Boosting

In the previous post (https://statcompute.wordpress.com/2016/01/01/the-power-of-decision-stumps), it was shown that the boosting algorithm performs extremely well even with a simple 1-level stump as the base learner and provides a better performance lift than the bagging algorithm does. However, this observation shouldn’t be generalized, which would be demonstrated in the following example. First of all, we developed

Read more »

The Power of Decision Stumps

January 1, 2016
By
The Power of Decision Stumps

A decision stump is the weak classification model with the simple tree structure consisting of one split, which can also be considered a one-level decision tree. Due to its simplicity, the stump often demonstrates a low predictive performance. As shown in the example below, the AUC measure of a stump is even lower than the

Read more »

Prediction Intervals for Poisson Regression

December 20, 2015
By
Prediction Intervals for Poisson Regression

Different from the confidence interval that is to address the uncertainty related to the conditional mean, the prediction interval is to accommodate the additional uncertainty associated with prediction errors. As a result, the prediction interval is always wider than the confidence interval in a regression model. In the context of risk modeling, the prediction interval

Read more »

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)