Blog Archives

Where Bagging Might Work Better Than Boosting

January 2, 2016

In the previous post (https://statcompute.wordpress.com/2016/01/01/the-power-of-decision-stumps), it was shown that the boosting algorithm performs extremely well even with a simple 1-level stump as the base learner and provides a better performance lift than the bagging algorithm does. However, this observation shouldn't be generalized, as the following example demonstrates. First of all, we developed

Read more »
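For context, here is a minimal sketch of the comparison, not the code from the post itself (which is written in R): bagging and boosting ensembles built on a 1-level stump with scikit-learn on a synthetic binary target, assuming a recent scikit-learn (1.2+) for the estimator= argument.

# A minimal sketch, not the post's R code: bagging vs. boosting with a
# 1-level decision stump as the base learner (scikit-learn >= 1.2 assumed).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=1)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

stump = DecisionTreeClassifier(max_depth=1)
models = {
    "bagging": BaggingClassifier(estimator=stump, n_estimators=500, random_state=1),
    "boosting": AdaBoostClassifier(estimator=stump, n_estimators=500, random_state=1),
}
for name, model in models.items():
    model.fit(x_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(x_test)[:, 1])
    print(f"{name}: AUC = {auc:.4f}")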

The Power of Decision Stumps

January 1, 2016

A decision stump is a weak classification model with a simple tree structure consisting of a single split, and can also be considered a one-level decision tree. Due to its simplicity, a stump often demonstrates low predictive performance. As shown in the example below, the AUC measure of a stump is even lower than the

Read more »
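The example in the post is in R; a comparable Python sketch with scikit-learn (synthetic data, so the numbers are only illustrative) fits the stump as a depth-1 tree and reports its AUC.

# A decision stump is simply a depth-1 decision tree; a minimal sketch in Python.
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=1)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

stump = DecisionTreeClassifier(max_depth=1).fit(x_train, y_train)
print("stump AUC:", roc_auc_score(y_test, stump.predict_proba(x_test)[:, 1]))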

Prediction Intervals for Poisson Regression

December 20, 2015

Unlike the confidence interval, which addresses the uncertainty about the conditional mean, the prediction interval also accommodates the additional uncertainty associated with prediction errors. As a result, the prediction interval is always wider than the confidence interval in a regression model. In the context of risk modeling, the prediction interval

Read more »
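Not from the post: one common way to construct such an interval for a Poisson regression is by simulation, sampling the coefficients from their estimated distribution and then adding Poisson sampling noise. A sketch with statsmodels on simulated data is below.

# A sketch (simulated data, not the post's code): a Poisson prediction interval
# by simulating both parameter uncertainty and Poisson sampling variability.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(size=(500, 2))
X = sm.add_constant(x)
y = rng.poisson(np.exp(0.5 + X[:, 1] - 0.5 * X[:, 2]))

fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()

x_new = np.array([1.0, 0.5, 0.5])                      # a new observation (with intercept)
betas = rng.multivariate_normal(fit.params, fit.cov_params(), size=10000)
mu = np.exp(betas @ x_new)                             # uncertainty about the conditional mean
draws = rng.poisson(mu)                                # plus Poisson sampling noise

print("95% confidence interval of the mean:", np.percentile(mu, [2.5, 97.5]))
print("95% prediction interval:            ", np.percentile(draws, [2.5, 97.5]))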

Calculate Leave-One-Out Prediction for GLM

December 13, 2015

In model development, the "leave-one-out" prediction is a form of cross-validation, calculated as below:
1. After the model is developed, each observation used in the model development is removed in turn and the model is refitted with the remaining observations.
2. The out-of-sample prediction for the refitted model is calculated

Read more »
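A minimal sketch of the two steps above, using a logistic GLM in statsmodels on simulated data (the post's own code may differ):

# Leave-one-out prediction for a GLM: refit without observation i, then predict i.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=(200, 2))
X = sm.add_constant(x)
y = rng.binomial(1, 1 / (1 + np.exp(-(0.5 + x[:, 0] - x[:, 1]))))

loo_pred = np.empty(len(y))
for i in range(len(y)):
    keep = np.arange(len(y)) != i                                  # drop observation i
    fit = sm.GLM(y[keep], X[keep], family=sm.families.Binomial()).fit()
    loo_pred[i] = fit.predict(X[i][None, :])[0]                    # out-of-sample prediction for i

print(loo_pred[:5])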

Fitting Generalized Regression Neural Network with Python

December 9, 2015

Read more »

Modeling Frequency in Operational Losses with Python

December 8, 2015

Poisson and Negative Binomial regressions are two popular approaches to modeling frequency measures in operational losses and can be implemented in Python with the statsmodels package as below: Although Quasi-Poisson regression is not currently supported by the statsmodels package, we are still able to estimate the model with the rpy2 package by using R

Read more »
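Along the lines described above, a sketch of both models with statsmodels (the data frame and variable names are made up for illustration):

# Poisson and Negative Binomial frequency models with statsmodels (simulated data).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x1": rng.uniform(size=1000), "x2": rng.uniform(size=1000)})
df["freq"] = rng.poisson(np.exp(0.2 + df.x1 - 0.5 * df.x2))

poisson_fit = smf.glm("freq ~ x1 + x2", data=df, family=sm.families.Poisson()).fit()
nb_fit = smf.glm("freq ~ x1 + x2", data=df, family=sm.families.NegativeBinomial()).fit()

print(poisson_fit.summary())
print(nb_fit.summary())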

Modeling Severity in Operational Losses with Python

December 6, 2015

When modeling severity measurements in operational losses with Generalized Linear Models, we have a couple of choices based on different distributional assumptions, including Gamma, Inverse Gaussian, and Lognormal. However, based on my observations from empirical work, the differences in parameter estimates among these three popular candidates are rather immaterial from a practical standpoint.

Read more »
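A sketch of the three candidates on simulated severity data (not the post's code): Gamma and Inverse Gaussian GLMs with a log link in statsmodels, plus a lognormal model fitted as OLS on the log of the severity.

# Three candidate severity models: Gamma, Inverse Gaussian, and Lognormal (simulated data).
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({"x": rng.uniform(size=1000)})
df["sev"] = rng.gamma(shape=2.0, scale=np.exp(1.0 + 0.5 * df.x) / 2.0)

gamma_fit = smf.glm("sev ~ x", data=df,
                    family=sm.families.Gamma(link=sm.families.links.Log())).fit()
ig_fit = smf.glm("sev ~ x", data=df,
                 family=sm.families.InverseGaussian(link=sm.families.links.Log())).fit()
lognormal_fit = smf.ols("np.log(sev) ~ x", data=df).fit()

for name, fit in [("gamma", gamma_fit), ("inverse gaussian", ig_fit), ("lognormal", lognormal_fit)]:
    print(name, fit.params.values)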

Estimating Quasi-Poisson Regression with GLIMMIX in SAS

October 14, 2015

When modeling the frequency measure in operational risk with regressions, modelers often prefer Poisson or Negative Binomial regressions as industry best practice. However, as an alternative approach, Quasi-Poisson regression provides a more flexible model estimation routine with at least two benefits. First of all, Quasi-Poisson regression is able to address both

Read more »
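The post itself works in SAS with PROC GLIMMIX; as a rough Python analogue (an assumption on my part, not the post's approach), a Poisson GLM in statsmodels with the scale estimated from the Pearson chi-square reproduces quasi-Poisson-style standard errors.

# A rough Python analogue of quasi-Poisson (the post uses SAS PROC GLIMMIX):
# a Poisson GLM with the dispersion estimated from the Pearson chi-square.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.uniform(size=(1000, 1))
X = sm.add_constant(x)
mu = np.exp(0.5 + x[:, 0]) * rng.gamma(shape=2.0, scale=0.5, size=1000)   # overdispersed counts
y = rng.poisson(mu)

poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
quasi_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit(scale="X2")

print("poisson s.e.:      ", poisson_fit.bse)
print("quasi-poisson s.e.:", quasi_fit.bse)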


Some Considerations of Modeling Severity in Operational Losses

August 16, 2015

In the Loss Distribution Approach (LDA) for Operational Risk models, multiple distributions, including Lognormal, Gamma, Burr, Pareto, and so on, can be considered candidates for the distribution of severity measures. However, the challenge remains, in stress testing exercises such as CCAR, to relate operational losses to macroeconomic scenarios denoted by a set of macroeconomic

Read more »
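Not from the post: a sketch of fitting several of the candidate severity distributions named above with scipy.stats on simulated losses and comparing their log-likelihoods.

# Fitting candidate severity distributions by maximum likelihood (simulated losses).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
losses = rng.lognormal(mean=10, sigma=1.5, size=2000)

candidates = {"lognorm": stats.lognorm, "gamma": stats.gamma,
              "pareto": stats.pareto, "burr12": stats.burr12}

for name, dist in candidates.items():
    params = dist.fit(losses)                         # maximum likelihood estimates
    loglik = np.sum(dist.logpdf(losses, *params))
    print(f"{name:8s} log-likelihood = {loglik:.1f}")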
