A better and enhanced method of estimating Mallows's Cp

[This article was first published on R-Blog on Data modelling to develop ..., and kindly contributed to R-bloggers].

Introduction

In statistics, Mallows's Cp, named for the English statistician Colin Lingwood Mallows, is used to assess the fit of a regression model that has been estimated using ordinary least squares. Models whose Cp value is close to P + 1, where P is the number of explanatory variables, have low bias. If every candidate model has a high Cp value, some important predictor variables are probably missing from all of them.
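
For reference, the usual textbook definition behind the P + 1 rule can be written as follows. The symbols SSE_P, S^2, N and K are notation introduced here for illustration; they are not taken from the Dyn4cast documentation.

```latex
\[
  C_P = \frac{SSE_P}{S^2} - N + 2(P + 1),
  \qquad
  SSE_P = \sum_{i=1}^{N} \bigl(y_i - \hat{y}_i\bigr)^2
\]
% SSE_P : residual sum of squares of the candidate model with P explanatory variables
% S^2   : residual mean square of the full model with all K candidate variables,
%         S^2 = SSE_K / (N - K - 1)
% N     : number of observations
%
% For the full model itself, SSE_K / S^2 = N - K - 1, hence
\[
  C_K = (N - K - 1) - N + 2(K + 1) = K + 1 .
\]
```

Under this definition the full model always scores exactly K + 1, so the statistic is most informative when comparing smaller candidate models against it.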

Traditionally, Mallows's Cp has been estimated from linear models. In R, two packages do this very well: wle and olsrr. It is easier to estimate with the wle package because, unlike olsrr, it does not require nested models. In addition, olsrr can only estimate Mallows's Cp from linear models. Unfortunately, wle has been archived by CRAN.
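
As a rough illustration of the classical linear-model calculation, the sketch below computes Cp by hand in base R. It assumes the common definition Cp = SSE / S^2 - N + 2p, where p counts the coefficients including the intercept and S^2 comes from the full model. The helper `mallows_cp()` is a hypothetical name, and the built-in `mtcars` data stand in for the data used in this post. It is not how wle, olsrr or Dyn4cast implement the statistic.

```r
# Hand-rolled Mallows's Cp for linear models (illustrative sketch only).
# Assumed definition: Cp = SSE_sub / S2_full - N + 2 * p,
# where p is the number of coefficients (intercept included) in the submodel.
mallows_cp <- function(sub, full) {
  n   <- nobs(full)                                   # number of observations
  s2  <- sum(residuals(full)^2) / df.residual(full)   # residual mean square, full model
  sse <- sum(residuals(sub)^2)                        # residual sum of squares, submodel
  p   <- length(coef(sub))                            # coefficients incl. intercept
  sse / s2 - n + 2 * p
}

# Full model with four candidate predictors, and a nested two-predictor submodel
full <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
sub  <- lm(mpg ~ wt + hp, data = mtcars)

cp_sub  <- mallows_cp(sub, full)    # Cp of the candidate submodel
cp_full <- mallows_cp(full, full)   # by construction, equals length(coef(full))

print(c(submodel = cp_sub, full = cp_full))
```

Scoring the full model against itself returns its own coefficient count, which is a quick sanity check on the helper. A submodel with a value close to its own coefficient count would be read as having low bias.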

In this post, I share a new method from the Dyn4cast package that can estimate Mallows's Cp from lm, glm and other, non-linear forms of model. It takes a single line of code and is easy to use. The usage is as follows:

MallowsCp(Model, y, x, type, Nlevels = 0)

Model is the estimated model object.

type is the type of model: one of LM, ALM, GLM or N-LM. N-LM covers models that are not linear.

y is the vector of dependent variable data.

x is the independent variable data (a vector, or a data frame when there are several variables).

Nlevels is the number of additional variables created by the model during estimation; it defaults to 0 if none is provided.

Load libraries and data

library(Dyn4cast)
library(greybox)
library(splines)
binary <- readRDS("data/binary.RDS")
linear <- readRDS("data/linear.RDS")
others <- readRDS("data/others.RDS")

Mallows's Cp from lm model

Model <- lm(Income ~ ., data = linear)
Type <- "LM"
MallowsCp(Model = Model, y = linear$Income, x = linear[, -1], type = Type, Nlevels = 0)
[1] 5

Mallows's Cp from ALM model

Model <- alm(Income ~ ., data = linear)
Type <- "ALM"
MallowsCp(Model = Model, y = linear$Income, x = linear[, -1], type = Type, Nlevels = 0)
[1] 5

Mallows's Cp from GLM model

Model <- glm(GENDER ~ ., data = binary, family = binomial(link = "logit"))
Type <- "GLM"
MallowsCp(Model = Model, y = binary$GENDER, x = binary[, -1], type = Type, Nlevels = 0)
[1] 9

Mallows's Cp from other models: splines, ARIMA

y <- others$Total
x <- others$Series
Model <- lm(others$Total ~ bs(Series, knots = c(30, 115)), data = others)
Type <- "LM"
MallowsCp(Model = Model, y = y, x = x, type = Type, Nlevels = 0)
[1] 2
# smooth.spline() does not return a regression model object, so the result here is NaN
Model <- smooth.spline(others$Series, others$Total)
Type <- "LM"
MallowsCp(Model = Model, y = y, x = x, type = Type, Nlevels = 0)
[1] NaN
Model <- forecast::auto.arima(others$Total)
Type <- "LM"
MallowsCp(Model = Model, y = x, x = x, type = Type, Nlevels = 0)
[1] 2