# A deep dive into glmnet: penalty.factor

**R – Statistical Odds & Ends**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The `glmnet`

function (from the package of the same name) is probably the most used function for fitting the elastic net model in R. (It also fits the lasso and ridge regression, since they are special cases of elastic net.) The `glmnet`

function is very powerful and has several function options that users may not know about. In a series of posts, I hope to shed some light on what these options do.

Here is the full signature of the `glmnet`

function:

glmnet(x, y, family=c("gaussian","binomial","poisson","multinomial","cox","mgaussian"), weights, offset=NULL, alpha = 1, nlambda = 100, lambda.min.ratio = ifelse(nobs<nvars,0.01,0.0001), lambda=NULL, standardize = TRUE, intercept=TRUE, thresh = 1e-07, dfmax = nvars + 1, pmax = min(dfmax * 2+20, nvars), exclude, penalty.factor = rep(1, nvars), lower.limits=-Inf, upper.limits=Inf, maxit=100000, type.gaussian=ifelse(nvars<500,"covariance","naive"), type.logistic=c("Newton","modified.Newton"), standardize.response=FALSE, type.multinomial=c("ungrouped","grouped"))

In this post, we will focus on the **penalty.factor** option.

Unless otherwise stated, will denote the number of observations, will denote the number of features, and `fit`

will denote the output/result of the `glmnet`

call.

**penalty.factor**

When this option is not set, for each value of in `lambda`

, `glmnet`

is minimizing the following objective function:

When the option is set to a vector `c(c_1, ..., c_p)`

, `glmnet`

minimizes the following objective instead:

In the documentation, it is stated that “the penalty factors are internally rescaled to sum to nvars and the lambda sequence will reflect this change.” However, from my own experiments, it seems that the penalty factors are internally rescaled to sum to nvars **but** the lambda sequence remains the same. Let’s generate some data:

n <- 100; p <- 5; true_p <- 2 set.seed(11) X <- matrix(rnorm(n * p), nrow = n) beta <- matrix(c(rep(1, true_p), rep(0, p - true_p)), ncol = 1) y <- X %*% beta + 3 * rnorm(n)

We fit two models, `fit`

which uses the default options for `glmnet`

, and `fit2`

which has `penalty.factor = rep(2, 5)`

:

fit <- glmnet(X, y) fit2 <- glmnet(X, y, penalty.factor = rep(2, 5))

What we find is that these two models have the exact same `lambda`

sequence and produce the same `beta`

coefficients.

sum(fit$lambda != fit2$lambda) # [1] 0 sum(fit$beta != fit2$beta) # [1] 0

The same thing happens when we supply our own `lambda`

sequence:

fit3 <- glmnet(X, y, lambda = c(1, 0.1, 0.01), penalty.factor = rep(10, 5)) fit4 <- glmnet(X, y, lambda = c(1, 0.1, 0.01), penalty.factor = rep(1, 5)) sum(fit3$lambda != fit4$lambda) # [1] 0 sum(fit3$beta != fit4$beta) # [1] 0

Hence, my conclusion is that if `penalty.factor`

is set to `c(c_1, ..., c_p)`

, `glmnet`

is really minimizing

where .

**leave a comment**for the author, please follow the link and comment on their blog:

**R – Statistical Odds & Ends**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.