Imagine that one has a data matrix consisting of observations, each with features, as well as a response vector . We want to build a model for using the feature columns in . In ordinary least squares (OLS), one seeks a vector of coefficients such that
In non-negative least squares (NNLS), we seek a vector coefficients such that it minimizes subject to the additional requirement that each element of is non-negative.
There are a number of ways to perform NNLS in R. The first two methods come from Reference 1, while I came up with the third. (I’m not sharing the third way Reference 1 details because it claims that the method is buggy.)
Let’s generate some fake data that we will use for the rest of the post:
set.seed(1) n <- 100; p <- 10 x <- matrix(rnorm(n * p), nrow = n) y <- x %*% matrix(rep(c(1, -1), length.out = p), ncol = 1) + rnorm(n)
Method 1: the
library(nnls) mod1 <- nnls(x, y) mod1$x #  0.9073423 0.0000000 1.2971069 0.0000000 0.9708051 #  0.0000000 1.2002310 0.0000000 0.3947028 0.0000000
Method 2: the
glmnet() function solves the minimization problem
where and are hyperparameters the user chooses. By setting (the default) and ,
glmnet() ends up solving the OLS problem. By setting
lower.limits = 0, this forces the coefficients to be non-negative. We should also set
intercept = FALSE so that we don’t have an extraneous intercept term.
library(glmnet) mod2 <- glmnet(x, y, lambda = 0, lower.limits = 0, intercept = FALSE) coef(mod2) # 11 x 1 sparse Matrix of class "dgCMatrix" # s0 # (Intercept) . # V1 0.9073427 # V2 . # V3 1.2971070 # V4 . # V5 0.9708049 # V6 . # V7 1.2002310 # V8 . # V9 0.3947028 # V10 .
Method 3: the
NNLS is a special case of bounded-variable least squares (BVLS), where instead of having constraints for each , one has constraints for each . BVLS is implemented in the
bvls() function of the
library(bvls) mod3 <- bvls(x, y, bl = rep(0, p), bu = rep(Inf, p)) mod3$x #  0.9073423 0.0000000 1.2971069 0.0000000 0.9708051 #  0.0000000 1.2002310 0.0000000 0.3947028 0.0000000
In the above,
bl contains the lower limits for the coefficients while
bu contains the upper limits for the coefficients.
- Things I Thought At One Point. Three ways to do non-negative least squares in R.