Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
John Mount is a data scientist based in San Francisco, with 20+ years of experience in machine learning, statistics, and analytics. He is the co-founder of the data science consulting firm Win-Vector LLC, and (with Nina Zumel) the co-author of Practical Data Science with R, now in its second edition.
Nina Zumel is a data scientist based in San Francisco, with 20+ years of experience in machine learning, statistics, and analytics. She is the co-founder of the data science consulting firm Win-Vector LLC, and (with John Mount) the co-author of Practical Data Science with R, now in its second edition.
Introduction
Nina Zumel presented the “100 Bushels of Corn” puzzle here as a fun example of using R as a calculator. What if we want R to solve the puzzle for us, instead of merely being a calculator?
Let’s give that a go.
< section id="setting-up-the-problem" class="level2">Setting Up the Problem
100 bushes of corn are distributed to 100 people such that every man receives 3 bushels, every woman 2 bushels, and every child 1/2 a bushel. How many men, women, and children are there?
We can write the 100 Bushels of Corn problem as finding integer vectors x that satisfy a %*% x = b for the following a, b. The first row specifies the constraint on the total number of men, women and children; the second row specifies the constraint on how many bushels each person gets. Notice that we doubled the values of the second equation to keep everything integral.
a <- matrix(c(1, 1, 1, 6, 4, 1),
nrow = 2,
ncol = 3,
byrow = TRUE)
a
[,1] [,2] [,3] [1,] 1 1 1 [2,] 6 4 1
b <- as.matrix(c(100, 200)) b
[,1] [1,] 100 [2,] 200
There are at least two main ways to solve this:
- Brute force. This is kind of the point of computers and programming languages.
- Linear algebra, in particular linear algebra over the ring of integers. This approach shows some of the richness of the R package environment curated at CRAN.
Let’s take a quick look at these two solution styles.
< section id="brute-force-solution" class="level2">Brute Force Solution
A brute force solution is as follows. First, we get a simple upper bound on each variable. We can do this by checking one row of a %*% x = b.
upper_bounds <- floor(b[2] / a[2, ]) upper_bounds
[1] 33 50 200
Now we try all plausible solutions.
for (x1 in 0:upper_bounds[[1]]) {
for (x2 in 0:upper_bounds[[2]]) {
# use a row of a to solve for x3
x3 <- (b[1] - (a[1, 1] * x1 + a[1, 2] * x2)) / a[1, 3]
# check constraints
if ((x3 >= 0) && (abs(x3 %% 1) < 1e-8)) {
x = as.matrix(c(x1, x2, x3))
if (all(a %*% x == b)) {
print(c(x1, x2, x3))
}
}
}
}
[1] 2 30 68 [1] 5 25 70 [1] 8 20 72 [1] 11 15 74 [1] 14 10 76 [1] 17 5 78 [1] 20 0 80
And this gives us exactly the 7 solutions Nina found. This is some of the magic of having a programmable computer: one can cheaply try a lot of potential solutions without needing a lot of theory.
< section id="ring-theory-solution" class="level2">Ring Theory Solution
It turns out there is a systematic way to find all of the integral solutions to a linear system quickly, at least for low dimensional solution spaces. To do this we will use a ring theory or linear algebra matrix factorization called the Hermite normal form. Fortunately, R has a package for this, called numbers. We attach this package as follows.
library(numbers)
This package will find for us a lower diagonal integer matrix h and a square unimodular matrix u such that h = a %*% u. Unimodular matrices map the space of integer vectors Zn to the same space of integer vectors in a 1 to 1, onto, and invertible manner. This means finding integer solution vectors to a %*% x = b is equivalent to the problem of finding integer solutions to h %*% y = b, where x = u %*% y. This second problem is easier, as h is lower diagonal- so solving this system is just a matter of “back filling”.
Let’s first find h, u.
hnf <- hermiteNF(a) h <- hnf$H u <- hnf$U stopifnot(all(h == a %*% u))
# lower triangular transform of a h
[,1] [,2] [,3] [1,] 1 0 0 [2,] 0 1 0
# unimodular transform u
[,1] [,2] [,3] [1,] 7 -1 -3 [2,] -12 2 5 [3,] 6 -1 -2
We now have a %*% u = h. a %*% x = b implies h %*% y = b where x = u %*% y. Let’s solve for a specific solution xs by back substitution.
# back substitute to solve h %*% y = b
# this uses the fact that h is lower-triangular
h_rank <- sum(diag(h) != 0)
stopifnot(h_rank > 1)
y <- numeric(ncol(a))
y[1] <- b[1] / h[1, 1]
for (i in 2:h_rank) {
y[i] <- (b[i] - sum(h[i, 1:(i - 1)] * y[1:(i - 1)])) / h[i, i]
}
stopifnot(all(b == h %*% y))
xs <- u %*% y
stopifnot(all(a %*% xs == b))
xs
[,1] [1,] 500 [2,] -800 [3,] 400
It is a standard result of linear algebra that all solutions of a %*% x = b are of the form x = xs + z where a %*% z = 0 (that is, z is in the null space of h and a). In our case, the null space is spanned by the last column of u.
Let’s show this column.
null_basis <- u[, (h_rank + 1):ncol(u), drop = FALSE] null_basis
[,1] [1,] -3 [2,] 5 [3,] -2
So in our case: all integer solutions of a %*% x = b are of the form [500, -800, 400] + k * [-3, 5, -2] for integer k.
Now we just need to pick k to make everything non-negative (an implicit puzzle condition!). The sign changes of entries of [500, -800, 400] + k * [-3, 5, -2] happen for k where one of the coordinates is equal to zero. These are:
500 - 3 * k == 0-800 + 5 * k == 0, and400 - 2 * k == 0.
So the k of interest are in the range ceiling(max(0, 800/5)) <= k <= floor(min(500/3, 400/2)), or 160 <= k <= 166. This gives us:
for (k in 160:166) {
soln <- xs + k * null_basis
print(c(soln[1, 1], soln[2, 1], soln[3, 1]))
}
[1] 20 0 80 [1] 17 5 78 [1] 14 10 76 [1] 11 15 74 [1] 8 20 72 [1] 5 25 70 [1] 2 30 68
And these are again exactly the solutions Nina Zumel gave in the first 100 Bushels post.
< section id="conclusion" class="level2">Conclusion
R gives the ability to exploit any combination of innate ability and knowledge, borrowed ability, or brute force in solving problems.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
