Today I’m extraordinarily pleased because today I solved an actuall real world problem using R. Sure, I’ve solved many esoteric statistical problems with R, but I’m not sure if any of those solutions have escaped the digital world and made some impact ex silico.
It is now summer and in Sweden that means that many people tend to overhaul and rebuild their wooden decks as you need somewhere to sit during those precious few weeks of +20°C (70° F) weather. And so, we also decided to rebuild our algae ridden, half-rotten deck and everything went well until we got to the point where we had to construct the last steps leading into the house. As we had been slightly sloppy when buying planks we only had five left, and when naïvely measuring out the lengths we needed it seemed that the planks were not long enough. Now the problem was this: Was there some way we could saw the planks into the lengths we needed or did we have to go all the way to the lumber yard to get more planks?
These were the planks we had (in centimeters):
planks_we_have <- c(120, 137, 220, 420, 480)
And these were the planks lengths we wanted (again in cm):
planks_we_want <- c(19, 19, 19, 19, 79, 79, 79, 103, 103, 103, 135, 135, 135, 135, 160)
If you just naïvely saw the smallest planks into the smallest plank lengths you’ll end up sawing the following:
120 -> 19, 19, 19, 19 137 -> 79 220 -> 79, 79 420 -> 103, 103, 103 480 -> 135, 135, 135
But using this “algorithm” we end up lacking material for the 135 cm and the 160 cm plank! However, it could be the case that if we just saw the plank lengths in a smarter way the planks we have would suffice. At this point I could have exclaimed “Ah, but isn’t this problem just a special case of the multiple knapsack problem?! Finally my course in Algorithm theory will pay off!” (but I really didn’t).
The knapsack problem is a famous problem in computer science where the objective is to find the combination of items that has the highest total value under the constraint that their total weight is less than a given weight. My mental image of this problem is that of a thief trying to stuff a knapsack with the most valuable goods under the constraint that s/he can only carry a certain weight. The multiple knapsack problem is just the generalization where the are more than one knapsack. In our case each plank can be seen as a “knapsack” where each plank length is an “item”” we can allocate to a plank. If we set the value of each plank length to its actual length (so the 79 cm plank length is worth 79), and solve the multiple knapsack problem, we’ll end up getting the most sawed plank pieces possible given the material we have!
Now, we could code up a brute force solution in R but, as we wanted the deck built sooner than later, I first did a search on CRAN and was happy to find the adagio package which contains “Discrete and Global Optimization Routines”. Among these we find the
mknapsackfunction which given vectors of item values, item weights and knapsack capacities solves the multiple knapsack problem. It returns a list containing the vector
ksackthat indicates what knapsack each item should go into. Here is now the R code to solve our plank sawing problem:
library(adagio) # mknapsack calling signature is: mknapsack(values, weights, capacities) solution <- mknapsack(planks_we_want, planks_we_want + 1, planks_we_have) # Above I added +1 cm to each length to compensate for the loss when sawing. solution$ksack
##  1 4 4 1 1 3 4 5 5 5 5 4 2 3 4
# That is, cut plank length 1 from plank 1, plank length 2 from plank 4, etc. # Now pretty printing what to cut so that we don't make mistakes... assignment <- data.frame( cut_this = planks_we_have[solution$ksack], into_this = planks_we_want) t(assignment[order(assignment[,1]), ])
## 1 4 5 13 6 14 2 3 7 12 15 8 9 10 11 ## cut_this 120 120 120 137 220 220 420 420 420 420 420 480 480 480 480 ## into_this 19 19 79 135 79 135 19 19 79 135 160 103 103 103 135
Tada! Turns out our existing planks were long enough to saw into the pieces we wanted after all. I guess we could have figured this out by ourselves, but thanks to the adagio package the R solution was pretty painless. Here are the planks post saw:
And here is the final result:
So, I guess it was a good thing that I studied that algorithms course and that I took a break from C++ to learn R. Now I’m just waiting for the R-package that does all the sawing and hammering for you…