
This post is another tour of quadratic programming algorithms and applications in R. First, we look at the quadratic program that lies at the heart of support vector machine (SVM) classification. Then we’ll look at a very different quadratic programming demo problem that models the energy of a circus tent. The key difference between these two problems is that the energy minimization problem has a positive definite system matrix whereas the SVM problem has only a semi-definite one. This distinction has important implications when it comes to choosing a quadratic program solver and we’ll do some solver benchmarking to further illustrate this issue.

# QP and SVM

Let’s consider the following very simple version of the SVM problem. Suppose we have observed $y_i \in \{-1,1\}$, $X_i \in \mathbb{R}^{m}$ for $n$ (perfectly linearly separable) training cases. We let $y$ denote the $n \times 1$ vector of training labels and $X$ the $n \times m$ matrix of predictor variables. Our task is to find the hyperplane in $\mathbb{R}^m$ which “best separates” our two classes of labels $-1$ and $1$. Visually:
```r
library("e1071")
library("rgl")

# Label setosa as +1 and the other two species as -1
train <- iris
train$y <- ifelse(train[, 5] == "setosa", 1, -1)

# Fit a linear hard-margin-style SVM on three of the four features
sv <- svm(y ~ Petal.Length + Petal.Width + Sepal.Length, data = train,
          kernel = "linear", scale = FALSE, type = "C-classification")

# Recover the weight vector W as the coefficient-weighted sum of support vectors
W <- rowSums(sapply(1:length(sv$coefs), function(i) sv$coefs[i] * sv$SV[i, ]))

plot3d(train$Petal.Length, train$Petal.Width, train$Sepal.Length,
       col = ifelse(train$y == -1, "red", "blue"),
       size = 2, type = "s", alpha = .6)
```
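To see the quadratic program hiding inside the fit above, here is a minimal sketch (not from the original post) of solving the hard-margin SVM dual directly with `quadprog`. The dual matrix $D = (yy^\top) \circ (XX^\top)$ is only positive *semi*-definite, so `solve.QP`, which requires strict positive definiteness, needs a small ridge term; the ridge value `1e-6` and the choice of three features are illustrative assumptions.

```r
library(quadprog)

# Same labels and features as the svm() fit above
y <- ifelse(iris[, 5] == "setosa", 1, -1)
X <- as.matrix(iris[, c("Petal.Length", "Petal.Width", "Sepal.Length")])
n <- nrow(X)

# Dual SVM system matrix: D = (y y') * (X X'), elementwise product.
# D is positive SEMI-definite (rank at most m), so we add a tiny
# ridge eps * I to satisfy solve.QP's positive-definiteness requirement.
D   <- (y %*% t(y)) * (X %*% t(X))
eps <- 1e-6

# Maximize 1'a - (1/2) a' D a  subject to  y'a = 0 (equality) and a >= 0.
# solve.QP minimizes (1/2) b' D b - d' b with constraints A' b >= b0,
# the first meq of which are equalities.
sol <- solve.QP(Dmat = D + eps * diag(n),
                dvec = rep(1, n),
                Amat = cbind(y, diag(n)),
                bvec = rep(0, n + 1),
                meq  = 1)
alpha <- sol$solution

# Recover the primal weight vector w = sum_i alpha_i y_i x_i
w <- colSums(alpha * y * X)
```

Up to the ridge perturbation, `w` should point in the same direction as the `W` recovered from `e1071::svm` above; only the handful of cases with `alpha` noticeably above zero (the support vectors) contribute to it.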