Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The MathWorks has an interesting demo on how the shape of a circus tent can be modeled as the solution of a quadratic program in MATLAB. In this post, we’ll show how to solve this same problem in R using the quadprog package and also provide the technical details not covered in the Mathwork’s example. In particular, we’ll explain how an energy function is associated to this problem and how descretizing this energy function gives rise to a sparse quadratic program.

Quadratic programming is a powerful and versatile technique that has found applications in a diverse range of fields. Perhaps its most famous application is in portfolio allocation theory, though it also is very well known as the computational foundation of SVM. A quadratic program is a constrained optimization problem where the objective function is a quadratic form (multidimensional generalization of a quadratic function). The constraints in a standard quadratic program are a mix of linear equality and inequality constraints. This example provides both an intuitive introduction to quadratic programming in R and a great test case for benchmarking optimization algorithms.

### Problem Description

We imagine draping a piece of fabric over an ensemble of 5 tent poles, placed symmetrically over a 36 by 36 grid. The four outer poles are all the same size. The center pole is somewhat larger and taller. In R, it’s easy to build this ensemble by exploiting symmetry:
# Build the pole grid using basic symmetries.
q2 = matrix(0,18,18)
q2[8:9,8:9] = .3
q2[17:18,17:18] = 1/2
q1 <- q2[,18:1]
top  <- cbind(q2,q1)
z <- rbind(top,top[18:1,])


Let's plot the pole ensemble and the piece of fabric:

# Plot
library(rgl)
x <- (1:36)/36
y <- (1:36)/36
open3d()
rgl.surface(x,y, z, color='blue',alpha=.75, smooth=TRUE)
rgl.surface(x,y, matrix(1/2,36,36), color='red', front='lines', back='lines')
rgl.bg(color="white")

The task is to model the height $u(x,y)$ of the piece of fabric over the grid point $(x,y)$.

### Defining an Energy Function

We now need to specify an energy function which defines what should happen when you drape the fabric over the pole ensemble. The idea is that once the fabric is placed, it eventually settles to some minimal energy state. Let $u(x,y)$ be any function that gives heights of the fabric over any grid point $(x,y)$. Suppose we come up with a way to measure the energy $E$ of any fabric distribution $u$. We call this measurement $E[u]$. A minimal energy state is a particular fabric distribution $u^*$ that makes our measurement $E[u^*]$ as small as possible.

The question of how to measure the energy $E[u]$ is one for physicists and engineers. Without specifying the mechanical laws governing the process by which the fabric "settles", we shouldn't expect to be able to provide realistic solutions to this model. Nevertheless, let's consider the famous energy measurement: $$E[u] = \frac{1}{2}\int_0^{l_y} \int_0^{l_x} \nabla u(x,y) \cdot \nabla u(x,y) \;dx dy$$ where $l_y$ and $l_x$ are the length of the vertical and horizontal sides of the grid and $\nabla$ is the gradient operator. This measurement is called the Dirichlet energy of $u$ and, roughly, it captures the total amount of variation in $u$.

We would like to find a numerical approximation to the fabric distribution $u(x,y)$ that makes the energy $E[u]$ as small as possible and also fits on top of our tent peg ensemble. Let's take for granted that such an optimal fabric distribution exists and that this function has no tears or creases (continuous differentiability). We can then use integration by parts to rewrite the Dirichlet energy as: $$\begin{array}{l} E[u] &= \frac{1}{2} \int_0^{l_y} \int_0^{l_x} \nabla u(x,y) \cdot \nabla u(x,y) \; dx \;dy \\ &= - \frac{1}{2} \int_0^{l_y} \int_0^{l_x} u(x,y) \Delta u(x,y) \; dx dy + \frac{1}{2} \oint_{\square} u(x,y) \nabla u(x,y) \cdot \nu(x,y) \; dS. \end{array}$$ The second integral in the last expression is a surface integral around the rectangular boundary of our grid. If we assume that $u(x,y)=0$ at the edge of the grid then this term vanishes and we're left with the energy expression: $$\begin{array}{l} E[u] &= - \frac{1}{2} \int_0^{l_y} \int_0^{l_x} u(x,y) \Delta u(x,y) \; dx dy. \end{array}$$

### Discretizing the Energy Function

Now, let's show how to use the previous formula to get a numerical approximation to the function $u(x,y)$ that minimizes the energy $E[u]$. To do this, we choose a uniform discretization of our grid. Let $h_x$ and $h_y$ be the spacing between grid points in the $x$ and $y$ directions. We can then approximate the Laplacian $\Delta$ using finite difference formulas: $$\Delta u(x,y) = \frac{ u(x+h_x,y) - 2u(x,y) + u(x-h_x,y) }{h_x^2} + \frac{u(x,y +h_y) - 2u(x,y) + u(x,y-h_y)}{h_y^2}.$$ Now we discretize the energy function $E[u]$: $$\begin{array}{l} E[u] &= -\frac{1}{2} \int_0^{l_y} \int_0^{l_x} \nabla u(x,y) \cdot \nabla u(x,y) \; dx \;dy \\ & \approx - \frac{h_x h_y}{2} \sum_{i=1}^{N_y} \sum_{j=1}^{N_y} u(ih_x, jh_y) \Delta u(i h_x, j h_y). \end{array}$$ In our discrete problem we are trying to find the values of $u$ at each of our $N_x \cdot N_y$ grid points. In the current formulation, these points are arranged as an $N_y \times N_x$ matrix but it turns out to be much more convenient to reorganize these points into a single long vector of $N_x \cdot N_y$ components. We'll do this by stacking the columns of the grid matrix into a single column so that the grid point $(i,j)$ will map to component $k= N_y(j-1) + i$. In R, we can stack the columns of the matrix M in this way by
nr <- nrow(z)
nc <- ncol(z)
N  <- nr*nc
bvec <- matrix(z,N,1,byrow=FALSE)


We write $u_k$ to denote the $k$-th entry of the solution vector $\vec{u}$ in this new indexing system. Then the energy approximation becomes: $$\begin{array}{l} E[u] & \approx \frac{h_x h_y}{2} \sum_{k=1}^{N_x \cdot N_y} u_k(- \Delta u_k). \end{array}$$ Now, we observe that the approximation of $-\Delta u_k$ involves only five components of the $\vec{u}$ vector. In fact: $$-\Delta u_k \approx -\frac{u_{k+N_y}- 2u_k + u_{k - N_y}}{h_x^2} - \frac{u_{k+1} -2u_k + u_{k-1} }{h_y^2}.$$ Using this observation, we write the energy approximation as $$\begin{array}{l} E[u] &\approx \frac{h_x h_y}{2} \vec{u}^T L\vec{u}. \end{array}$$ where $L$ is a special matrix called the discrete Laplacian. The discrete Laplacian is a famous sparse matrix which has five non-zero bands (one for each component that contributes to a typical $\Delta u_k$ value). It is a numerical approximation to the Laplace operator $-\Delta$ and has many remarkable properties. We can use Kronecker products to easily build the 2D discrete Laplacian on a rectangular grid in R:
# 2D Discrete Laplacian on an lx by ly rectangular grid with
# nx grid lines in the x direction and ny grid lines in the y direction.
Laplacian2D <- function(lx,ly, nx,ny){
hx <- lx/(nx-1)
hy <- ly/(ny-1)
tr_x <- c(2,-1,rep(0,nx-2))
tr_y <- c(2,-1,rep(0,ny-2))
Tx <- toeplitz(tr_x)/(hx^2)
Ty <- toeplitz(tr_y)/(hy^2)
Ix <- diag(nx)
Iy <- diag(ny)
L <- kronecker(Tx,Iy) + kronecker(Ix,Ty)
return(L)
}


To recap, our goal was to find a numerical approximation to the function $u$ which minimizes $E[u]$ and which fits on our tent peg ensemble. Our work in the previous section showed that we can find such an approximation by solving the following optimization problem: $$\left\{ \begin{array}{l} \min_{\vec{u} \in \mathbb{R}^{Ny \cdot Nx}} \; \frac{1}{2} \vec{u}^T L \vec{u} \\ \mbox{subject to:} \qquad \vec{u} \geq \vec{z} \end{array} \right.$$ where $\vec{z}$ is the vector whose component $z_k$ give the heights of the ground and tent pole ensemble at the (vectorized) grid point $k$. This is a quadratic programming problem, and a very tractable one since an important property of the matrix $L$ is that it is symmetric positive definite.

Note that the Mathworks example uses a slightly more general quadratic form $$E[u] \approx \frac{1}{2} \vec{u}^TL \vec{u} + \vec{c}^T \vec{u}$$ where $\vec{c}$ is a constant vector whose components are all equal to $h_x \cdot h_y$. Though the energy function is not explained in their example, it is likely that this linear term is meant to capture some material effects that aren't accounted for by the simple Dirichlet energy measure. We'll also include this term in our solution below to make use of a slightly more interesting quadratic form.

We'll solve our quadratic program with R's quadprog library. The solution algorithm implemented in R's quadprog is somewhat different from that available in MATLAB. One important note about R's quadprog library is that it does not take advantage of the sparsity of the system matrix $L$ nor of the sparsity of the constraints $\vec{u} \geq \vec{z}$. Since an $N \times N$ matrix requires an $N^2 \times N^2$ matrix $L$, the dense matrix representation of $L$ can quickly become impractical. For the $36 \times 36$ grid of this problem, however, quadprog works wonderfully:
# Solve the QP:
Ny  <- nrow(z)
Nx  <- ncol(z)
N   <- Nx*Ny
hx  <- 1/(Nx-1)
hy  <- 1/(Ny-1)

# System matrices
Dmat <- hx*hy*Laplacian2D(1,1,36,36)
dvec <- -theta*(hx*hy)*rep(1,N)
Amat <- t(diag(N))
bvec <- matrix(z,N,1,byrow=FALSE)

# call the solver
sol  <- solve.QP(Dmat, dvec, Amat, bvec)

# extract and reshape the solution
tent <- matrix(sol\$solution,nrow(z),ncol(z), byrow=FALSE)


Now we plot the solution:

#plot
open3d()
x <- (1:Nx)/Nx
y <- (1:Ny)/Ny
rgl.surface(x, y , z, color='blue',alpha=.75, smooth=TRUE)
rgl.surface(x, y , tent, color='red', front='lines', back='lines')
rgl.bg(color="white")