# Notes on Multivariate Gaussian Quadrature (with R Code)

September 25, 2015
By

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't. Statisticians often need to integrate some function with respect to the multivariate normal (Gaussian) distribution, for example, to compute the standard error of a statistic, or the likelihood function in of a mixed effects model. In many (most?) useful cases, these integrals are intractable, and must be approximated using computational methods. Monte-Carlo integration is one such method; a stochastic method, but its computation can be prohibitively expensive, especially when the integral is computed many times.

Quadrature methods, which are deterministic rather than stochastic, are another set of methods that can be less computationally expensive, especially for lower-dimension integrals. The main idea is to approximate the integral as a weighted summation (the Monte-Carlo method uses an unweighted summation), where the integrand is evaluated on a grid of points selected from the domain of integration. The weights and points are carefully selected to approximate the integral. Gauss-Hermite quadrature is a well-known method for selecting the weights and points for integrals involving the univariate normal distribution. The details of selecting weights and points is complicated, and involves finding the roots of Hermite polynomials (see with Wikipedia link above for details). Fortunately, there already exists some R code (extracted from the ecoreg package; see the hermite and gauss.hermite functions below) that implements this.

There are natural extensions of univariate Gaussian quadrature for integrals involving the multivariate normal distribution. Peter Jäckel has written a great, short, accessible article about this, and some of the figures below look very similar to those in the article. The extension to multivariate integrals is based on the idea of creating an M-dimensional grid of points by expanding the univariate grid of Gauss-Hermite quadrature points, and then rotating, scaling, and translating those points according to the mean vector and variance-covariance matrix of the multivariate normal distribution over which the integral is calculated (see the mgauss.hermite function below, with comments). The weights of the M-variate quadrature points are the product of the corresponding M univariate weights. The following code block lists three functions, where the first two compute the Gauss-Hermite quadrature weights and points in one dimension, and the last computes the weights and points for multivariate Gaussian quadrature.

## perform quadrature of multivariate normal

## compute Gauss-Hermite quadrature points and weights
## for a one-dimensional integral.
## points -- number of points
## interlim -- maximum number of Newton-Raphson iterations

hermite <- function (points, z) {
p1 <- 1/pi^0.4
p2 <- 0
for (j in 1:points) {
p3 <- p2
p2 <- p1
p1 <- z * sqrt(2/j) * p2 - sqrt((j - 1)/j) * p3
}
pp <- sqrt(2 * points) * p2
c(p1, pp)
}

gauss.hermite <- function (points, iterlim = 50) {
x <- w <- rep(0, points)
m <- (points + 1)/2
for (i in 1:m) {
z <- if (i == 1)
sqrt(2 * points + 1) - 2 * (2 * points + 1)^(-1/6)
else if (i == 2)
z - sqrt(points)/z
else if (i == 3 || i == 4)
1.9 * z - 0.9 * x[i - 2]
else 2 * z - x[i - 2]
for (j in 1:iterlim) {
z1 <- z
p <- hermite(points, z)
z <- z1 - p/p
if (abs(z - z1) <= 1e-15)
break
}
if (j == iterlim)
warning("iteration limit exceeded")
x[points + 1 - i] <- -(x[i] <- z)
w[i] <- w[points + 1 - i] <- 2/p^2
}
r <- cbind(x * sqrt(2), w/sum(w))
colnames(r) <- c("Points", "Weights")
r
}

## compute multivariate Gaussian quadrature points
## n     - number of points each dimension before pruning
## mu    - mean vector
## sigma - covariance matrix
## prune - NULL - no pruning; [0-1] - fraction to prune
mgauss.hermite <- function(n, mu, sigma, prune=NULL) {
if(!all(dim(sigma) == length(mu)))
stop("mu and sigma have nonconformable dimensions")

dm  <- length(mu)
gh  <- gauss.hermite(n)
#idx grows exponentially in n and dm
idx <- as.matrix(expand.grid(rep(list(1:n),dm)))
pts <- matrix(gh[idx,1],nrow(idx),dm)
wts <- apply(matrix(gh[idx,2],nrow(idx),dm), 1, prod)

## prune
if(!is.null(prune)) {
qwt <- quantile(wts, probs=prune)
pts <- pts[wts > qwt,]
wts <- wts[wts > qwt]
}

## rotate, scale, translate points
eig <- eigen(sigma)
rot <- eig$vectors %*% diag(sqrt(eig$values))
pts <- t(rot %*% t(pts) + mu)
return(list(points=pts, weights=wts))
}

For some of the M-variate points, the weights are very small, and thus contribute very little to the integral. The notion of ‘pruning’ can be used to eliminate those points with very small weights. The mgauss.hermite function does this by trimming a specified fraction of the smallest weights (I’ve found that pruning 20% works well). In two dimensions, when the variance of each variable is 1.0 and correlation 0.5, the pruned points look as follows, where the point diameter is monotonic in the corresponding weight:

sig <- matrix(c(1,0.5,0.5,1),2,2)
pts <- mgauss.hermite(10, mu=c(0,0), sigma=sig, prune=0.2)
plot(pts$points, cex=-5/log(pts$weights), pch=19,
xlab=expression(x),
ylab=expression(x)) Computing a 2D integral with these points would require 80 evaluations of the integrand (note that there were originally 10 points in each dimension, or 100 points total, but by pruning were reduced to 80). Now, the real question is whether integrating with such points and weights can achieve a similar or better result than a same-sized (or perhaps even much larger) Monte-Carlo method. The following three sections compare these methods (and additionally the delta method, in the last section) in computing means and variances, probabilities, and the standard error of an unusual statistic. Probabilities are an interesting case because of their discreteness, and computing standard errors is, obviously, an important application of quadrature.

### Computing Means and Variance-Covariances

The true mean vector is zero, and the true variances and covariance are one and one-half, respectively. The quadrature method is the winner here:

library(mvtnorm); set.seed(42)
x80   <- rmvnorm(80, sigma=sig)
x1000 <- rmvnorm(1000, sigma=sig)

### Means
colSums(pts$points * pts$weights)
##  -6.140989e-21  1.291725e-20

## Monte-Carlo with 80 points
colMeans(x80)
##  -0.06886731 -0.03477292

## Monte-Carlo with 1000 points
colMeans(x1000)
##  -0.02371047 -0.01133503

### Variances