# Notes on Multivariate Gaussian Quadrature (with R Code)

September 25, 2015
By

(This article was first published on BioStatMatt » R, and kindly contributed to R-bloggers)

Statisticians often need to integrate some function with respect to the multivariate normal (Gaussian) distribution, for example, to compute the standard error of a statistic, or the likelihood function in of a mixed effects model. In many (most?) useful cases, these integrals are intractable, and must be approximated using computational methods. Monte-Carlo integration is one such method; a stochastic method, but its computation can be prohibitively expensive, especially when the integral is computed many times.

Quadrature methods, which are deterministic rather than stochastic, are another set of methods that can be less computationally expensive, especially for lower-dimension integrals. The main idea is to approximate the integral as a weighted summation (the Monte-Carlo method uses an unweighted summation), where the integrand is evaluated on a grid of points selected from the domain of integration. The weights and points are carefully selected to approximate the integral. Gauss-Hermite quadrature is a well-known method for selecting the weights and points for integrals involving the univariate normal distribution. The details of selecting weights and points is complicated, and involves finding the roots of Hermite polynomials (see with Wikipedia link above for details). Fortunately, there already exists some R code (extracted from the ecoreg package; see the hermite and gauss.hermite functions below) that implements this.

There are natural extensions of univariate Gaussian quadrature for integrals involving the multivariate normal distribution. Peter Jäckel has written a great, short, accessible article about this, and some of the figures below look very similar to those in the article. The extension to multivariate integrals is based on the idea of creating an M-dimensional grid of points by expanding the univariate grid of Gauss-Hermite quadrature points, and then rotating, scaling, and translating those points according to the mean vector and variance-covariance matrix of the multivariate normal distribution over which the integral is calculated (see the mgauss.hermite function below, with comments). The weights of the M-variate quadrature points are the product of the corresponding M univariate weights. The following code block lists three functions, where the first two compute the Gauss-Hermite quadrature weights and points in one dimension, and the last computes the weights and points for multivariate Gaussian quadrature.

## perform quadrature of multivariate normal

## compute Gauss-Hermite quadrature points and weights
## for a one-dimensional integral.
## points -- number of points
## interlim -- maximum number of Newton-Raphson iterations

hermite <- function (points, z) {
p1 <- 1/pi^0.4
p2 <- 0
for (j in 1:points) {
p3 <- p2
p2 <- p1
p1 <- z * sqrt(2/j) * p2 - sqrt((j - 1)/j) * p3
}
pp <- sqrt(2 * points) * p2
c(p1, pp)
}

gauss.hermite <- function (points, iterlim = 50) {
x <- w <- rep(0, points)
m <- (points + 1)/2
for (i in 1:m) {
z <- if (i == 1)
sqrt(2 * points + 1) - 2 * (2 * points + 1)^(-1/6)
else if (i == 2)
z - sqrt(points)/z
else if (i == 3 || i == 4)
1.9 * z - 0.9 * x[i - 2]
else 2 * z - x[i - 2]
for (j in 1:iterlim) {
z1 <- z
p <- hermite(points, z)
z <- z1 - p[1]/p[2]
if (abs(z - z1) <= 1e-15)
break
}
if (j == iterlim)
warning("iteration limit exceeded")
x[points + 1 - i] <- -(x[i] <- z)
w[i] <- w[points + 1 - i] <- 2/p[2]^2
}
r <- cbind(x * sqrt(2), w/sum(w))
colnames(r) <- c("Points", "Weights")
r
}

## compute multivariate Gaussian quadrature points
## n     - number of points each dimension before pruning
## mu    - mean vector
## sigma - covariance matrix
## prune - NULL - no pruning; [0-1] - fraction to prune
mgauss.hermite <- function(n, mu, sigma, prune=NULL) {
if(!all(dim(sigma) == length(mu)))
stop("mu and sigma have nonconformable dimensions")

dm  <- length(mu)
gh  <- gauss.hermite(n)
#idx grows exponentially in n and dm
idx <- as.matrix(expand.grid(rep(list(1:n),dm)))
pts <- matrix(gh[idx,1],nrow(idx),dm)
wts <- apply(matrix(gh[idx,2],nrow(idx),dm), 1, prod)

## prune
if(!is.null(prune)) {
qwt <- quantile(wts, probs=prune)
pts <- pts[wts > qwt,]
wts <- wts[wts > qwt]
}

## rotate, scale, translate points
eig <- eigen(sigma)
rot <- eig$vectors %*% diag(sqrt(eig$values))
pts <- t(rot %*% t(pts) + mu)
return(list(points=pts, weights=wts))
}

For some of the M-variate points, the weights are very small, and thus contribute very little to the integral. The notion of ‘pruning’ can be used to eliminate those points with very small weights. The mgauss.hermite function does this by trimming a specified fraction of the smallest weights (I’ve found that pruning 20% works well). In two dimensions, when the variance of each variable is 1.0 and correlation 0.5, the pruned points look as follows, where the point diameter is monotonic in the corresponding weight:

sig <- matrix(c(1,0.5,0.5,1),2,2)
pts <- mgauss.hermite(10, mu=c(0,0), sigma=sig, prune=0.2)
plot(pts$points, cex=-5/log(pts$weights), pch=19,
xlab=expression(x[1]),
ylab=expression(x[2]))

Computing a 2D integral with these points would require 80 evaluations of the integrand (note that there were originally 10 points in each dimension, or 100 points total, but by pruning were reduced to 80). Now, the real question is whether integrating with such points and weights can achieve a similar or better result than a same-sized (or perhaps even much larger) Monte-Carlo method. The following three sections compare these methods (and additionally the delta method, in the last section) in computing means and variances, probabilities, and the standard error of an unusual statistic. Probabilities are an interesting case because of their discreteness, and computing standard errors is, obviously, an important application of quadrature.

### Computing Means and Variance-Covariances

The true mean vector is zero, and the true variances and covariance are one and one-half, respectively. The quadrature method is the winner here:

library(mvtnorm); set.seed(42)
x80   <- rmvnorm(80, sigma=sig)
x1000 <- rmvnorm(1000, sigma=sig)

### Means
## quadrature with 80 points
colSums(pts$points * pts$weights)
## [1] -6.140989e-21  1.291725e-20

## Monte-Carlo with 80 points
colMeans(x80)
## [1] -0.06886731 -0.03477292

## Monte-Carlo with 1000 points
colMeans(x1000)
## [1] -0.02371047 -0.01133503

### Variances
## quadrature with 80 points
cov.wt(pts$points, wt=pts$weights, method="ML")$cov ## [,1] [,2] ## [1,] 0.9999904 0.4999952 ## [2,] 0.4999952 0.9999904 ## Monte-Carlo with 80 points cov(x80) ## [,1] [,2] ## [1,] 0.9838169 0.4958186 ## [2,] 0.4958186 1.0174029 ## Monte-Carlo with 1000 points cov(x1000) ## [,1] [,2] ## [1,] 1.0083872 0.4938198 ## [2,] 0.4938198 0.9727271 ### Computing Probabilities Computing probabilities is the same as using an indicator functions as the integrand in this context, which are obviously much more discrete than the integrand for means. It looks like the Monte-Carlo methods may be superior in computing such quantities. For the first probability, the true value is 1/3; for the second, the true value is 1/20: ### Probabilities ## P(x1<0, x2<0) = 1/3 ## quadrature with 80 points gfun <- function(x) prod(x<0) sum(apply(pts$points, 1, gfun) * pts$weights) ## [1] 0.3927074 ## Monte-Carlo with 80 points mean(apply(x80, 1, gfun)) ## [1] 0.3625 ## Monte-Carlo with 1000 points mean(apply(x1000, 1, gfun)) ## [1] 0.336 ## P(x1    Computing Standard Errors for Unusual Statistics Consider a model described by vector of parameters, and an estimator that has an approximate multivariate normal distribution. This is often the case, for example, with ordinary least-squares and maximum likelihood estimators. As an example, consider the one-compartment pharmacokinetic model with first-order elimination and intravenous bolus injection. The cfun function below gives the concentration of drug as a function of time following an IV bolus. The tfun function computes the time at which the concentration reaches a given level. As our statistic of interest, consider the amount of time in which the concentration remains above 0.064 g/L, which is four-fold the minimum inhibitory concentration (MIC) of piperacillin, an antibiotic. The figure below illustrates the time-course of piperacillin concentration for a typical patient after a 3g IV bolus. ## drug concentration after bolus injection ## one-compartment model cfun <- function(t,v,k,dose=3) dose/v*exp(-k*t) ## compute time at which concentration reaches x tfun <- function(x,v,k,dose=3) -log(x*v/dose)/k ## compute time at which concentration reaches 0.064 ## given parameters p=c(v,k) gfun <- function(p) tfun(0.064, p[1], p[2]) ## from a recent PK study on piperacillin log_mu <- c(log_v=3.444, log_k=-2.036) log_sig <- structure(c(0.0033, -0.0022, -0.0022, 0.0034), .Dimnames = rep(list(c("log_v", "log_k")),2), .Dim = c(2L, 2L)) curve(cfun(x, v=exp(log_mu[1]), k=exp(log_mu[2])), from=0, to=8, n=300, xlab="Time (h)", ylab="Concentration (g/L)") lines(x=c(-10,rep(gfun(exp(log_mu)),2)), y=c(rep(0.064,2), 0), lty=2) legend("topright", bty="n", legend=paste0("Time to 0.064 g/L: ", round(gfun(exp(log_mu)),1), "h")) Since the amount of time in which the concentration remains above 0.064 g/L is a function of the model parameters, sampling variability in the parameter estimates propagate, and thus we can compute a standard error. The “true” standard error is approximately 0.1233, which was computed (ironically) using the Monte Carlo method with 10M sample points. Below, the standard error is approximated using quadrature with 80 points, Monte Carlo with 80 and 1000 points, and the delta method. Each of the methods perform fairly well here. However, this is a fairly ‘smooth’ statistic. In fact, my motivation for this little experiment is to lay the groundwork to examine a more complex statistic: the probabilities of pharmacokinetic target attainment in a population. These statistics are usually calculated using a Monte Carlo method (“Monte Carlo Simulation” or MCS, in the antibiotic literature), and are thus somewhat discrete. That is, even for large MCSs, the numerical delta method (i.e., where the gradient is computed numerically) can fail miserably. ## quadrature with 80 points pts <- mgauss.hermite(n=10, mu=log_mu, sigma=log_sig, prune=0.2) cov.wt(matrix(apply(exp(pts$points), 1, gfun), nrow(pts$points),1), pts$weights, method="ML")\$cov ## [,1] ## [1,] 0.1232139 ## Monte-Carlo with 80 points var(apply(exp(rmvnorm(80, mean=log_mu, sigma=log_sig)), 1, gfun)) ## [1] 0.123917 ## Monte-Carlo with 1000 points var(apply(exp(rmvnorm(1e3, mean=log_mu, sigma=log_sig)), 1, gfun)) ## [1] 0.1219701 ## delta method rho <- list2env(list(log_v=log_mu[1],log_k=log_mu[2],x=0.064)) nd <- attr(numericDeriv(quote(tfun(x,exp(log_v),exp(log_k))), theta=c('log_v','log_k'),rho=rho), 'gradient') nd %*% log_sig %*% t(nd) ## [,1] ## [1,] 0.121936 (function () { var script = document.createElement("script"); script.type = "text/javascript"; script.src = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"; document.getElementsByTagName("head")[0].appendChild(script); })(); var vglnk = { key: '949efb41171ac6ec1bf7f206d57e90b8' }; (function(d, t) { var s = d.createElement(t); s.type = 'text/javascript'; s.async = true; s.src = '//cdn.viglink.com/api/vglnk.js'; var r = d.getElementsByTagName(t)[0]; r.parentNode.insertBefore(s, r); }(document, 'script')); Related ShareTweet To leave a comment for the author, please follow the link and comment on their blog: BioStatMatt » R. R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more... If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook... 
 
 Comments are closed. 
 Search R-bloggers (function() { var cx = '005359090438081006639:paz69t-s8ua'; var gcse = document.createElement('script'); gcse.type = 'text/javascript'; gcse.async = true; gcse.src = 'https://cse.google.com/cse.js?cx=' + cx; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(gcse, s); })(); Recent popular posts future.apply - Parallelize Any Base R Apply Function Let R/Python send messages when the algorithms are done training A primer in using Java from R - part 1 Forecasting my weight with R Why R 2018 Winners Most visited articles of the week How to write the first for loop in R Installing R packages Using apply, sapply, lapply in R R – Sorting a data frame by the contents of a column How to perform a Logistic Regression in R How to Make a Histogram with Basic R Tutorials for learning R How to Make a Histogram with ggplot2 Creating Slopegraphs with R Sponsors // https://support.cloudflare.com/hc/en-us/articles/200169436-How-can-I-have-Rocket-Loader-ignore-my-script-s-in-Automatic-Mode- // this must be placed higher. Otherwise it doesn't work. // data-cfasync="false" is for making sure cloudflares' rocketcache doesn't interfeare with this // in this case it only works because it was used at the original script in the text widget function createCookie(name,value,days) { var expires = ""; if (days) { var date = new Date(); date.setTime(date.getTime() + (days*24*60*60*1000)); expires = "; expires=" + date.toUTCString(); } document.cookie = name + "=" + value + expires + "; path=/"; } function readCookie(name) { var nameEQ = name + "="; var ca = document.cookie.split(';'); for(var i=0;i < ca.length;i++) { var c = ca[i]; while (c.charAt(0)==' ') c = c.substring(1,c.length); if (c.indexOf(nameEQ) == 0) return c.substring(nameEQ.length,c.length); } return null; } function eraseCookie(name) { createCookie(name,"",-1); } function readTextFile(file) { // Helps people browse between pages without the need to keep downloading the same // ads txt page everytime. This way, it allows them to use their browser's cache. var random_number = readCookie("ad_random_number_cookie"); if(random_number == null) { var random_number = Math.floor(Math.random()*100*(new Date().getTime()/10000000000)); createCookie("ad_random_number_cookie",random_number,1) } file += '?t='+random_number; var rawFile = new XMLHttpRequest(); rawFile.onreadystatechange = function () { if(rawFile.readyState === 4) { if(rawFile.status === 200 || rawFile.status == 0) { // var allText = rawFile.responseText; // document.write(allText); document.write(rawFile.responseText); } } } rawFile.open("GET", file, false); rawFile.send(null); } // readTextFile('https://raw.githubusercontent.com/Raynos/file-store/master/temp.txt'); readTextFile("https://www.r-bloggers.com/wp-content/uploads/text-widget_anti-cache.txt"); Jobs for R usersR Developerpostdoc in psychiatry: machine learning in human genomicsLead Quantitative DeveloperResearch Data Analyst @ Arlington, Virginia, U.S.Market Research Analyst: Mobility for RSGData Scientist @ New Delhi, IndiaData Scientist/Programmer @ Milwaukee, Wisconsin, U.S. Full list of contributing R-bloggers 
 R-bloggers was founded by Tal Galili, with gratitude to the R community. Is powered by WordPress using a bavotasan.com design. Copyright © 2018 R-bloggers. All Rights Reserved. Terms and Conditions for this website var snp_f = []; var snp_hostname = new RegExp(location.host); var snp_http = new RegExp("^(http|https)://", "i"); var snp_cookie_prefix = ''; var snp_separate_cookies = false; var snp_ajax_url = 'https://www.r-bloggers.com/wp-admin/admin-ajax.php'; var snp_ajax_nonce = 'c22bda3084'; var snp_ignore_cookies = false; var snp_enable_analytics_events = false; var snp_enable_mobile = false; var snp_use_in_all = false; var snp_excluded_urls = []; snp_excluded_urls.push(''); Never miss an update! Subscribe to R-bloggers to receive e-mails with the latest R posts. (You will not see this message again.) Click here to close (This popup will not appear again) .snp-pop-109583 .snp-theme6 { max-width: 700px;} .snp-pop-109583 .snp-theme6 h1 {font-size: 17px;} .snp-pop-109583 .snp-theme6 { color: #a0a4a9;} .snp-pop-109583 .snp-theme6 .snp-field ::-webkit-input-placeholder { color: #a0a4a9;} .snp-pop-109583 .snp-theme6 .snp-field :-moz-placeholder { color: #a0a4a9;} .snp-pop-109583 .snp-theme6 .snp-field :-ms-input-placeholder { color: #a0a4a9;} .snp-pop-109583 .snp-theme6 .snp-field input { border: 1px solid #a0a4a9;} .snp-pop-109583 .snp-theme6 .snp-field { color: #000000;} .snp-pop-109583 .snp-theme6 { background: #f2f2f2;} jQuery(document).ready(function() { }); var CaptchaCallback = function() { jQuery('.g-recaptcha').each(function(index, el) { grecaptcha.render(el, { 'sitekey' : '' }); }); }; (function(){ var corecss = document.createElement('link'); var themecss = document.createElement('link'); var corecssurl = "https://www.r-bloggers.com/wp-content/plugins/syntaxhighlighter/syntaxhighlighter3/styles/shCore.css?ver=3.0.9b"; if ( corecss.setAttribute ) { corecss.setAttribute( "rel", "stylesheet" ); corecss.setAttribute( "type", "text/css" ); corecss.setAttribute( "href", corecssurl ); } else { corecss.rel = "stylesheet"; corecss.href = corecssurl; } document.getElementsByTagName("head")[0].insertBefore( corecss, document.getElementById("syntaxhighlighteranchor") ); var themecssurl = "https://www.r-bloggers.com/wp-content/plugins/syntaxhighlighter/syntaxhighlighter3/styles/shThemeDefault.css?ver=3.0.9b"; if ( themecss.setAttribute ) { themecss.setAttribute( "rel", "stylesheet" ); themecss.setAttribute( "type", "text/css" ); themecss.setAttribute( "href", themecssurl ); } else { themecss.rel = "stylesheet"; themecss.href = themecssurl; } //document.getElementById("syntaxhighlighteranchor").appendChild(themecss); document.getElementsByTagName("head")[0].insertBefore( themecss, document.getElementById("syntaxhighlighteranchor") ); })(); SyntaxHighlighter.config.strings.expandSource = '+ expand source'; SyntaxHighlighter.config.strings.help = '?'; SyntaxHighlighter.config.strings.alert = 'SyntaxHighlighter\n\n'; SyntaxHighlighter.config.strings.noBrush = 'Can\'t find brush for: '; SyntaxHighlighter.config.strings.brushNotHtmlScript = 'Brush wasn\'t configured for html-script option: '; SyntaxHighlighter.defaults['pad-line-numbers'] = false; SyntaxHighlighter.defaults['toolbar'] = false; SyntaxHighlighter.all(); _stq = window._stq || []; _stq.push([ 'view', {v:'ext',j:'1:6.2.1',blog:'11524731',post:'105973',tz:'-6',srv:'www.r-bloggers.com'} ]); _stq.push([ 'clickTrackerInit', '11524731', '105973' ]); /* <![CDATA[ */ jQuery(function(){ jQuery("ul.sf-menu").supersubs({ minWidth: 12, maxWidth: 27, extraWidth: 1 }).superfish({ delay: 100, speed: 250 }); }); /* ]]> */