# Generating a multivariate gaussian distribution using RcppArmadillo

**Rcpp Gallery**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

There are many ways to simulate a multivariate gaussian distribution assuming that you can simulate from independent univariate normal distributions. One of the most popular method is based on the Cholesky decomposition. Let’s see how `Rcpp`

and `Armadillo`

perform on this task.

`#include `
// [[Rcpp::depends(RcppArmadillo)]]
using namespace Rcpp;
// [[Rcpp::export]]
arma::mat mvrnormArma(int n, arma::vec mu, arma::mat sigma) {
int ncols = sigma.n_cols;
arma::mat Y = arma::randn(n, ncols);
return arma::repmat(mu, 1, n).t() + Y * arma::chol(sigma);
}

The easiest way to perform a Cholesky distribution in R is to use the `chol`

function in `R`

which interface some fast `LAPACK`

routines.

```
### naive implementation in R
mvrnormR <- function(n, mu, sigma) {
ncols <- ncol(sigma)
mu <- rep(mu, each = n) ## not obliged to use a matrix (recycling)
mu + matrix(rnorm(n * ncols), ncol = ncols) %*% chol(sigma)
}
```

We will also use `MASS:mvrnorm`

which implements it differently:

```
require(MASS)
```

Loading required package: MASS

```
### Covariance matrix and mean vector
sigma <- matrix(c(1, 0.9, -0.3, 0.9, 1, -0.4, -0.3, -0.4, 1), ncol = 3)
mu <- c(10, 5, -3)
require(MASS)
### checking variance
set.seed(123)
cor(mvrnormR(100, mu, sigma))
```

[,1] [,2] [,3] [1,] 1.0000 0.8851 -0.3830 [2,] 0.8851 1.0000 -0.4675 [3,] -0.3830 -0.4675 1.0000

```
cor(MASS::mvrnorm(100, mu, sigma))
```

[,1] [,2] [,3] [1,] 1.0000 0.9106 -0.3016 [2,] 0.9106 1.0000 -0.4599 [3,] -0.3016 -0.4599 1.0000

```
cor(mvrnormArma(100, mu, sigma))
```

[,1] [,2] [,3] [1,] 1.000 0.9020 -0.3530 [2,] 0.902 1.0000 -0.4889 [3,] -0.353 -0.4889 1.0000

```
## checking means
colMeans(mvrnormR(100, mu, sigma))
```

[1] 9.850 4.911 -2.902

```
colMeans(MASS::mvrnorm(100, mu, sigma))
```

[1] 10.051 5.046 -2.914

```
colMeans(mvrnormArma(100, mu, sigma))
```

[1] 9.825 4.854 -2.873

Now, let’s benchmark the different versions:

```
require(rbenchmark)
```

Loading required package: rbenchmark

```
benchmark(mvrnormR(1e4, mu, sigma),
MASS::mvrnorm(1e4, mu, sigma),
mvrnormArma(1e4, mu, sigma),
columns = c('test', 'replications', 'relative', 'elapsed'),
order = 'elapsed')
```

test replications relative elapsed 3 mvrnormArma(10000, mu, sigma) 100 1.000 0.219 1 mvrnormR(10000, mu, sigma) 100 1.913 0.419 2 MASS::mvrnorm(10000, mu, sigma) 100 2.046 0.448

The `RcppArmadillo`

function outperforms the MASS implementation and the naive R code, but more surprisinugly `mvrnormR`

is slightly faster than `mvrnorm`

in this benchmark.

To be fair, while digging into the `MASS::mvrnorm`

code it appears that there are few code sanity checks ( such as the positive definiteness of `Sigma`

).

**leave a comment**for the author, please follow the link and comment on their blog:

**Rcpp Gallery**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.