[This article was first published on a modeler's tribulations, gopi goteti's web log, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# Problem

When a correlation or covariance matrix is not positive definite (i.e., in instances when some or all eigenvalues are negative), a cholesky decomposition cannot be performed. Sometimes, these eigenvalues are very small negative numbers and occur due to rounding or due to noise in the data. In simulation studies a known/given correlation has to be imposed on an input dataset. In such cases one has to deal with the issue of making a correlation matrix positive definite. Following are papers in the field of stochastic precipitation where such matrices are used.

• FP Brissette, M Khalili, R Leconte, Journal of Hydrology, 2007, “Efficient stochastic generation of multi-site synthetic precipitation data”
• GA Baigorria, JW Jones, Journal of Climate, 2010, “GiST: A stochastic model for generating spatially and temporally correlated daily rainfall data”
• M Mhanna and W Bauwens, International Journal of Climatology, 2012, “A stochastic space-time model for the generation of daily rainfall in the Gaza Strip”

# A Solution

The paper by Rebonato and Jackel, “The most general methodology for creating a valid correlation matrix for risk management and option pricing purposes”, Journal of Risk, Vol 2, No 2, 2000, presents a methodology to create a positive definite matrix out of a non-positive definite matrix.

Below is my attempt to reproduce the example from Rebonato and Jackel (2000). The correlation matrix below is from the example.

```origMat <- array(c(1, 0.9, 0.7, 0.9, 1, 0.3, 0.7, 0.3, 1), dim = c(3,
3))

origEig <- eigen(origMat)
origEig\$values```

`##   2.296728  0.710625 -0.007352`

As you can see, the third eigenvalue is negative. Trying a cholesky decomposition on this matrix fails, as expected.

```cholStatus <- try(u <- chol(origMat), silent = FALSE)
cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE)```

Here, I use the method of Rebonato and Jackel (2000), as elaborated by Brissette et al. (2007), to fix the correlation matrix. As per the method, replace the negative eigenvalues with 0 (or a small positive number as Brissette et al. 2007 suggest), then normalize the new vector.

```# fix the correl matrix
newMat <- origMat

iter <- 0
while (cholError) {

iter <- iter + 1
cat("iteration ", iter, "\n")

# replace -ve eigen values with small +ve number
newEig <- eigen(newMat)
newEig2 <- ifelse(newEig\$values < 0, 0, newEig\$values)

# create modified matrix eqn 5 from Brissette et al 2007, inv = transp for
# eig vectors
newMat <- newEig\$vectors %*% diag(newEig2) %*% t(newEig\$vectors)

# normalize modified matrix eqn 6 from Brissette et al 2007
newMat <- newMat/sqrt(diag(newMat) %*% t(diag(newMat)))

# try chol again
cholStatus <- try(u <- chol(newMat), silent = TRUE)
cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE)
}```

```## iteration  1
## iteration  2```

```# final check
eigen(newMat)\$values```

`##   2.290e+00  7.096e-01 -1.332e-15`

# Unresolved Issue(?)

However, as you can see, the third eigenvalue is still negative (but very close to zero). The “chol” function in R is not giving an error probably because this negative eigenvalue is within the “tolerance limits”. I would like to know what these “tolerance limits” are.