# Fixing non positive definite correlation matrices using R

(This article was first published on a modeler's tribulations, gopi goteti's web log, and kindly contributed to R-bloggers)

# Problem

When a correlation or covariance matrix is not positive definite (i.e., in instances when some or all eigenvalues are negative), a cholesky decomposition cannot be performed. Sometimes, these eigenvalues are very small negative numbers and occur due to rounding or due to noise in the data. In simulation studies a known/given correlation has to be imposed on an input dataset. In such cases one has to deal with the issue of making a correlation matrix positive definite. Following are papers in the field of stochastic precipitation where such matrices are used.

• FP Brissette, M Khalili, R Leconte, Journal of Hydrology, 2007, “Efficient stochastic generation of multi-site synthetic precipitation data”
• GA Baigorria, JW Jones, Journal of Climate, 2010, “GiST: A stochastic model for generating spatially and temporally correlated daily rainfall data”
• M Mhanna and W Bauwens, International Journal of Climatology, 2012, “A stochastic space-time model for the generation of daily rainfall in the Gaza Strip”

# A Solution

The paper by Rebonato and Jackel, “The most general methodology for creating a valid correlation matrix for risk management and option pricing purposes”, Journal of Risk, Vol 2, No 2, 2000, presents a methodology to create a positive definite matrix out of a non-positive definite matrix.

Below is my attempt to reproduce the example from Rebonato and Jackel (2000). The correlation matrix below is from the example.


origMat <- array(c(1, 0.9, 0.7, 0.9, 1, 0.3, 0.7, 0.3, 1), dim = c(3,
3))

origEig <- eigen(origMat)
origEig$values  ## [1] 2.296728 0.710625 -0.007352  As you can see, the third eigenvalue is negative. Trying a cholesky decomposition on this matrix fails, as expected. cholStatus <- try(u <- chol(origMat), silent = FALSE) cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE)  Here, I use the method of Rebonato and Jackel (2000), as elaborated by Brissette et al. (2007), to fix the correlation matrix. As per the method, replace the negative eigenvalues with 0 (or a small positive number as Brissette et al. 2007 suggest), then normalize the new vector. # fix the correl matrix newMat <- origMat iter <- 0 while (cholError) { iter <- iter + 1 cat("iteration ", iter, "\n") # replace -ve eigen values with small +ve number newEig <- eigen(newMat) newEig2 <- ifelse(newEig$values < 0, 0, newEig$values) # create modified matrix eqn 5 from Brissette et al 2007, inv = transp for # eig vectors newMat <- newEig$vectors %*% diag(newEig2) %*% t(newEig$vectors) # normalize modified matrix eqn 6 from Brissette et al 2007 newMat <- newMat/sqrt(diag(newMat) %*% t(diag(newMat))) # try chol again cholStatus <- try(u <- chol(newMat), silent = TRUE) cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE) }  ## iteration 1 ## iteration 2   # final check eigen(newMat)$values

## [1]  2.290e+00  7.096e-01 -1.332e-15


# Unresolved Issue(?)

However, as you can see, the third eigenvalue is still negative (but very close to zero). The “chol” function in R is not giving an error probably because this negative eigenvalue is within the “tolerance limits”. I would like to know what these “tolerance limits” are.