# Fixing non positive definite correlation matrices using R

September 3, 2012
By

(This article was first published on a modeler's tribulations, and kindly contributed to R-bloggers)

# Problem

When a correlation or covariance matrix is not positive definite (i.e., in instances when some or all eigenvalues are negative), a cholesky decomposition cannot be performed. Sometimes, these eigenvalues are very small negative numbers and occur due to rounding or due to noise in the data. For example, see the paper by Rebonato and Jackel, “The most general methodology for creating a valid correlation matrix for risk management and option pricing purposes”, Journal of Risk, Vol 2, No 2, 2000.

If you have a hydrology backgrund, below are some papers where such matrices are used.

• FP Brissette, M Khalili, R Leconte, Journal of Hydrology, 2007, “Efficient stochastic generation of multi-site synthetic precipitation data”
• GA Baigorria, JW Jones, Journal of Climate, 2010, “GiST: A stochastic model for generating spatially and temporally correlated daily rainfall data”

# A Solution

Below is my attempt to reproduce the example from Rebonato and Jackel (2000). The correlation matrix below is from the example.

``````
origMat <- array(c(1, 0.9, 0.7, 0.9, 1, 0.3, 0.7, 0.3, 1), dim = c(3,
3))

origEig <- eigen(origMat)
origEig\$values
``````
``````## [1]  2.296728  0.710625 -0.007352
``````

As you can see, the third eigenvalue is negative. Trying a cholesky decomposition on this matrix fails, as expected.

``````cholStatus <- try(u <- chol(origMat), silent = FALSE)
cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE)
``````

Here, I use the method of Rebonato and Jackel (2000), as elaborated by Brissette et al. (2007), to fix the correlation matrix. As per the method, replace the negative eigenvalues with 0 (or a small positive number as Brissette et al. 2007 suggest), then normalize the new vector.

``````# fix the correl matrix
newMat <- origMat

iter <- 0
while (cholError) {

iter <- iter + 1
cat("iteration ", iter, "\n")

# replace -ve eigen values with small +ve number
newEig <- eigen(newMat)
newEig2 <- ifelse(newEig\$values < 0, 0, newEig\$values)

# create modified matrix eqn 5 from Brissette et al 2007, inv = transp for
# eig vectors
newMat <- newEig\$vectors %*% diag(newEig2) %*% t(newEig\$vectors)

# normalize modified matrix eqn 6 from Brissette et al 2007
newMat <- newMat/sqrt(diag(newMat) %*% t(diag(newMat)))

# try chol again
cholStatus <- try(u <- chol(newMat), silent = TRUE)
cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE)
}
``````
``````## iteration  1
## iteration  2
``````
``````
# final check
eigen(newMat)\$values
``````
``````## [1]  2.290e+00  7.096e-01 -1.332e-15
``````

# Unresolved Issue(?)

However, as you can see, the third eigenvalue is still negative (but very close to zero). The “chol” function in R is not giving an error probably because this negative eigenvalue is within the “tolerance limits”. I would like to know what these “tolerance limits” are.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...