# Fixing non positive definite correlation matrices using R

**a modeler's tribulations, gopi goteti's web log**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

# Problem

When a correlation or covariance matrix is not positive definite (i.e., in instances when some or all eigenvalues are negative), a cholesky decomposition cannot be performed. Sometimes, these eigenvalues are very small negative numbers and occur due to rounding or due to noise in the data. In simulation studies a known/given correlation has to be imposed on an input dataset. In such cases one has to deal with the issue of making a correlation matrix positive definite. Following are papers in the field of stochastic precipitation where such matrices are used.

- FP Brissette, M Khalili, R Leconte, Journal of Hydrology, 2007, “Efficient stochastic generation of multi-site synthetic precipitation data”
- GA Baigorria, JW Jones, Journal of Climate, 2010, “GiST: A stochastic model for generating spatially and temporally correlated daily rainfall data”
- M Mhanna and W Bauwens, International Journal of Climatology, 2012, “A stochastic space-time model for the generation of daily rainfall in the Gaza Strip”

# A Solution

The paper by Rebonato and Jackel, “The most general methodology for creating a valid correlation matrix for risk management and option pricing purposes”, Journal of Risk, Vol 2, No 2, 2000, presents a methodology to create a positive definite matrix out of a non-positive definite matrix.

Below is my attempt to reproduce the example from Rebonato and Jackel (2000). The correlation matrix below is from the example.

```
origMat <- array(c(1, 0.9, 0.7, 0.9, 1, 0.3, 0.7, 0.3, 1), dim = c(3,
3))
origEig <- eigen(origMat)
origEig$values
```

```
## [1] 2.296728 0.710625 -0.007352
```

As you can see, the third eigenvalue is negative. Trying a cholesky decomposition on this matrix fails, as expected.

```
cholStatus <- try(u <- chol(origMat), silent = FALSE)
cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE)
```

Here, I use the method of Rebonato and Jackel (2000), as elaborated by Brissette et al. (2007), to fix the correlation matrix. As per the method, replace the negative eigenvalues with 0 (or a small positive number as Brissette et al. 2007 suggest), then normalize the new vector.

```
# fix the correl matrix
newMat <- origMat
iter <- 0
while (cholError) {
iter <- iter + 1
cat("iteration ", iter, "\n")
# replace -ve eigen values with small +ve number
newEig <- eigen(newMat)
newEig2 <- ifelse(newEig$values < 0, 0, newEig$values)
# create modified matrix eqn 5 from Brissette et al 2007, inv = transp for
# eig vectors
newMat <- newEig$vectors %*% diag(newEig2) %*% t(newEig$vectors)
# normalize modified matrix eqn 6 from Brissette et al 2007
newMat <- newMat/sqrt(diag(newMat) %*% t(diag(newMat)))
# try chol again
cholStatus <- try(u <- chol(newMat), silent = TRUE)
cholError <- ifelse(class(cholStatus) == "try-error", TRUE, FALSE)
}
```

```
## iteration 1
## iteration 2
```

```
# final check
eigen(newMat)$values
```

```
## [1] 2.290e+00 7.096e-01 -1.332e-15
```

# Unresolved Issue(?)

However, as you can see, the third eigenvalue is still negative (but very close to zero). The “chol” function in R is not giving an error probably because this negative eigenvalue is within the “tolerance limits”. I would like to know what these “tolerance limits” are.

**leave a comment**for the author, please follow the link and comment on their blog:

**a modeler's tribulations, gopi goteti's web log**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.