Most of you are probably familiar with the covariance matrix. Its lesser-known sibling, the semicovariance matrix, might be new to you, however. The semicovariance matrix is much like a covariance matrix, except that it only accounts for the variability below a benchmark chosen by the investor (e.g. negative returns, returns below the risk-free rate, or any other cut-off).

Why would you want to use it? Well, if you think about it, it’s the most natural thing to do: why treat all variability as risk, when it is only the possibility of incurring losses (or at least of earning less than expected) that worries you? Returns higher than the mean increase volatility, but the possibility of achieving them can hardly be classified as risk. Therefore, it makes more sense to measure risk not with the standard deviation (the square root of the variance), but with the downside deviation (the square root of the semivariance). If asset returns are symmetrically distributed, targeting either measure of risk makes no difference (well, technically, that’s not always completely true, but let’s make this almost innocuous assumption for the sake of simplicity). That’s the logic behind strategies that target downside risk: using the semicovariance matrix instead of the covariance matrix turns mean-variance optimization into mean-semivariance optimization, which is one way to target downside risk.
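To make the difference between the two risk measures concrete, here is a minimal sketch in R. The return series `r` and the benchmark `B = 0` are made up purely for illustration, and both measures divide by the number of observations rather than by T - 1. Note that the standard deviation measures dispersion around the mean, while the downside deviation only looks at shortfalls below `B`.

```r
# Hypothetical periodic returns, invented for this example
r <- c(0.04, -0.02, 0.07, -0.05, 0.01, 0.03, -0.01, 0.06)
B <- 0  # assumed benchmark: any negative return counts as downside

# Standard deviation (dividing by T): every deviation from the mean counts
sd_all <- sqrt(mean((r - mean(r))^2))

# Downside deviation: only shortfalls below B contribute, the rest become 0
shortfall <- pmin(r - B, 0)
downside_dev <- sqrt(mean(shortfall^2))

print(c(standard_deviation = sd_all, downside_deviation = downside_dev))
```

With this sample, the gains above the benchmark inflate the standard deviation but leave the downside deviation untouched.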

However, the semicovariance matrix suffers from one problem: endogeneity. It is computed from the periods in which the portfolio underperforms the benchmark, but this set of periods itself depends on the portfolio weights, so any change in the weights changes the elements of the semicovariance matrix! This makes optimization problems that use the semicovariance matrix intractable. Luckily, there is a way around this problem. Estrada (2008, see the bibliography) describes a simple solution that consists of computing an approximate, exogenous semicovariance matrix by looking at when the individual assets, rather than the portfolio as a whole, underperform the benchmark. The paper also describes in much greater detail what a semicovariance matrix is, so you should definitely take a look.
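In formulas, the endogeneity problem and the workaround can be sketched as follows. The notation is an assumption made here for illustration: $R_{it}$ is the return of asset $i$ in period $t$, $w_i$ its portfolio weight, $B$ the benchmark, $T$ the number of periods and $n$ the number of assets. In the first expression the weights sit inside the $\min$, so the periods that count as downside change with $w$; in the second, each asset is compared with $B$ on its own, so the matrix no longer depends on $w$.

```latex
% Exact portfolio semivariance: the min(.) depends on the weights w
\Sigma_{pB}^{2}(w) = \frac{1}{T} \sum_{t=1}^{T}
  \left[ \min\!\left( \sum_{i=1}^{n} w_i R_{it} - B,\; 0 \right) \right]^{2}

% Asset-level (exogenous) approximation of the semicovariance: no dependence on w
\Sigma_{ijB} = \frac{1}{T} \sum_{t=1}^{T}
  \min(R_{it} - B,\, 0)\,\min(R_{jt} - B,\, 0)

% Approximate portfolio semivariance built from the exogenous matrix
\Sigma_{pB}^{2}(w) \approx \sum_{i=1}^{n} \sum_{j=1}^{n} w_i\, w_j\, \Sigma_{ijB}
```

The last expression has the same quadratic form as the portfolio variance, which is why the usual mean-variance machinery can be reused.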

The question now is: how can you compute this matrix in R? There are no built-in functions in R that perform this task, so let me show you how to write one. Let’s call this function “scov”; the following code will give you what you need:

scov <- function(mydata, B) {
  # mydata = matrix of past returns (rows = periods, columns = assets); B = chosen benchmark
  D <- as.matrix(mydata) - B    # subtract the benchmark from the returns (also accepts a data frame)
  D <- pmin(D, 0)               # keep only the shortfalls: element-wise minimum of D and 0
  SM <- crossprod(D) / nrow(D)  # approximated semicovariance matrix: t(D) %*% D / number of periods
  return(SM)
}

Done! It wasn’t difficult, right? You can then use SM instead of the covariance matrix to compute a mean-semivariance or minimum-semivariance portfolio. Still, you might want to tread carefully, as this approximation of the semicovariance matrix is not always a good one.
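As a rough illustration of how SM can stand in for the covariance matrix, the sketch below computes unconstrained minimum-semivariance weights with the textbook minimum-variance formula, w = SM⁻¹1 / (1'SM⁻¹1). Everything in it is an assumption made for the example: the returns are simulated, the asset names are placeholders, the benchmark is 0, and short sales are allowed (so weights can be negative). The function is repeated in compact form only so that the snippet runs on its own.

```r
# Compact version of scov, repeated here so the example is self-contained
scov <- function(mydata, B) {
  D <- pmin(as.matrix(mydata) - B, 0)  # shortfalls below the benchmark
  crossprod(D) / nrow(D)               # t(D) %*% D / number of periods
}

set.seed(1)  # simulated returns, for illustration only
returns <- matrix(rnorm(120 * 3, mean = 0.01, sd = 0.05), ncol = 3,
                  dimnames = list(NULL, c("X", "Y", "Z")))

SM <- scov(returns, B = 0)

# Unconstrained minimum-semivariance weights: solve SM w = 1, then rescale to sum to 1
ones <- rep(1, ncol(SM))
w <- solve(SM, ones)
w <- w / sum(w)

# Approximated portfolio semivariance implied by SM and these weights
semivar_p <- drop(t(w) %*% SM %*% w)

print(round(w, 3))
print(semivar_p)
```

For a long-only or mean-semivariance portfolio you would instead pass SM to a quadratic programming solver in place of the covariance matrix.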

#### Bibliography

1. Estrada, J. (2008), Mean-Semivariance Optimization: A Heuristic Approach, Journal of Applied Finance, 18(1)
2. Cheremushkin, S. V. (2009), Why D-CAPM is a Big Mistake? The Incorrectness of the Cosemivariance Statistics, SSRN Electronic Journal