Building diversified portfolios with R

[This article was first published on Revolutions, and kindly contributed to R-bloggers].

A common approach to reducing the risk of a financial portfolio is diversification. A portfolio whose components are all highly correlated with each other (one composed solely of financial stocks, for example) is risky: if a widespread crisis hits the banking sector, every component of the portfolio will tank at once. A way to avoid risks like this is to choose components that are as uncorrelated (or even anti-correlated) as possible: that way, if one sector tanks, the entire portfolio isn't brought down.
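To make the effect of correlation concrete, here is a minimal sketch in base R. The two assets, their 20% volatilities, and the 50/50 split are made-up numbers chosen only for illustration; the helper name portfolio.vol is likewise hypothetical.

```r
# Hypothetical inputs: two assets, each with 20% volatility, held 50/50.
sigma  <- c(0.20, 0.20)
weight <- c(0.5, 0.5)

# Portfolio volatility when the two assets have correlation rho.
portfolio.vol <- function(rho) {
  corr       <- matrix(c(1, rho, rho, 1), nrow = 2)
  assets.cov <- diag(sigma) %*% corr %*% diag(sigma)
  drop(sqrt(t(weight) %*% assets.cov %*% weight))
}

rhos <- c(1, 0.5, 0, -0.5)
vols <- sapply(rhos, portfolio.vol)
print(data.frame(correlation = rhos, volatility = round(vols, 4)))
```

With perfectly correlated assets the portfolio is exactly as volatile as its components; as the correlation falls, the same two assets combine into a steadily less volatile portfolio.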

The classical way to deal with this problem is Markowitz mean-variance portfolio optimization: for a given level of risk (say, 12%), find the portfolio that maximizes the expected return, given the historical correlations between the different potential components (treasury bonds, equities, index funds, commodities, etc.). Choosing a higher or lower level of risk will result in a different mix of components: generally more on the equities side for the higher risk levels, more in treasuries for the lower risk levels.
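As a rough illustration of the mean-variance idea, the sketch below brute-forces a grid of long-only portfolios in base R and picks the highest-return one within a risk budget. This is not how a production optimizer works (a quadratic-programming solver would normally be used); the three asset classes, their expected returns, volatilities and correlations are invented for the example, and best.for.risk is a hypothetical helper that assumes the target risk is attainable.

```r
# Hypothetical annualized inputs for three asset classes (illustrative only).
assets       <- c("bonds", "equities", "commodities")
assets.mu    <- c(0.03, 0.08, 0.05)   # expected returns
assets.sigma <- c(0.05, 0.18, 0.15)   # volatilities
assets.cor   <- matrix(c( 1.0, -0.2, 0.0,
                         -0.2,  1.0, 0.3,
                          0.0,  0.3, 1.0), nrow = 3, byrow = TRUE)
assets.cov   <- diag(assets.sigma) %*% assets.cor %*% diag(assets.sigma)

# Every long-only, fully invested weight vector on a 1% grid.
grid    <- expand.grid(w1 = 0:100, w2 = 0:100)
grid    <- grid[grid$w1 + grid$w2 <= 100, ]
weights <- cbind(grid$w1, grid$w2, 100 - grid$w1 - grid$w2) / 100

# Expected return and volatility of each candidate portfolio.
port.ret <- drop(weights %*% assets.mu)
port.vol <- sqrt(rowSums((weights %*% assets.cov) * weights))

# Highest-return candidate whose volatility does not exceed target.vol.
best.for.risk <- function(target.vol) {
  ok   <- which(port.vol <= target.vol)
  best <- ok[which.max(port.ret[ok])]
  c(setNames(weights[best, ], assets), ret = port.ret[best], vol = port.vol[best])
}

low.risk  <- best.for.risk(0.06)
high.risk <- best.for.risk(0.12)
print(round(rbind(low.risk, high.risk), 3))
```

Comparing the two rows shows how the allocation shifts as the risk budget is raised from 6% to 12%.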

Portfolio managers often set constraints on the share of the portfolio allocated to specific sectors (say, 20% in finance equities and 10% in municipal bonds). Given those constraints, the classical mean-variance optimization process can still be used, but the set of solutions is constrained to those portfolios that meet the sector allocations. Nonetheless, the individual assets in those sectors are still considered independently in the optimization process.
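A sector constraint of this kind only fixes the total weight per sector; how that total is split among the assets inside the sector is left to the optimizer. The base R sketch below checks a candidate weight vector against sector targets. The asset names, sector labels, targets and both helper functions are hypothetical.

```r
# Hypothetical universe: asset names, sector labels, and sector targets.
assets  <- c("bankA", "bankB", "muniX", "muniY", "other")
sectors <- c("finance", "finance", "muni", "muni", "other")
sector.target <- c(finance = 0.20, muni = 0.10, other = 0.70)

# Total weight held in each sector for a given weight vector.
sector.totals <- function(weight) tapply(weight, sectors, sum)

# TRUE when every sector total matches its target (within a tolerance).
meets.sector.targets <- function(weight, tol = 1e-8) {
  totals <- sector.totals(weight)
  all(abs(totals[names(sector.target)] - sector.target) < tol)
}

feasible   <- c(0.15, 0.05, 0.04, 0.06, 0.70)  # 20% finance, 10% muni
infeasible <- c(0.30, 0.10, 0.05, 0.05, 0.50)  # 40% finance
print(meets.sector.targets(feasible))
print(meets.sector.targets(infeasible))
```

Any split of the 20% finance budget between bankA and bankB passes the check, which is why the assets within a sector are still free to be optimized individually.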

A recent paper suggests a better approach might be to minimize not overall risk, but instead the average correlation of the components within each sector. The Systematic Investor blog shows that it's easy to implement a criterion like this in the R language:

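# Portfolio volatility; drop() turns the 1x1 matrix into a plain number
portfolio.sigma = drop( sqrt( t(weight) %*% assets.cov %*% weight ) )
# Average correlation of each asset with the portfolio
mean( ( weight %*% assets.cov ) / ( assets.sigma * portfolio.sigma ) )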
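The snippet assumes that weight, assets.cov and assets.sigma already exist. A self-contained sketch, with made-up volatilities and correlations for three assets and a hypothetical wrapper function avg.correlation, might look like this:

```r
# Hypothetical inputs (illustrative only): volatilities and correlations of three assets.
assets.sigma <- c(0.05, 0.18, 0.15)
assets.cor   <- matrix(c( 1.0, -0.2, 0.0,
                         -0.2,  1.0, 0.3,
                          0.0,  0.3, 1.0), nrow = 3, byrow = TRUE)
assets.cov   <- diag(assets.sigma) %*% assets.cor %*% diag(assets.sigma)

# Average correlation between each asset and the portfolio defined by weight.
avg.correlation <- function(weight) {
  portfolio.sigma <- drop(sqrt(t(weight) %*% assets.cov %*% weight))
  mean(drop(weight %*% assets.cov) / (assets.sigma * portfolio.sigma))
}

# Sanity check: a portfolio holding only asset 1 is asset 1, so the result
# is the mean of the first row of the correlation matrix.
print(avg.correlation(c(1, 0, 0)))
print(mean(assets.cor[1, ]))

# Equal-weight portfolio for comparison.
print(avg.correlation(rep(1/3, 3)))
```

The sanity check works because the covariance of asset i with the portfolio, divided by the product of the two volatilities, is by definition their correlation.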

You can then use one of R's nonlinear solvers (they use Rdonlp2) to optimize this criterion and return the optimal portfolios for different levels of risk. (The R code to do this is available at GitHub.) Here are their results for standard mean-variance portfolios (at the top), and minimum average correlation portfolios at the bottom:
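For readers without Rdonlp2, the idea can be sketched with base R's optim() instead. This is a swapped-in, simplified stand-in and not the Systematic Investor implementation: it finds a single long-only portfolio with low average correlation and ignores the risk-level constraint. The inputs are made up, and avg.correlation and to.weight are hypothetical helpers.

```r
# Hypothetical inputs (illustrative only): volatilities and correlations of three assets.
assets.sigma <- c(0.05, 0.18, 0.15)
assets.cor   <- matrix(c( 1.0, -0.2, 0.0,
                         -0.2,  1.0, 0.3,
                          0.0,  0.3, 1.0), nrow = 3, byrow = TRUE)
assets.cov   <- diag(assets.sigma) %*% assets.cor %*% diag(assets.sigma)

# Average correlation between each asset and the portfolio defined by weight.
avg.correlation <- function(weight) {
  portfolio.sigma <- drop(sqrt(t(weight) %*% assets.cov %*% weight))
  mean(drop(weight %*% assets.cov) / (assets.sigma * portfolio.sigma))
}

# Map unconstrained parameters to long-only weights that sum to 1.
to.weight <- function(x) exp(x) / sum(exp(x))

# Minimize with optim() (Nelder-Mead by default), starting from equal weights.
fit        <- optim(par = rep(0, 3), fn = function(x) avg.correlation(to.weight(x)))
weight.opt <- to.weight(fit$par)

print(round(weight.opt, 3))
print(c(equal.weight = avg.correlation(rep(1/3, 3)), optimized = fit$value))
```

The exp-and-normalize reparameterization is a design shortcut: it keeps the weights positive and fully invested without needing a constrained solver, at the cost of not supporting the extra constraints (target risk, sector limits) that a solver such as the one used in the original post can handle.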

In each case, read the vertical line above a given level of risk to see how the optimal portfolio is allocated. At the lower risk levels, the average-correlation portfolio includes gold (GLD) and 20-year treasuries (TLT); at higher risk levels emerging markets securities (EEM) get mixed in as well.

For the full details of average-correlation portfolios and their implementation in R, see the blog post at Systematic Investor linked below.

Systematic Investor: The Most Diversified or The Least Correlated Efficient Frontier
