Portfolio Optimization in R, Part 3

December 21, 2011
By

(This article was first published on Adventures in Statistical Computing, and kindly contributed to R-bloggers)

In the last post, we discussed the issue of finding the market portfolio by fitting a curve to the found efficient frontier.  Because of kinks in the frontier, we could not guarantee that the fitted curve would be concave in the area of the market portfolio.  A different method was needed.

The method I am using, in this post, is to draw a line between the risk free asset and each point along the frontier.  From that line, calculate the difference between the line and the frontier.  The capital market line should have all frontiers values less than or equal to it.
CMLi <= EFi
The granularity with which we find the frontier means that we might not find the exact market portfolio.  To get around this, I relax the above constraint to find the
Portfolioj that max Count(CMLi < EFi)
I put together the following R function to do this.  Note, I have switched to using Standard Deviation as the risk measure.  This is the more conventional choice.

marketPortfolio = function(merged,rf,returnNames, weightNames,graph=FALSE){
     
      #create an empty data frame for the portfolio weights
      weights = data.frame(t(rep(NA,length(weightNames))))
      colnames(weights) = weightNames
      weights = weights[-1,]

      #Calculate Annualized Returns
      t = table.AnnualizedReturns(merged[,returnNames])
     
      #Range to optimize over
      maxRet = max(t['Annualized Return',]) - .005
      minRet = min(t['Annualized Return',]) + .005
     
      #portfolio.optim cannot have NA values in the time series, filter
      #them out
      m2 = removeNA(merged[,returnNames])
     
      er = NULL
      eStd = NULL
     
      #loop through finding the optimum portfolio for return 
      #levels between the range found above
      #
      #portfolio.optim uses daily returns, so we have to 
      #adjust accordingly
      for (i inseq(minRet,maxRet,length.out=500)){
            pm = 1+i
            pm = log(pm)/255
            opt = portfolio.optim(m2,pm=pm)
            er = c(er,exp(pm*255)-1)
            eStd = c(eStd,opt$ps*sqrt(255))
            w = t(opt$pw)
            colnames(w) = weightNames
            weights = rbind(weights,w)
      }
     
      solution = weights
      solution$er = er
      solution$eStd = eStd
     
      #find the index values for the minimum Std and the max Er
      minIdx = which(solution$eStd == min(solution$eStd))
      maxIdx = which(solution$er == max(solution$er))
     
      #subset the results
      subset = solution[minIdx:maxIdx,c("er","eStd")]
      subset$nAbove = NA
     
      #for each value in the subset, count the number of points
      #that lay below a line drawn through the point and the
      #RF asset
      for (i in seq(1,maxIdx-minIdx+1)){
            toFit = data.frame(er=rf,eStd=0)
            toFit = rbind(toFit,subset[i,c("er","eStd")])
            fit = lm(toFit$er ~ toFit$eStd)
            poly = polynomial(coef = fit$coefficients)
            toPred = subset
            colnames(toPred) = c("actEr","eStd")
            toPred$er = predict(poly,toPred[,"eStd"])
            toPred$diff = toPred$er - toPred$actEr
            subset[i,"nAbove"] = nrow(toPred[which(toPred$diff > 0),])
      }
     
      #get the point of tangency -- where the number of points
      #below the line is maximized
      max = max(subset$nAbove)
      er = subset[which(subset$nAbove == max),"er"]
      eStd = subset[which(subset$nAbove == max),"eStd"]
     
      #index of the market portfolio
      idx = which(solution$er == er & solution$eStd == eStd)
     
      #Draw the line if requested
      if (graph){
            maxStd = max(solution$eStd) + .02
            maxRetg = max(solution$er) + .02
            plot(solution$eStd,
                        solution$er,
                        xlim=c(0,maxStd),
                        ylim=c(0,maxRetg),
                        ylab="Expected Yearly Return",
                        xlab="Expected Yearly Std Dev",
                        main="Efficient Frontier",
                        col="red",
                        type="l",
                        lwd=2)
            abline(v=c(0), col="black", lty="dotted")
            abline(h=c(0), col ="black", lty="dotted")
           
            toFit = data.frame(er=rf,eStd=0)
            toFit = rbind(toFit,solution[idx,c("er","eStd")])
            fit = lm(toFit$er ~ toFit$eStd)
            abline(coef=fit$coefficients,col="blue",lwd=2)
      }
     
      #Return the market portfolio weights and eStd and eR
      out = solution[idx,]
      return (out)
}


Let's test this out using a portfolio of Exxon Mobile (XOM), IBM (IBM), and the mid range Government Bond ETF (IEF).  This assumes you have the importSeries() function defined.


require(polynom)
require(fImport)
require(PerformanceAnalytics)
require(tseries)
require(stats)

from = "2003-01-01"
to = "2011-12-16"
xom = importSeries("xom",from,to)
ibm = importSeries("ibm",from,to)
ief = importSeries("ief",from,to)

merged = merge(xom,ibm)
merged = merge(merged,ief)

vars = c("xom.Return","ibm.Return","ief.Return")
vars2 = c("xom","ibm","ief")

mp = marketPortfolio(merged,.02,vars,vars2,graph=TRUE)
mp

The output in the log is:

        xom    ibm    ief      er    eStd
448 0.09395 0.1378 0.7682 0.07762 0.05996

The created graph is

This portfolio gives us an example of the optimization finding the lower edge of the frontier.  This is not an anomaly. This is why we subset the results to only include the parts from the apex of the frontier (min(StdDev)) and the maximum return.  Because we find the frontier from the min return to the max return, we guarantee order is preserved and only the upper part of the frontier is considered.

A more exact method would be to find the region of the frontier that contains the market portfolio and to either
  1. Grid search to find the market portfolio
  2. Fit a curve to that region and use the method discussed in the previous post.

If there is demand, I can create either of the above methods.  For demonstration purposes, what we have here should suffice.

To leave a comment for the author, please follow the link and comment on his blog: Adventures in Statistical Computing.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.