June 19, 2012
By

(This article was first published on Systematic Investor » R, and kindly contributed to R-bloggers)

I came across a very descriptive visualization of the Factor Attribution that I will replicate today. There is the Three Factor Rolling Regression Viewer at the mas financial tools web site that performs rolling window Factor Analysis of the “three-factor model” of Fama and French. The factor returns are available from the Kenneth R French: Data Library. I recommend reading the Efficient Frontier: Rolling Your Own: Three-Factor Analysis by W. Bernstein for a step by step instructions.

Let’s start by loading historical returns for the Vanguard Small Cap Value Index (VISVX) and aligning them with Fama/French Monthly Factors. Please note I wrote a helper function, get.fama.french.data(), to simplify process of loading and analyzing factor data from the Kenneth R French: Data Library.

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
source(con)
close(con)

#*****************************************************************
#******************************************************************
tickers = 'VISVX'

data <- new.env()
getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T)
for(i in ls(data)) {

period.ends = endpoints(temp, 'months')
period.ends = period.ends[period.ends > 0]

# reformat date to match Fama French Data
monthly.dates = as.Date(paste(format(index(temp)[period.ends], '%Y%m'),'01',sep=''), '%Y%m%d')
data[[i]] = make.xts(coredata(temp[period.ends,]), monthly.dates)
}

# Fama/French factors

data\$factors = factors\$data / 100
bt.prep(data, align='remove.na', dates='1994::')

Next, let’s run Factor Attribution over all available data:

#*****************************************************************
#******************************************************************
prices = data\$prices
nperiods = nrow(prices)
dates = index(data\$prices)

# compute simple returns
hist.returns = ROC(prices[,tickers], type = 'discrete')
hist.returns = hist.returns - data\$factors\$RF
colnames(hist.returns) = 'fund'
hist.returns = cbind(hist.returns, data\$factors\$Mkt.RF,
data\$factors\$SMB, data\$factors\$HML)

fit.all = summary(lm(fund~Mkt.RF+SMB+HML, data=hist.returns))
estimate.all = c(fit.all\$coefficients[,'Estimate'], fit.all\$r.squared)
std.error.all = c(fit.all\$coefficients[,'Std. Error'], NA)

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0006828  0.0012695  -0.538    0.591
Mkt.RF       0.9973980  0.0262881  37.941   <2e-16 ***
SMB          0.5478299  0.0364984  15.010   <2e-16 ***
HML          0.6316528  0.0367979  17.165   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Multiple R-squared: 0.9276,     Adjusted R-squared: 0.9262

Next, let’s run a 36 month rolling window Factor Attribution:

#*****************************************************************
#******************************************************************
window.len = 36

colnames = spl('alpha,MKT,SMB,HML,R2')
estimate = make.xts(matrix(NA, nr = nperiods, len(colnames)), dates)
colnames(estimate) = colnames
std.error = estimate

# main loop
for( i in window.len:nperiods ) {
window.index = (i - window.len + 1) : i

fit = summary(lm(fund~Mkt.RF+SMB+HML, data=hist.returns[window.index,]))
estimate[i,] = c(fit\$coefficients[,'Estimate'], fit\$r.squared)
std.error[i,] = c(fit\$coefficients[,'Std. Error'], NA)

if( i %% 10 == 0) cat(i, '\n')
}

Finally, let’s re-create the timeseries and histogram charts as presented by Three Factor Rolling Regression Viewer.

#*****************************************************************
# Reports
#******************************************************************
layout(matrix(1:10,nc=2,byrow=T))

for(i in 1:5) {
#-------------------------------------------------------------------------
# Time plot
#-------------------------------------------------------------------------
est = estimate[,i]
est.std.error = ifna(std.error[,i], 0)

plota(est,
ylim = range( c(
range(est + est.std.error, na.rm=T),
range(est - est.std.error, na.rm=T)
)))

polygon(c(dates,rev(dates)),
c(coredata(est + est.std.error),
rev(coredata(est - est.std.error))),

est = estimate.all[i]
est.std.error = std.error.all[i]

polygon(c(range(dates),rev(range(dates))),
c(rep(est + est.std.error,2),
rep(est - est.std.error,2)),

abline(h=0, col='blue', lty='dashed')

abline(h=est, col='blue')

plota.lines(estimate[,i], type='l', col='red')

#-------------------------------------------------------------------------
# Histogram
#-------------------------------------------------------------------------
par(mar = c(4,3,2,1))
hist(estimate[,i], col='red', border='gray', las=1,
xlab='', ylab='', main=colnames(estimate)[i])
abline(v=estimate.all[i], col='blue', lwd=2)
}

In the next post, I plan to replicate the Style charts and provide more examples.

To view the complete source code for this example, please have a look at the three.factor.rolling.regression() function in bt.test.r at github.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...