Examples of Current Major Market Clusters

January 11, 2013
By

(This article was first published on Systematic Investor » R, and kindly contributed to R-bloggers)

I want to follow up and provide a bit more details to the excellent “A Visual of Current Major Market Clusters” post by David Varadi.

Let’s first load historical for the 10 major asset classes:

  • Gold ( GLD )
  • US Dollar ( UUP )
  • S&P500 ( SPY )
  • Nasdaq100 ( QQQ )
  • Small Cap ( IWM )
  • Emerging Markets ( EEM )
  • International Equity ( EFA )
  • Real Estate ( IYR )
  • Oil ( USO )
  • Treasurys ( TLT )

###############################################################################
# Load Systematic Investor Toolbox (SIT)
# http://systematicinvestor.wordpress.com/systematic-investor-toolbox/
###############################################################################
setInternet2(TRUE)
con = gzcon(url('http://www.systematicportfolio.com/sit.gz', 'rb'))
    source(con)
close(con)

	#*****************************************************************
	# Load historical data for ETFs
	#****************************************************************** 
	load.packages('quantmod')

	tickers = spl('GLD,UUP,SPY,QQQ,IWM,EEM,EFA,IYR,USO,TLT')

	data <- new.env()
	getSymbols(tickers, src = 'yahoo', from = '1900-01-01', env = data, auto.assign = T)
		for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T)
		
	bt.prep(data, align='remove.na')

Next let’s use the historical returns over the past year to compute correlations between all asset classes and group assets into 4 clusters:

	#*****************************************************************
	# Create Clusters
	#****************************************************************** 
	# compute returns
	ret = data$prices / mlag(data$prices) - 1
		ret = na.omit(ret)		

	# setup period and method to compute correlations
	dates = '2012::2012'
	method = 'pearson'	# kendall, spearman
	
	correlation = cor(ret[dates], method = method)    
        dissimilarity = 1 - (correlation)
        distance = as.dist(dissimilarity)
        	
	# find 4 clusters      
	xy = cmdscale(distance)
	fit = kmeans(xy, 4, iter.max=100, nstart=100)
	
	#*****************************************************************
	# Create Plot
	#****************************************************************** 	
	load.packages('cluster')
	clusplot(xy, fit$cluster, color=TRUE, shade=TRUE, labels=3, lines=0, plotchar=F, 
		main = paste('Major Market Clusters over', dates), sub='')	

plot1

There are 4 clusters: TLT, GLD, UUP, and Equities / Oil / Real Estate. You can see assigned clusters by executing

	fit$cluster

This works quite well, but we have a number of things to explore:

  • how to select number of clusters
  • what correlation measure to use i.e. pearson, kendall, spearman
  • what look back to use i.e. 1 month / 6 months / 1 year
  • what frequency of data to use i.e daily / weekly / monthly

In the next post I will provide some ideas how to select number of clusters.

To view the complete source code for this example, please have a look at the bt.cluster.visual.test() function in bt.test.r at github.


To leave a comment for the author, please follow the link and comment on his blog: Systematic Investor » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.