Hello and welcome to Joe’s Data Diner’s first ever post!
Today, I will touch on both R and Finance, but I’ll try to make it accessible for those with an interest in either and not just Quants like myself!
Almost everyone is now aware that asset correlation increases in times of stress. This topic has made its way into the most esteemed journals and even the popular press. Even so, it remains a defining feature of the ’07-’08 financial crisis and, for those working in the Credit markets, a topic of paramount importance.
My idea of graphing the time evolution of the correlation structure between assets is not revolutionary. However, my aim here is to show how easy it is to reproduce such graphics using only the free R software.
In Part I, I will produce a static graph showing quarterly measurements of the correlation structure and, in Part II, I will create a video showing the daily evolution of the correlation.
As some of you may be more interested in the results than the R code, let’s look at them straight away!
The assets shown are as follows:
SPY: S&P 500
QQQ: Nasdaq 100
EEM: Emerging Markets
IWM: Russell 2000
EFA: EAFE (Europe, Australasia and Far East)
TLT: 20+ Year Treasury
IYR: U.S. Real Estate
Even from the small thumbnail, the increase in correlation during the crisis and the decrease by 2013 are immediately evident. I’m certain a detailed study could yield many more observations, so please feel free to add comments if you observe anything!
However, I will now jump into the code! I’m new to blogging in general and R-blogging in particular, so I’ve decided that well-commented code is probably the clearest way to explain the majority of the process, returning to prose to discuss some of the more interesting coding decisions.
First, let’s get the data. Thanks to Systematic Investor for teaching me (via his blog) how to use quantmod:
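The original listing for this step is not reproduced here, so what follows is only a minimal sketch of how the download might look with quantmod. The helper names `get_adjusted` and `daily_returns`, the Yahoo Finance source and the 2007 start date are all assumptions made for illustration, not necessarily what was used for the graph.

```r
library(quantmod)  # provides getSymbols() and Ad(); also attaches xts

# Tickers from the list above
symbols <- c("SPY", "QQQ", "EEM", "IWM", "EFA", "TLT", "IYR")

# Hypothetical helper: download each ticker and return a single xts object
# of adjusted closing prices, one column per ticker.
# The start date is an assumption chosen to cover the crisis period.
get_adjusted <- function(symbols, from = "2007-01-01") {
  prices <- lapply(symbols, function(sym) {
    # auto.assign = FALSE returns the data instead of creating a variable
    Ad(getSymbols(sym, src = "yahoo", from = from, auto.assign = FALSE))
  })
  prices <- do.call(merge, prices)  # align all series on their dates
  colnames(prices) <- symbols
  prices
}

# Hypothetical helper: daily log returns from a price series.
# The first row has no previous price, so it is dropped.
daily_returns <- function(prices) {
  na.omit(diff(log(prices)))
}
```

With an internet connection, `prices <- get_adjusted(symbols)` followed by `returns <- daily_returns(prices)` should give one column of daily returns per asset.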
Next, let’s compute some correlations! Later we’ll use a rolling window to produce daily updates for the video, but for now let’s look at the correlation matrix for each quarter:
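As a rough sketch of the quarterly calculation, the returns can be split by calendar quarter and passed to `cor()` one quarter at a time. The helper `quarterly_correlations` is hypothetical, and the returns below are simulated random numbers used only so the example runs on its own; they are not market data.

```r
# Hypothetical helper: split daily returns by calendar quarter and return
# the pairwise correlations in long format (one row per quarter and pair).
# `returns` needs column names; `dates` is a Date vector, one per row.
quarterly_correlations <- function(returns, dates) {
  returns <- as.matrix(returns)
  quarter_id <- paste0(format(dates, "%Y"), "-", quarters(dates))
  rows_by_quarter <- split(seq_len(nrow(returns)), quarter_id)
  pieces <- lapply(names(rows_by_quarter), function(q) {
    cm <- cor(returns[rows_by_quarter[[q]], , drop = FALSE])
    data.frame(
      quarter = q,
      asset1 = rep(rownames(cm), times = ncol(cm)),  # row index varies fastest
      asset2 = rep(colnames(cm), each = nrow(cm)),
      correlation = as.vector(cm),                   # column-major order
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, pieces)
}

# Simulated returns (NOT real data), just to exercise the function
set.seed(1)
dates <- seq(as.Date("2008-01-01"), as.Date("2008-06-30"), by = "day")
returns <- matrix(rnorm(length(dates) * 3), ncol = 3,
                  dimnames = list(NULL, c("SPY", "TLT", "IYR")))

quarterly <- quarterly_correlations(returns, dates)
head(quarterly)
```

If the returns are held in an xts object `r`, something like `quarterly_correlations(coredata(r), index(r))` should work, assuming the index is a Date.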
And now we’re ready to use another of Hadley’s packages, the famous ggplot2, to create the first iteration of the graph:
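One plausible way to draw such a graph is a tile plot with one panel per quarter. This is a sketch only: the toy correlations are made up so the example is self-contained, and the blue-white-red scale is an assumption rather than the palette actually used for the graph.

```r
library(ggplot2)

# Toy input in long format (one row per quarter and asset pair).
# The correlation values are invented purely for illustration.
assets <- c("SPY", "TLT", "IYR")
toy <- expand.grid(asset1 = assets, asset2 = assets,
                   quarter = c("2008-Q1", "2008-Q2"),
                   stringsAsFactors = FALSE)
toy$correlation <- ifelse(toy$asset1 == toy$asset2, 1,
                   ifelse(toy$asset1 == "TLT" | toy$asset2 == "TLT", -0.4, 0.7))

p <- ggplot(toy, aes(x = asset1, y = asset2, fill = correlation)) +
  geom_tile(colour = "white") +               # one tile per asset pair
  facet_wrap(~ quarter) +                     # one small panel per quarter
  scale_fill_gradient2(low = "blue", mid = "white", high = "red",
                       midpoint = 0, limits = c(-1, 1)) +
  labs(x = NULL, y = NULL, fill = "Correlation") +
  theme_minimal()

print(p)
```

Fixing the fill limits at -1 and +1 keeps the colours comparable across quarters, which matters when the point of the graph is to compare one panel with the next.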
I make two small changes to improve the visual impact. First, the series are re-ordered to put the ones which tend to be negatively correlated to one side, making for a less cluttered-looking graph. Secondly, I feel that the important range of correlations from +0.5 to +1.0 is somewhat difficult to analyse as the colours are very similar. Inspired by a topographical map of Lanzarote (where I’ve just been cycling), I thought I’d change the very top of the correlation range to purple, increasing the definition in this correlation range:
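Both changes can be sketched in ggplot2 as follows: the axis order is controlled through factor levels, and `scale_fill_gradientn()` accepts explicit colour stops so that purple can be reserved for the very top of the range. The ordering (TLT at one end), the break points and the colours here are illustrative guesses, and the correlations are again made up.

```r
library(ggplot2)

# Assumed ordering: the Treasury ETF, which tends to be negatively
# correlated with the equity series, is placed at one end.
asset_order <- c("TLT", "IYR", "EEM", "EFA", "IWM", "QQQ", "SPY")

# Made-up correlations so the example runs on its own
toy <- expand.grid(asset1 = asset_order, asset2 = asset_order,
                   stringsAsFactors = FALSE)
toy$correlation <- ifelse(toy$asset1 == toy$asset2, 1,
                   ifelse(toy$asset1 == "TLT" | toy$asset2 == "TLT", -0.4, 0.8))

# Change 1: fix the axis order through factor levels
toy$asset1 <- factor(toy$asset1, levels = asset_order)
toy$asset2 <- factor(toy$asset2, levels = asset_order)

# Change 2: colour stops on the correlation scale, with purple reserved
# for the top of the range. Break points and colours are assumptions.
stops   <- c(-1, 0, 0.5, 0.9, 1)
colours <- c("blue", "white", "orange", "red", "purple")
stop_positions <- (stops + 1) / 2  # map [-1, 1] onto [0, 1] as `values` expects

p <- ggplot(toy, aes(x = asset1, y = asset2, fill = correlation)) +
  geom_tile(colour = "white") +
  scale_fill_gradientn(colours = colours, values = stop_positions,
                       limits = c(-1, 1)) +
  labs(x = NULL, y = NULL, fill = "Correlation")

print(p)
```

Moving the 0.9 stop up or down changes how much of the high-correlation range is pulled towards purple, which is one knob to turn when balancing definition against the look of the palette.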
And, finally, we have the graph shown at the top! However, I don’t like how we’ve sacrificed the pleasant visuals of the mainly red palette to obtain this greater definition. If anyone has any suggestions on how to improve this, please let me know!
I hope you enjoyed the opening night at Joe’s Data Diner; tips (non-monetary) are always welcome and I hope you’ll return to see the film in “The Financial Crisis on Tape Part II” in the next couple of weeks!