Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Today, we will continue getting familiar with data from Quandl, but will also devote more time to expanding our dygraphs toolkit. We will be building up a data visualization in discrete pieces, which isn’t very efficient, but will make things easier when we move this project into production as a Shiny app. From a substantive perspective, we will examine the relationship between the price ratio of copper-gold and 10-year Treasure yields.

Why do we care about the copper-gold price ratio and Treasury yields? First, both Jeff Gundlach (in this webcast) and Adam Robinson say so, and that’s probably good enough. The theory goes like this:

Copper is a useful industrial metal whose price tends to rise when the global economy expands. As firms produce more goods that require copper as an input, the increased demand for copper drives the price higher. Gold, on the other hand, is a somewhat less useful metal whose prices tends to rise when investors are fearful about a contracting global economy. Gold is a safe-haven investment, and a rising gold price signals either a contracting economy, investor fears of a contracting economy, or both. Gold prices tend to fall when the economy is humming along nicely. Thus, the copper-gold price ratio tends to be increasing when the economy is expanding.

The yield on 10-year Treasury Notes also tends to rise during economic expansion because investors’ inflation expectations are on the rise. When investors expect inflation to increase, they anticipate an uptick in interest rates (for those of you who are too young to remember what an interest rate is, take a look at rates in the mid-1980s) and start to seek higher yields today. That can drive down Treasury prices and increase yields.

Thus, we should observe a positive relationship between the copper-gold price ratio and 10-year yields. Both should be rising and falling based on the state of the world economy. There’s nothing too crazy here, but it’s an interesting relationship to think about and investigate. That’s what we’ll do today!

First, let’s import the relevant time series data from Quandl. We will specify type = "xts" in order to create xts objects and collapse = "daily" because we want daily prices. Note in particular our data sources: CME for copper and gold, and FRED for the 10-year yield. But, we just need the Quandl codes and to be careful about consistent start/end dates for each data set. It’s a liberating feeling to know before starting a project that we’ll be able to find whatever data we need in one source.

Let’s get to it.

library(Quandl)
library(dplyr)
library(xts)
library(lubridate)
library(dygraphs)

copper <- Quandl("CHRIS/CME_HG1", type = "xts", collapse = "daily",
start_date = "2012-01-01", end_date = "2017-02-28")

gold <- Quandl("CHRIS/CME_GC1", type = "xts", collapse = "daily",
start_date = "2012-01-01", end_date = "2017-02-28")

ten_year <- Quandl("FRED/DGS10", type = "xts", collapse = "daily",
start_date = "2012-01-01", end_date = "2017-02-28")

Now we want to combine these into one xts object, which normally would be an easy invocation of merge.xts, but there’s a slight wrinkle. After creating one xts object, we know we need to calculate a ratio of copper/gold. That means that the presence of NAs will be a problem (I found this out the old-fashioned way, by running code that threw an error). Let’s take care of that by prepending the na.locf() function to our merge operation. That function will replace all NAs with the previous day’s value. Why might one of our time series have an NA when another doesn’t? Maybe one of the instruments has a trading holiday, or FRED publishes data for days that are general market holidays.

The downside, of course, is that we are adding observed values that are fictional. I can live with that here since we’re creating a visualization to help understand and gain an intuition about the hypothesized copper-gold yield relationship. If we were using this data to create a trading strategy or algorithm, our NA replacement would be unacceptably fuzzy and a more rigorous decision-making process would be needed to synchronize the data sets.

No matter which approach is taken, the most crucial thing is to explain and make easily reproducible whatever process is used for handling NAs or any data cleaning such as this. Your colleagues, future self, clients and research audience can scrutinize, discount, ignore or applaud it accordingly.

# We're going to merge our 3 xts objects into on xts objects. This would normally be
# very simple with merge.xts but we want to eliminat NAs with na.locf().

copper_gold_tenYear_merged <- na.locf(merge.xts(copper$Settle, gold$Settle, ten_year),
formLast = TRUE)

colnames(copper_gold_tenYear_merged ) <- c("Copper", "Gold", "TenYear")

Now we create a new column to store the ratio of copper gold prices.

# Create the ratio of prices. I multiply copper by 100 to synch with the scale used by
# Gundlach in his presentation.
copper_gold_tenYear_merged$ratio <- (copper_gold_tenYear_merged$Copper*100)/copper_gold_tenYear_merged$Gold We have an xts object that holds four time series, but I want to chart only the copper-gold price ratio and 10-year yields. This next step is not necessary but to be extra clear, I am going to create a new xts object to hold only those two time series. tenYear_ratio <- merge(copper_gold_tenYear_merged$ratio, copper_gold_tenYear_merged\$TenYear)

We use dygraphs to chart that one xts object, and call dySeries() for each of the columns to be included.

dygraph(tenYear_ratio) %>%
dySeries("ratio") %>%
dySeries("TenYear")

That chart is completely unhelpful, of course, because our time series have different scales. I also don’t love the choppiness of the blue 10-year chart. Let’s address these two issues by adding a right-hand side scale and a dyRoller().

The dyRoller() will help smooth out our chart because each plotted point will be an average of the number of periods specified with rollPeriod = X. This won’t affect our xts object, where we store the data, it just makes the chart more readable.

Adding the right-hand-side y-axis requires a few more lines of code. First we need to invoke dyAxis() for the left-hand axis, called “y”. Then we invoke dyAxis() for the right-hand axis, called “y2”. We also need to set independentTicks = TRUE so that we can use a unique, independent value scale for the right-hand side. Next, in our dySeries() call for each time series, we assign each one to an axis. Here we assign “ratio” with axis = 'y', so that the copper-gold price ratio will be on the left-hand scale, and we assign “TenYear” with axis = 'y2', so the 10-year yield will be on the right-hand scale. I also like to include a label with LHS and RHS for each time series.

dygraph(tenYear_ratio) %>%
# Add the rollPeriod for smoothing.
dyRoller(rollPeriod = 3) %>%
# Create two independent axes.
dyAxis("y", label = "Percent (%)") %>%
dyAxis("y2", label = "USD", independentTicks = TRUE) %>%
# Assign each time series to an axis.
dySeries("ratio", axis = 'y', label = "Copper/Gold (LHS)") %>%
dySeries("TenYear", axis = 'y2', label = "10-Year % Yield (RHS)")