Mini-tutorial for Quandl: How to access financial data with R
by Joseph Rickert
Quandl.com, the open source website for financial data, made rapid progress earlier this year in becoming an R-friendly source for financial time series data. Tammer Kamel, Quandl’s founder, introduced the site on the Revolutions blog in late February as a “search engine” for numerical data and explained how Quandl’s “Q-bot” can take data from almost any publisher and shape it into a standard form. Then in early March we noted that the Quandl package is available on CRAN. Today, I would like to provide a mini-tutorial on accessing the data. Getting time series data couldn't be easier.
The main page lists a column of 10 “Markets”. Clicking on one, say “Currencies”, will take you to a page with the title “Exchange Rates versus USD”. The first table on the page lists major currencies. All of the numbers in the table are live. Clicking on one will take you to the page devoted to a particular time series. For example, clicking on the entry in the column labeled “Inverted” in the row for the Japanese Yen (JPY) will take you to a page devoted to the series of exchange rates for the Japanese Yen in US Dollars. Clicking on the red “Download” button at the top of the page will take you to the download wizard, and selecting R will yield a dropdown box with a line of R code to fetch the series.
Just copy this line into your script and you can read the data from Quandl.
For more serious use, sign up on the main page (it's free) and you will get an authentication token that can be used to access the Quandl API under program control. The agreement says this is good for 100 accesses a day, but if you need more, you can email Quandl and ask. Now you can access any Quandl time series by using the Quandl code for the series in your program. This code appears on the page for the series. For example, on the page for the Japanese Yen rate we looked at above, the code is given in a table towards the right side of the page.
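As a minimal sketch of that workflow, the snippet below builds a Quandl code from a currency symbol and fetches the series only when a token has been supplied. `Quandl.auth()` and `Quandl()` are the package calls used in the script later in this post; the helper names `make_code` and `fetch_series` and the `QUANDL_TOKEN` environment variable are hypothetical and only there for illustration.

```r
# Hypothetical helper: build a Quandl code such as "QUANDL/JPYUSD"
# from a currency symbol, following the pattern used in this post.
make_code <- function(curr, base = "USD") {
  paste0("QUANDL/", curr, base)
}

# Hypothetical helper: authenticate and fetch one series.
# Assumes the Quandl package is installed and the token is valid.
fetch_series <- function(code, token, ...) {
  if (!requireNamespace("Quandl", quietly = TRUE)) {
    stop("The Quandl package is not installed")
  }
  Quandl::Quandl.auth(token)   # register the token for this session
  Quandl::Quandl(code, ...)    # download the series as a data frame
}

code <- make_code("JPY")
print(code)

# Assumption: the token is kept in an environment variable rather than
# hard-coded in the script. Nothing is downloaded if it is not set.
token <- Sys.getenv("QUANDL_TOKEN")
if (nzchar(token)) {
  jpy <- fetch_series(code, token, start_date = "2013-01-01")
  print(head(jpy))
}
```

Keeping the token out of the script itself is a design choice, not a Quandl requirement; it simply makes the script easier to share.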
All of this and more is explained on the API page. The following is a simple R script to retrieve the currency rates in US dollars for the 22 major currencies listed on the Exchange rates page. It demonstrates the basics of calling the Quandl API and selecting a subset of the information there, and then plots a few of the rates as a time series.
    # Code for Quandl Blog Post
    # June 2013
    # Joseph B Rickert
    # This code uses the Quandl API to pull down
    # exchange rate information from the Quandl currencies page
    # The first step in running this script is to sign up at www.Quandl.com
    # to get an authentication token
    token <- 'Token_From_Quandl'    # Replace with the token you get from Quandl
    #-----------------------------------------------------
    library(Quandl)     # Quandl package
    library(ggplot2)    # Package for plotting
    library(reshape2)   # Package for reshaping data
    #
    Quandl.auth(token)  # Authenticate your token
    # Build vector of currencies
    currencies <- c("ARS","AUD","BRL","CAD","CHF",
                    "CNY","DKK","EUR","GBP","IDR",
                    "ILS","INR","JPY","MXN","MYR",
                    "NOK","NZD","PHP","RUB","SEK",
                    "THB","TRY")
    # Function to fetch major currency rates
    rdQcurr <- function(curr){
      # Construct the Quandl code for each currency
      codes <- paste("QUANDL/",curr,"USD",sep="")
      for(i in seq_along(curr)){
        if (i == 1){
          # Select the dates from the first currency
          d <- Quandl(codes[1],start_date="2000-01-01",end_date="2013-06-07")[,1]
          A <- array(0,dim=c(length(d),length(codes)))
          # Get the rates from the first currency
          A[,1] <- Quandl(codes[1],start_date="2000-01-01",end_date="2013-06-07")[,2]
        }
        else{
          # Just get the rates for the remaining currencies
          A[,i] <- Quandl(codes[i],start_date="2000-01-01",end_date="2013-06-07")[,2]
        }
      }
      df <- data.frame(d,A)
      names(df) <- c("DATE",curr)
      return(df)
    }
    #
    rates <- rdQcurr(currencies)        # Fetch the currency rates
    rates$DATE <- as.Date(rates$DATE)   # Make DATE into type Date
    #
    rates4 <- rates[,c(1,3:6)]          # Pick out some rates to plot
    meltdf <- melt(rates4,id="DATE")    # Shape data for plotting
    #
    ggplot(meltdf,aes(x=DATE,y=value,colour=variable,group=variable)) +
      geom_line() +
      scale_colour_manual(values=1:22) +
      ggtitle("Major Currency Exchange Rates in USD")
Note that I first tried the script late on a weekend night without specifying the end date in the Quandl call, and it ran just fine. But then it failed on a weekday morning because the time series lengths were different. I suspect that the various time series are updated at different times during the day as information becomes available. So, writing some serious production code will likely require some experimentation to become familiar with the nuances of the data update process.
The Quandl R page has nicely laid out documentation on the Quandl package. Note that the Quandl API will also allow you to download data as a time series object. The following code fetches the rates for the Japanese Yen as an xts object. Everything behaves as you would expect.
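One way to sidestep the unequal-length problem is to join the series on their dates instead of copying columns into a fixed-size array. The following is only a sketch using base R's `merge()` on synthetic data; the helper name `merge_by_date` is hypothetical, and it assumes each download is a data frame with the date in the first column and the rate in the second, as in the script above.

```r
# Hypothetical helper: combine a named list of data frames, each with a
# date column first and a rate column second, into one wide data frame.
# Dates missing from a series become NA instead of causing an error.
merge_by_date <- function(series) {
  renamed <- Map(function(df, nm) {
    out <- df[, 1:2]
    names(out) <- c("DATE", nm)
    out$DATE <- as.Date(out$DATE)
    out
  }, series, names(series))
  # Full outer join of all series on the DATE column
  merged <- Reduce(function(a, b) merge(a, b, by = "DATE", all = TRUE), renamed)
  merged[order(merged$DATE), ]
}

# Synthetic data standing in for two downloads of different lengths
eur <- data.frame(Date = as.Date("2013-06-03") + 0:4,
                  Rate = c(1.30, 1.31, 1.31, 1.32, 1.32))
jpy <- data.frame(Date = as.Date("2013-06-03") + 0:3,
                  Rate = c(0.0100, 0.0100, 0.0101, 0.0103))

rates <- merge_by_date(list(EUR = eur, JPY = jpy))
print(rates)
```

With this approach a series that has not yet been updated for the latest day shows up as a missing value in the last row rather than breaking the whole fetch.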
    JPY = Quandl("QUANDL/USDJPY",start_date="2000-01-01",end_date="2013-06-07", type="xts")
    class(JPY)
    # "xts" "zoo"
    head(JPY)
    #               Rate High (est) Low (est)
    # 2000-01-03 102.161     103.37    100.96
    # 2000-01-04 101.835       0.00      0.00
    # 2000-01-05 102.635       0.00      0.00
    # 2000-01-06 103.925     105.16    102.70
    # 2000-01-07 104.891     106.11    103.68
    # 2000-01-10 105.151     106.40    103.92
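To give a feel for what working with such an object looks like, here is a small sketch that rebuilds the six rows printed above by hand and applies two common xts operations: date-range subsetting and differencing. Treating the 0.00 estimates as missing values is my assumption, not something Quandl documents here, and the column names are shortened to `High` and `Low` for convenience.

```r
# Values copied from the head(JPY) output above.
dates <- as.Date(c("2000-01-03", "2000-01-04", "2000-01-05",
                   "2000-01-06", "2000-01-07", "2000-01-10"))
vals <- cbind(Rate = c(102.161, 101.835, 102.635, 103.925, 104.891, 105.151),
              High = c(103.37, 0, 0, 105.16, 106.11, 106.40),
              Low  = c(100.96, 0, 0, 102.70, 103.68, 103.92))

# Assumption: a 0.00 estimate means "not available", so mark it as NA.
vals[vals == 0] <- NA

# The xts part runs only if the xts package is installed.
if (requireNamespace("xts", quietly = TRUE)) {
  jpy <- xts::xts(vals, order.by = dates)
  # ISO-8601 style range subsetting: rows from 4 to 7 January 2000
  print(jpy["2000-01-04/2000-01-07"])
  # Day-over-day change in the rate
  print(diff(jpy[, "Rate"]))
}
```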
I expect that Quandl data sets are already being used for some serious work. Certainly, anybody contemplating teaching an R-based time series course would find a rich set of examples here.