Downloading Option Chain Data from Google Finance in R: An Update

January 13, 2015
By

(This article was first published on Exegetic Analytics » R, and kindly contributed to R-bloggers)

I recently read an article which showed how to download Option Chain data from Google Finance using R. Interestingly, that article appears to be a close adaption of another article which does the same thing using Python.

While playing around with the code from these articles I noticed a couple of things that might benefit from minor tweaks. Before I look at those though, it’s worthwhile pointing out that there already is a function in quantmod for retrieving Option Chain data from Yahoo! Finance. What I am doing here is thus more for my own personal edification (but hopefully you will find it interesting too!).

Background

An Option Chain is just a list of all available options for a particular security spanning a range of expiration dates.

The Code

First we need to load a few packages which facilitate the downloading, parsing and manipulation of the data.

> library(RCurl)
> library(jsonlite)
> library(plyr)

We’ll be retrieving the data in JSON format. Somewhat disturbingly the JSON data from Google Finance does not appear to be fully compliant with the JSON standards because the keys are not quoted. We’ll use a helper function which will run through the data and insert quotes around each of the keys. The original code for this function looped through a list of key names. This is a little inefficient and would also be problematic if additional keys were introduced. We’ll get around that by using a different approach which avoids stipulating key names.

> fixJSON <- function(json){
+   gsub('([^,{:]+):', '"\1":', json)
+ }

To make the download function more concise we’ll also define two URL templates.

> URL1 = 'http://www.google.com/finance/option_chain?q=%s&output=json'
> URL2 = 'http://www.google.com/finance/option_chain?q=%s&output=json&expy=%d&expm=%d&expd=%d'

And finally the download function itself, which proceeds through the following steps for a specified ticker symbol:

  1. downloads summary data;
  2. extracts expiration dates from the summary data and downloads the options data for each of those dates;
  3. concatenates these data into a single structure, neatens up the column names and selects a subset.
> getOptionQuotes <- function(symbol){
+   url = sprintf(URL1, symbol)
+   #
+   chain = fromJSON(fixJSON(getURL(url)))
+   #
+   options = mlply(chain$expirations, function(y, m, d) {
+     url = sprintf(URL2, symbol, y, m, d)
+     expiry = fromJSON(fixJSON(getURL(url)))
+     #
+     expiry$calls$type = "Call"
+     expiry$puts$type  = "Put"
+     #
+     prices = rbind(expiry$calls, expiry$puts)
+     #
+     prices$expiry = sprintf("%4d-%02d-%02d", y, m, d)
+     prices$underlying.price = expiry$underlying_price
+     #
+     prices
+   })
+   #
+   options = cbind(data.frame(symbol), rbind.fill(options))
+   #
+   names(options)1 = c("price", "bid", "ask", "open.interest")
+   #
+   for (col in c("strike", "price", "bid", "ask")) options[, col] = as.numeric(options[, col])
+   options[, "open.interest"] = suppressWarnings(as.integer(options[, "open.interest"]))
+   #
+   options[, c(1, 16, 15, 6, 10, 11, 17, 14, 12)]
+ }

Results

Let’s give it a whirl. (The data below were retrived on Saturday 10 January 2015).

> AAPL = getOptionQuotes("AAPL")
> nrow(AAPL)
[1] 1442

This is what the resulting data look like, with all available expiration dates consolidated into a single table:

> head(AAPL)
  symbol type     expiry price   bid   ask underlying.price strike open.interest
1   AAPL Call 2015-01-17 82.74 84.00 84.35           112.01  27.86           505
2   AAPL Call 2015-01-17 83.75 83.20 83.70           112.01  28.57          1059
3   AAPL Call 2015-01-17 84.75 82.60 82.90           112.01  29.29            13
4   AAPL Call 2015-01-17 81.20 81.80 82.40           112.01  30.00            29
5   AAPL Call 2015-01-17 83.20 81.10 81.65           112.01  30.71           150
6   AAPL Call 2015-01-17 79.75 80.35 80.75           112.01  31.43           396
> tail(AAPL)
     symbol type     expiry price   bid   ask underlying.price strike open.interest
1437   AAPL  Put 2017-01-20 47.57 46.45 48.20           112.01    150           108
1438   AAPL  Put 2017-01-20 51.00 50.45 52.30           112.01    155            72
1439   AAPL  Put 2017-01-20 52.45 54.55 56.50           112.01    160           203
1440   AAPL  Put 2017-01-20 58.55 58.65 60.75           112.01    165            76
1441   AAPL  Put 2017-01-20 67.90 63.15 64.50           112.01    170           167
1442   AAPL  Put 2017-01-20 68.00 67.35 68.25           112.01    175           239

There is a load of data there. To get an idea of what it looks like we can generate a couple of plots. Below is the Open Interest as a function of Strike Price across all expiration dates. The underlying price is indicated by the vertical dashed line. As one might expect, the majority of interest is associated with the next expiration date on 17 January 2015.

open-interest-strike-price-AAPL

It’s pretty clear that this is not the optimal way to look at these data and I would be extremely interested to hear from anybody with a suggestion for a better visualisation. Trying to look at all of the expiration dates together is probably the largest problem, so let’s focus our attention on those options which expire on 17 January 2015. Again the underlying price is indicated by a vertical dashed line.

open-interest-premium-strike-price-AAPL

This is the first time that I have seriously had a look at options data, but I will now readily confess to being intrigued. Having the data readily available, there is no reason not to explore further. Details to follow.

To leave a comment for the author, please follow the link and comment on their blog: Exegetic Analytics » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)