Before we can do any quant analysis, we need relevant data, and the web is a good place to start. Sometimes the data can be downloaded in a standard format such as .csv files, or is available via an API (e.g. http://www.quandl.com), but often you’ll need to scrape it directly from web pages.
In this post I’ll show how to obtain the US Federal Reserve FOMC Announcement dates (i.e. those when a statement is published after the meeting) from their web page http://www.federalreserve.gov/monetarypolicy/fomccalendars.htm. At the time of writing, this web page had dates from 2009 onward.
First, install and load the httr and XML R packages.
install.packages(c("httr", "XML"), repos = "http://cran.us.r-project.org")
library(httr)
library(XML)
Next, run the following R code.
# get and parse web page content
webpage <- content(GET("http://www.federalreserve.gov/monetarypolicy/fomccalendars.htm"),
                   as = "text")
xhtmldoc <- htmlParse(webpage)
# get statement urls and sort them
statements <- xpathSApply(xhtmldoc, "//td[@class='statement2']/a", xmlGetAttr,
                          "href")
statements <- sort(statements)
# get dates from statement urls
fomcdates <- sapply(statements, function(x) substr(x, 28, 35))
fomcdates <- as.Date(fomcdates, format = "%Y%m%d")
# save results in working directory
save(list = c("statements", "fomcdates"), file = "fomcdates.RData")
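The substr call above relies on the date always sitting at characters 28 to 35 of the URL. As a minimal sketch of a less position-dependent alternative, the snippet below pulls out the first run of eight digits with a regular expression. It assumes every statement URL embeds its date as YYYYMMDD, uses three sample paths copied from the output in this post rather than a live download, and the name fomcdates_regex is purely illustrative.

```r
# Sample statement paths copied from the output shown in this post
statements <- c("/newsevents/press/monetary/20090128a.htm",
                "/newsevents/press/monetary/20090318a.htm",
                "/newsevents/press/monetary/20090429a.htm")

# Extract the first run of eight digits from each path
# (assumption: every statement URL embeds its date as YYYYMMDD)
datestrings <- regmatches(statements, regexpr("[0-9]{8}", statements))

# Convert the extracted strings to Date objects
fomcdates_regex <- as.Date(datestrings, format = "%Y%m%d")
print(fomcdates_regex)
```

Note that regmatches silently drops any element without a match, so it is worth comparing length(datestrings) with length(statements) if the page layout may have changed.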
Finally, check the results by looking at their structures and first few values.
# check data
str(statements)
head(statements)
str(fomcdates)
head(fomcdates)
You should see output similar to the following.
##  chr [1:49] "/newsevents/press/monetary/20090128a.htm" ...
## [1] "/newsevents/press/monetary/20090128a.htm"
## [2] "/newsevents/press/monetary/20090318a.htm"
## [3] "/newsevents/press/monetary/20090429a.htm"
## [4] "/newsevents/press/monetary/20090624a.htm"
## [5] "/newsevents/press/monetary/20090812a.htm"
## [6] "/newsevents/press/monetary/20090923a.htm"
##  Date[1:49], format: "2009-01-28" "2009-03-18" "2009-04-29" "2009-06-24" ...
## [1] "2009-01-28" "2009-03-18" "2009-04-29" "2009-06-24" "2009-08-12"
## [6] "2009-09-23"
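Beyond eyeballing the first few values, a couple of automated checks can catch a silent parsing failure. The sketch below is only illustrative: it uses the first six dates from the output above rather than the full scraped vector, and the gaps variable is a name I have made up for the example.

```r
# First six dates from the output shown in this post
fomcdates <- as.Date(c("2009-01-28", "2009-03-18", "2009-04-29",
                       "2009-06-24", "2009-08-12", "2009-09-23"))

# No date should have failed to parse
stopifnot(!anyNA(fomcdates))

# Dates should be strictly increasing after the sort
stopifnot(all(diff(fomcdates) > 0))

# Days between consecutive announcements; scheduled meetings are
# usually several weeks apart, so a very small gap would be suspicious
gaps <- as.integer(diff(fomcdates))
print(gaps)
```

If the web page layout changes, the fixed character positions used to extract the dates are the most likely thing to break, and the NA check should flag it.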
So what can we do with this data? Here are a few ideas:
- Go deeper: download the actual statements and use natural language processing (NLP) to analyze them, e.g. to classify positive or negative sentiment. This is quite a complex task, but it is on my list of research topics for 2015…
- Collect price data, e.g. Treasury yields or the S&P 500, and do some initial visual exploratory analysis around the FOMC announcement dates
- Conduct an event study, as the academics do, to identify whether there are any statistically significant patterns around these dates
- Incorporate the dates into a trading or investment program and backtest to see whether there are economically significant patterns, i.e. tradeable alpha opportunities
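To give a flavour of the exploratory and event-study ideas, here is a rough sketch of lining up returns around each announcement date. Everything in it apart from the six dates is an assumption made for illustration: the returns are random numbers rather than real market data, the trading calendar is simply all weekdays in 2009 (holidays are ignored), and names such as tradingdays and eventwindow are my own.

```r
set.seed(42)  # make the synthetic data reproducible

# First six announcement dates from the output shown in this post
fomcdates <- as.Date(c("2009-01-28", "2009-03-18", "2009-04-29",
                       "2009-06-24", "2009-08-12", "2009-09-23"))

# Hypothetical trading calendar: all weekdays in 2009 (ignores holidays)
alldays <- seq(as.Date("2009-01-01"), as.Date("2009-12-31"), by = "day")
tradingdays <- alldays[!as.POSIXlt(alldays)$wday %in% c(0, 6)]

# Synthetic daily returns; replace with real price data in practice
returns <- rnorm(length(tradingdays), mean = 0, sd = 0.01)

# Position of each announcement date in the trading calendar
eventidx <- match(fomcdates, tradingdays)

# Returns from two days before to two days after each announcement:
# one row per event, one column per offset
offsets <- -2:2
eventwindow <- sapply(offsets, function(k) returns[eventidx + k])
colnames(eventwindow) <- paste0("t", ifelse(offsets >= 0, "+", ""), offsets)
rownames(eventwindow) <- format(fomcdates)

# Average return at each offset across the six events
print(round(colMeans(eventwindow), 4))
```

With random returns the averages are of course just noise. A proper event study would also need a model of normal returns and a significance test, which this sketch leaves out.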