Last Post For A While, And Two Premium (Cheap) Databases
This will be my last post on this blog for an indefinite length of time. It also includes a script to query Quandl’s SCF database, an update on my earlier attempt to use free futures data from Quandl’s CHRIS database, which suffered from data integrity issues even after I tried to clean it. Also provided is a small tutorial on using Quandl’s EOD database, for those who take issue with Yahoo’s daily data.
So, first off, the news…as some of you may have already heard from meeting me in person at R/Finance 2015 (which was terrific…interesting presentations, good people, good food, good drinks), effective June 8th, I will be starting employment in the New York area as a quantitative research analyst, and part of the agreement is that this blog becomes an archive of my work, so that I may focus my full energies on my full-time position. That is, it’s not going anywhere, so for those that are recent followers, you now have a great deal of time to catch up on the entire archive, which, including this post, will be 62 posts. Topics covered include:
Quantstrat — its basics, and certain strategies coded using it, namely those based off of John Ehlers’s digital signal processing algorithms, along with custom order-sizing functions. A small aside on pairs trading is included as well.
Asset rotation — flexible asset allocation, elastic asset allocation, and most recently, classic asset allocation (aka the constrained critical line algorithm).
Seeking Alpha ideas — both Logical Invest and Harry Long’s work, along with Cliff Smith’s Quarterly Tactical Strategy (QTS). The Logical Invest algorithms do what they set out to do, but in some cases, are dependent on dividends to drive returns. By contrast, Harry Long’s ideas are more thought processes and proofs of concept, as opposed to complete strategies, often depending on ETFs with inception dates after the financial crisis passed (I used some creativity to extend those backtests to pre-crisis timelines). I’ve also collaborated with Harry Long privately, and what I’ve seen privately has better risk/reward than anything he has revealed in public, which I find impressive given the low turnover rate of such portfolios.
Volatility trading — XIV and VXX, namely, and a strategy around these two instruments that has done well out of sample.
Other statistical ideas, such as robustness heuristics and change point detection.
Topics I would have liked to have covered but didn’t get around to:
Most Japanese trading methods — Ichimoku and Heiken Ashi, among other things. Both are in my IKTrading package; I just never got around to showing them off. I did cover a hammer trading system, which did not perform as I would have liked it to.
Larry Connors’s mean reversion strategies — he has a book on trading ETFs, and another one he wrote before that. The material provided on this blog is sufficient for anyone to use to code those strategies.
The PortfolioAnalytics package — PortfolioAnalytics is to portfolio management strategies what quantstrat is to signal-based, individual-instrument trading strategies (and a lot more). Although strategies such as equal-weight ranking perform well by some standards, they are only the tip of the iceberg. PortfolioAnalytics is, to my awareness, cutting-edge portfolio management technology that can run the gamut from quick classic quadratic optimization to random-search global optimization portfolios (albeit those take more time to compute).
Now, onto the second part of this post, which is a pair of premium databases. They’re available from Quandl, and cost $50/month. As far as I’ve been able to tell, the data quality of the futures database (SCF) is vastly better than that of the CHRIS database, which can miss (or corrupt) chunks of data. The good news is that free users can actually query these databases (or maybe all databases in total, I’m not sure) 150 times in a 60-day period. The futures script sends out 40 of these 150 queries, which may be all that is necessary if one intends to use it for some form of monthly turnover trading strategy.
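With a budget of 150 queries, it may be worth saving each download to disk so that rerunning a script does not spend the same query twice. The following is a minimal sketch of that idea, not part of the original script: cachedFetch, its fetch argument, and cacheDir are hypothetical names, and fetch stands in for whatever function actually calls Quandl.

```r
# Hypothetical helper: cache each download on disk so that reruns do not
# consume additional queries. `fetch` is any function that takes a dataset
# code and returns an R object (for example, a wrapper around Quandl()).
cachedFetch <- function(code, fetch, cacheDir = tempdir()) {
  # Replace characters that are awkward in file names, e.g. the "/" in "SCF/CME_CL1_ON".
  fileName <- paste0(gsub("[^A-Za-z0-9_]", "_", code), ".rds")
  path <- file.path(cacheDir, fileName)
  if (file.exists(path)) {
    return(readRDS(path))  # served from disk, no query spent
  }
  out <- fetch(code)       # one query spent
  saveRDS(out, path)
  out
}

# Possible use with a fetch function of your own (commented out: needs a token):
# CME_CL1 <- cachedFetch("SCF/CME_CL1_ON", function(code) Quandl(code, type = "xts"))
```

Note that tempdir() is cleared when the R session ends, so a persistent directory is the more useful choice for cacheDir in practice.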
Here’s the script for the SCF (futures) database. There are two caveats here:
1) The prices are on a per-contract rate. Notional values in futures trading, to my understanding, are vastly larger than one contract, to the point that getting integer quantities is no small assumption.
2) According to Alexios Ghalanos (AND THIS IS REALLY IMPORTANT), R’s GARCH guru, and one of the most prominent quants in the R/Finance community, for some providers, the Open, High, and Low values in futures data may not be based off of U.S. traditional pit trading hours in the same way that OHLC in equities/ETFs are based off of the 9:30 AM – 4:00 PM hours, but rather, extended trading hours. This means that there’s very low liquidity around open in futures, and that the high and low are also based off of these low-liquidity times as well. I am unsure if Quandl’s SCF database uses extended hours open-high-low data (low liquidity), or traditional pit hours (high liquidity), and I am hoping a representative from Quandl will clarify this in the comments section for this post. In any case, I just wanted to make sure that readers are aware of this issue.
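One way to read the first caveat: the notional value of a single contract is the quoted price times the contract’s multiplier, so rounding a position down to whole contracts can leave a large part of an allocation unfilled. Here is a small sketch of that arithmetic. The function wholeContracts and the numbers fed to it are purely illustrative assumptions, not market data, and the correct multiplier for any instrument has to come from the exchange’s contract specification.

```r
# Hypothetical sizing helper: how many whole contracts fit a target notional?
# `multiplier` is the number of units of the underlying per contract.
wholeContracts <- function(targetNotional, price, multiplier) {
  notionalPerContract <- price * multiplier
  n <- floor(targetNotional / notionalPerContract)
  list(contracts = n,
       actualNotional = n * notionalPerContract,
       shortfall = targetNotional - n * notionalPerContract)
}

# Illustrative numbers only: a 100,000 allocation, a quoted price of 60,
# and a multiplier of 1,000 units per contract.
sizing <- wholeContracts(targetNotional = 100000, price = 60, multiplier = 1000)
print(sizing)
```

With these made-up inputs, one contract already represents 60,000 of notional, so only one contract fits and 40,000 of the intended allocation goes unused.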
In any case, here’s the data fetch for the Stevens Continuous Futures (SCF) database from Quandl. All of these are front-month contracts, with unadjusted prices and rolls on the open interest cross. Note that in order for this script to work, you must supply Quandl with your authorization token, which takes the form of something like this:
Quandl.auth("yourTokenHere")

require(Quandl)
require(xts) # needed for xts() and the "from::to" date subsetting below

Quandl.auth("yourTokenHere")
authCode <- "yourTokenHere"

# Fetch one SCF series and return it as an xts object,
# optionally trimmed to the from/to dates.
quandlSCF <- function(code, authCode, from = NA, to = NA) {
  dataCode <- paste0("SCF/", code)
  out <- Quandl(dataCode, authCode = authCode)
  out <- xts(out[, -1], order.by=out$Date) # first column is the date index
  colnames(out)[4] <- "Close"
  colnames(out)[6] <- "PrevDayOpInt"
  if(!is.na(from)) {
    out <- out[paste0(from, "::")]
  }
  if(!is.na(to)) {
    out <- out[paste0("::", to)]
  }
  return(out)
}

#Front open-interest cross
from <- NA
to <- NA

#Energies
CME_CL1 <- quandlSCF("CME_CL1_ON", authCode = authCode, from = from, to = to) #crude
CME_NG1 <- quandlSCF("CME_NG1_ON", authCode = authCode, from = from, to = to) #natgas
CME_HO1 <- quandlSCF("CME_HO1_ON", authCode = authCode, from = from, to = to) #heatOil
CME_RB1 <- quandlSCF("CME_RB1_ON", authCode = authCode, from = from, to = to) #RBob
ICE_B1 <- quandlSCF("ICE_B1_ON", authCode = authCode, from = from, to = to) #Brent
ICE_G1 <- quandlSCF("ICE_G1_ON", authCode = authCode, from = from, to = to) #GasOil

#Grains
CME_C1 <- quandlSCF("CME_C1_ON", authCode = authCode, from = from, to = to) #Chicago Corn
CME_S1 <- quandlSCF("CME_S1_ON", authCode = authCode, from = from, to = to) #Chicago Soybeans
CME_W1 <- quandlSCF("CME_W1_ON", authCode = authCode, from = from, to = to) #Chicago Wheat
CME_SM1 <- quandlSCF("CME_SM1_ON", authCode = authCode, from = from, to = to) #Chicago Soybean Meal
CME_KW1 <- quandlSCF("CME_KW1_ON", authCode = authCode, from = from, to = to) #Kansas City Wheat
CME_BO1 <- quandlSCF("CME_BO1_ON", authCode = authCode, from = from, to = to) #Chicago Soybean Oil

#Softs
ICE_SB1 <- quandlSCF("ICE_SB1_ON", authCode = authCode, from = from, to = to) #Sugar No. 11
ICE_KC1 <- quandlSCF("ICE_KC1_ON", authCode = authCode, from = from, to = to) #Coffee
ICE_CC1 <- quandlSCF("ICE_CC1_ON", authCode = authCode, from = from, to = to) #Cocoa
ICE_CT1 <- quandlSCF("ICE_CT1_ON", authCode = authCode, from = from, to = to) #Cotton

#Other Ags
CME_LC1 <- quandlSCF("CME_LC1_ON", authCode = authCode, from = from, to = to) #Live Cattle
CME_LN1 <- quandlSCF("CME_LN1_ON", authCode = authCode, from = from, to = to) #Lean Hogs

#Precious Metals
CME_GC1 <- quandlSCF("CME_GC1_ON", authCode = authCode, from = from, to = to) #Gold
CME_SI1 <- quandlSCF("CME_SI1_ON", authCode = authCode, from = from, to = to) #Silver
CME_PL1 <- quandlSCF("CME_PL1_ON", authCode = authCode, from = from, to = to) #Platinum
CME_PA1 <- quandlSCF("CME_PA1_ON", authCode = authCode, from = from, to = to) #Palladium

#Base
CME_HG1 <- quandlSCF("CME_HG1_ON", authCode = authCode, from = from, to = to) #Copper

#Currencies
CME_AD1 <- quandlSCF("CME_AD1_ON", authCode = authCode, from = from, to = to) #Ozzie
CME_CD1 <- quandlSCF("CME_CD1_ON", authCode = authCode, from = from, to = to) #Canadian Dollar
CME_SF1 <- quandlSCF("CME_SF1_ON", authCode = authCode, from = from, to = to) #Swiss Franc
CME_EC1 <- quandlSCF("CME_EC1_ON", authCode = authCode, from = from, to = to) #Euro
CME_BP1 <- quandlSCF("CME_BP1_ON", authCode = authCode, from = from, to = to) #Pound
CME_JY1 <- quandlSCF("CME_JY1_ON", authCode = authCode, from = from, to = to) #Yen
ICE_DX1 <- quandlSCF("ICE_DX1_ON", authCode = authCode, from = from, to = to) #Dollar Index

#Equities
CME_ES1 <- quandlSCF("CME_ES1_ON", authCode = authCode, from = from, to = to) #Emini
CME_MD1 <- quandlSCF("CME_MD1_ON", authCode = authCode, from = from, to = to) #Midcap 400
CME_NQ1 <- quandlSCF("CME_NQ1_ON", authCode = authCode, from = from, to = to) #Nasdaq 100
ICE_RF1 <- quandlSCF("ICE_RF1_ON", authCode = authCode, from = from, to = to) #Russell Smallcap
CME_NK1 <- quandlSCF("CME_NK1_ON", authCode = authCode, from = from, to = to) #Nikkei

#Bonds/rates
CME_FF1 <- quandlSCF("CME_FF1_ON", authCode = authCode, from = from, to = to) #30-day fed funds
CME_ED1 <- quandlSCF("CME_ED1_ON", authCode = authCode, from = from, to = to) #3 Mo. Eurodollar/TED Spread
CME_FV1 <- quandlSCF("CME_FV1_ON", authCode = authCode, from = from, to = to) #Five Year TNote
CME_TY1 <- quandlSCF("CME_TY1_ON", authCode = authCode, from = from, to = to) #Ten Year Note
CME_US1 <- quandlSCF("CME_US1_ON", authCode = authCode, from = from, to = to) #30 year bond
In this case, I just can’t give away my token. You’ll have to replace that with your own, which every account has. However, once again, individuals not subscribed to these databases need to pay $50/month.
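For those who would rather not maintain forty near-identical lines, the same fetch can be driven from a vector of codes. This is only a sketch of one way to do it: fetchAll and its fetch argument are hypothetical names of mine, and the fetch function passed in would be something like a one-argument wrapper around quandlSCF.

```r
# Sketch: fetch many series in one loop and return them as a named list.
# `fetch` stands in for a function such as
#   function(code) quandlSCF(code, authCode = authCode)
fetchAll <- function(codes, fetch) {
  out <- lapply(codes, fetch)
  # Name each element after its code minus the trailing "_ON", e.g. "CME_CL1".
  names(out) <- sub("_ON$", "", codes)
  out
}

energyCodes <- c("CME_CL1_ON", "CME_NG1_ON", "CME_HO1_ON")

# Commented out because it needs a token and spends three queries:
# energies <- fetchAll(energyCodes, function(code) quandlSCF(code, authCode = authCode))
# list2env(energies, envir = globalenv()) # optional: recreates CME_CL1, CME_NG1, CME_HO1
```

Keeping the series in a list also makes it easier to apply the same computation to every contract with lapply, rather than repeating it per variable.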
Lastly, I’d like to show the Quandl EOD database. This is identical in functionality to Yahoo’s, but may be (hopefully!) more accurate. I have never used this database on this blog because the number one rule has always been that readers must be able to replicate all analysis for free, but for those who doubt the quality of Yahoo’s data, they may wish to look at Quandl’s EOD database.
This is how it works, with an example for SPY.
out <- Quandl("EOD/SPY", start_date="1999-12-31", end_date="2005-12-31", type = "xts")
And here’s some output.
> head(out)
             Open   High    Low  Close   Volume Dividend Split Adj_Open Adj_High  Adj_Low Adj_Close Adj_Volume
1999-12-31 146.80 147.50 146.30 146.90  3172700        0     1 110.8666 111.3952 110.4890  110.9421    3172700
2000-01-03 148.25 148.25 143.88 145.44  8164300        0     1 111.9701 111.9701 108.6695  109.8477    8164300
2000-01-04 143.50 144.10 139.60 139.80  8089800        0     1 108.3828 108.8359 105.4372  105.5882    8089800
2000-01-05 139.90 141.20 137.30 140.80 12177900        0     1 105.6631 106.6449 103.6993  106.3428   12177900
2000-01-06 139.60 141.50 137.80 137.80  6227200        0     1 105.4327 106.8677 104.0733  104.0733    6227200
2000-01-07 140.30 145.80 140.10 145.80  8066500        0     1 105.9690 110.1231 105.8179  110.1231    8066500
To note, this data is not automatically assigned to “SPY” as quantmod’s “getSymbols” function fetching from Yahoo would automatically do. Also, note that when calling the Quandl function on the EOD database, you automatically obtain both adjusted and unadjusted prices. One thing I am not sure is as easily done through Quandl’s API is adjusting prices for splits but not dividends. But, for what it’s worth, there it is. So for those that take issue with the quality of Yahoo data, you may wish to look at Quandl’s EOD database for $50/month.
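On the split-only question, one possibility is to do the adjustment by hand from the unadjusted prices and the Split column. The sketch below is a guess at how that could work rather than anything Quandl documents: splitAdjust is a hypothetical helper, and it assumes the Split column holds 1 on ordinary days and the split ratio (new shares per old share, so 2 for a 2-for-1) on the day the split takes effect. That assumption should be checked against the database’s documentation before relying on it.

```r
# Sketch: back-adjust prices for splits only, ignoring dividends.
# ASSUMPTION: `split` is 1 on ordinary days and the split ratio
# (new shares per old share, e.g. 2 for a 2-for-1) on the effective date.
splitAdjust <- function(price, split) {
  # Product of the split ratios on the current date and all later dates...
  cumFromHere <- rev(cumprod(rev(split)))
  # ...shifted by one day, so each date is divided only by strictly later splits.
  futureFactor <- c(cumFromHere[-1], 1)
  price / futureFactor
}

# Toy data (not real quotes): a 2-for-1 split takes effect on the third day.
close <- c(100, 102, 51, 52)
split <- c(1, 1, 2, 1)
adjusted <- splitAdjust(close, split)
print(adjusted)
```

On the xts object above, this would be called as splitAdjust(as.numeric(out$Close), as.numeric(out$Split)), with the same caveat about how the Split column is defined.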
So…that’s it. From this point on, this blog is an archive of my work that will stay up; it’s not going anywhere. However, I won’t be updating it or answering questions on this blog. For those that have any questions about functionality, I highly recommend posting questions to the R-SIG Finance mailing list. It’s been a pleasure sharing my thoughts and work with you, and I’m glad I’ve garnered the attention of many intelligent individuals, from those that have provided me with data, to those that have built upon my work, to those that have hired me for consulting (and now full-time) opportunities. I also hope that some of my work displayed here made it to other trading and/or asset management firms. I am very grateful for all of the feedback, comments, data, and opportunities I’ve received along the way.
Once again, thank you so much for reading. It’s been a pleasure.