Nuts and Bolts of Quantstrat, Part I

September 8, 2014

(This article was first published on QuantStrat TradeR » R, and kindly contributed to R-bloggers)

Recently, I gave a webinar on some introductory quantstrat. Here’s the link.

To follow up on it, I’m going to write a multi-week series of posts explaining the details of parts of my demos, so that everyone has a chance to learn and follow along with my methodologies. To keep things simple, I’ll be using the usual RSI 20/80 filtered on SMA 200 demo. This post deals with the initial setup of any demo: code that will be largely similar from demo to demo.

Let’s examine this code:

require(IKTrading)
require(quantstrat)
require(PerformanceAnalytics)

initDate="1990-01-01"
from="2003-01-01"
to="2012-12-31"
options(width=70)

source("demoData.R")

The first three lines load the packages I use in my demos. In R, a package is loaded with a single line, though installation procedures vary by operating system: Windows is the least straightforward, while Macs can use Unix functionality to behave identically to Linux machines. It’s often good practice to place functions used repeatedly into a package, which is R’s own version of encapsulation and information hiding. Packages don’t have to be published on the internet, and many are used purely as local repositories. My IKTrading package started off as such a case; it’s simply a toolbox of functionality that doesn’t thematically belong elsewhere.

The next three lines, dealing with dates, all have separate purposes.

The initDate variable needs a date that must occur before the start of data in a backtest. If this isn’t the case, the portfolio will demonstrate a massive drawdown on the initialization date, and many of the backtest statistics will be misleading or nonsensical.
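
As a sanity check (not part of the original demo scripts), the ordering requirement can be tested before anything else runs. This is a minimal sketch in base R that reuses the initDate and from values shown above; the variable initDateIsEarlier is a name introduced only for this illustration.

```r
# Values copied from the demo setup above.
initDate <- "1990-01-01"
from <- "2003-01-01"

# yyyy-mm-dd strings convert cleanly to Date objects, which compare chronologically.
initDateIsEarlier <- as.Date(initDate) < as.Date(from)

# Stop early rather than discover a nonsensical drawdown after the backtest.
stopifnot(initDateIsEarlier)
```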

The from and to variables are the endpoints of the data that the demoData.R script fetches from Yahoo (or elsewhere). The format is yyyy-mm-dd: four-digit year, two-digit month (e.g. January is “01”), and two-digit day, in that order.
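
To make the date format concrete, here is a small base R sketch that parses a date string strictly as yyyy-mm-dd and pulls out its parts. The names parsedFrom, dateParts, and misformatted are illustrative, and the misformatted string is an invented example of what not to pass.

```r
from <- "2003-01-01"

# Parse strictly as four-digit year, two-digit month, two-digit day.
parsedFrom <- as.Date(from, format = "%Y-%m-%d")
dateParts <- c(year  = format(parsedFrom, "%Y"),
               month = format(parsedFrom, "%m"),
               day   = format(parsedFrom, "%d"))

# A string in a different layout does not match the format and yields NA.
misformatted <- as.Date("01/02/2003", format = "%Y-%m-%d")
```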

In some cases, I may write the code:

to=as.character(Sys.Date())

This sets the to date to the day on which I run the demonstration. Although this may affect replication of the results, some of the yearly metrics I’ve come to use let anyone see the exact day on which the data ends. In any case, when I use data up to the present, it’s usually to explore an indicator rather than to construct a fully-fledged trading system.

The options(width=70) line simply controls the width of output to my R console.

The source line executes another R file. Its argument is a file path, interpreted relative to the current working directory. So if a file called someFile.R sits in your current directory, you can write source("someFile.R"); if the file is in a different directory, use standard Unix path notation to reach it. For instance, if my working directory were “IKTrading” rather than “IKTrading/demo”, I would write source("demo/demoData.R").
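
The path behavior can be tried out safely in a temporary directory. The sketch below uses only base R; the IKTradingExample folder, the one-line stand-in for demoData.R, and the sourcedMessage variable are all made up for this illustration.

```r
# Hypothetical layout: <projectDir>/demo/demoData.R, built in a temp directory.
projectDir <- file.path(tempdir(), "IKTradingExample")
dir.create(file.path(projectDir, "demo"), recursive = TRUE, showWarnings = FALSE)
writeLines('sourcedMessage <- "ran demo/demoData.R"',
           file.path(projectDir, "demo", "demoData.R"))

# From the project root, the script is reached through its subdirectory.
oldWd <- setwd(projectDir)
source("demo/demoData.R", local = TRUE)

# From inside demo/, the bare file name is enough.
setwd("demo")
source("demoData.R", local = TRUE)

setwd(oldWd)  # restore the original working directory
```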

In order to obtain the data, let’s look at the demoData.R file once again.

options("getSymbols.warning4.0"=FALSE)
rm(list=ls(.blotter), envir=.blotter)

currency('USD')
Sys.setenv(TZ="UTC")

symbols <- c("XLB", #SPDR Materials sector
             "XLE", #SPDR Energy sector
             "XLF", #SPDR Financial sector
             "XLP", #SPDR Consumer staples sector
             "XLI", #SPDR Industrial sector
             "XLU", #SPDR Utilities sector
             "XLV", #SPDR Healthcare sector
             "XLK", #SPDR Tech sector
             "XLY", #SPDR Consumer discretionary sector
             "RWR", #SPDR Dow Jones REIT ETF
             
             "EWJ", #iShares Japan
             "EWG", #iShares Germany
             "EWU", #iShares UK
             "EWC", #iShares Canada
             "EWY", #iShares South Korea
             "EWA", #iShares Australia
             "EWH", #iShares Hong Kong
             "EWS", #iShares Singapore
             "IYZ", #iShares U.S. Telecom
             "EZU", #iShares MSCI EMU ETF
             "IYR", #iShares U.S. Real Estate
             "EWT", #iShares Taiwan
             "EWZ", #iShares Brazil
             "EFA", #iShares EAFE
             "IGE", #iShares North American Natural Resources
             "EPP", #iShares Pacific Ex Japan
             "LQD", #iShares Investment Grade Corporate Bonds
             "SHY", #iShares 1-3 year TBonds
             "IEF", #iShares 7-10 year TBonds
             "TLT" #iShares 20+ year Bonds
)

#SPDR ETFs first, iShares ETFs afterwards
if(!"XLB" %in% ls()) { 
  suppressMessages(getSymbols(symbols, from=from, to=to, src="yahoo", adjust=TRUE))  
}

stock(symbols, currency="USD", multiplier=1)

The first line simply suppresses the one-time message that getSymbols prints on first use. It makes no difference to how the demo runs.

The next line clears the blotter environment, which contains all portfolio and account objects. For instance, as my blotter environment is not currently cleared, when I type

ls(.blotter)

I get the results:

"account.DollarVsATRos"   "portfolio.DollarVsATRos"

Certainly, if you’re working with multiple portfolios at once in a live research environment, you may not want to wipe all of your previous work with every new demo, which is exactly what this line does.
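
The difference between a full wipe and a more selective cleanup can be sketched in base R with a stand-in environment. Here demoEnv plays the role of .blotter, the objects in it are empty placeholders, and portfolio.otherStrategy is a hypothetical second portfolio; this illustrates the rm pattern only, not blotter itself.

```r
# Stand-in for the .blotter environment, holding three placeholder objects.
demoEnv <- new.env()
assign("account.DollarVsATRos", list(), envir = demoEnv)
assign("portfolio.DollarVsATRos", list(), envir = demoEnv)
assign("portfolio.otherStrategy", list(), envir = demoEnv)  # hypothetical second portfolio

# Selective removal: drop only the objects belonging to one strategy name.
toDrop <- grep("DollarVsATRos$", ls(demoEnv), value = TRUE)
rm(list = toDrop, envir = demoEnv)
remainingAfterSelective <- ls(demoEnv)

# Full wipe, the same pattern demoData.R applies to .blotter.
rm(list = ls(demoEnv), envir = demoEnv)
remainingAfterWipe <- ls(demoEnv)
```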

The next two lines are critical.

currency('USD')
Sys.setenv(TZ="UTC")

Currency must be initialized for every demo. Thus far, I’ve yet to see it set to anything besides USD (U.S. dollars); however, the accounting analytics back end needs to know what currency the prices are listed in. The currency line cannot be skipped, or the demo will not work.

Next, the Sys.setenv(TZ="UTC") line is necessary because of what you see when you look at, say, the XLB data and the class of its index:

> head(XLB)
           XLB.Open XLB.High  XLB.Low XLB.Close XLB.Volume XLB.Adjusted
2003-01-02 15.83335 16.09407 15.68323  16.08617  401095.00        15.58
2003-01-03 16.03877 16.05457 15.91235  15.99926   79105.20        15.50
2003-01-06 16.10988 16.41011 16.10988  16.30740  377806.43        15.80
2003-01-07 16.38641 16.38641 16.18098  16.25209  390463.27        15.75
2003-01-08 16.19679 16.19679 15.80964  15.83335  201496.76        15.34
2003-01-09 15.95186 16.12568 15.92026  16.07827   82522.54        15.58

> class(index(XLB))
[1] "Date"

Since the index of the data is a Date object, the timezone has to be set to UTC for certain orders to work, such as chain rules (which contain stop losses and take profits), because a Date is treated as midnight UTC when it is converted to a timestamp. If the demo uses the system’s default timezone instead, the timestamps will not match, and there will be order failures.
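
The mismatch is easy to see in base R without any trading packages. This sketch converts one of the dates from the XLB output to a timestamp and formats it in two timezones; America/New_York is just an example of a non-UTC system timezone, and the variable names are illustrative.

```r
tradeDate <- as.Date("2003-01-02")

# Converting a Date to a timestamp gives midnight UTC on that day.
stamp <- as.POSIXct(tradeDate)

# The same instant, displayed in two different timezones.
inUTC     <- format(stamp, "%Y-%m-%d %H:%M", tz = "UTC")
inNewYork <- format(stamp, "%Y-%m-%d %H:%M", tz = "America/New_York")
```

Here inUTC is still on 2003-01-02, while inNewYork falls on the evening of the previous calendar day, which is the kind of disagreement that makes timestamps fail to line up.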

The symbols assignment is simply one long string vector. Here it is, once again:

symbols <- c("XLB", #SPDR Materials sector
             "XLE", #SPDR Energy sector
             "XLF", #SPDR Financial sector
             "XLP", #SPDR Consumer staples sector
             "XLI", #SPDR Industrial sector
             "XLU", #SPDR Utilities sector
             "XLV", #SPDR Healthcare sector
             "XLK", #SPDR Tech sector
             "XLY", #SPDR Consumer discretionary sector
             "RWR", #SPDR Dow Jones REIT ETF
             
             "EWJ", #iShares Japan
             "EWG", #iShares Germany
             "EWU", #iShares UK
             "EWC", #iShares Canada
             "EWY", #iShares South Korea
             "EWA", #iShares Australia
             "EWH", #iShares Hong Kong
             "EWS", #iShares Singapore
             "IYZ", #iShares U.S. Telecom
             "EZU", #iShares MSCI EMU ETF
             "IYR", #iShares U.S. Real Estate
             "EWT", #iShares Taiwan
             "EWZ", #iShares Brazil
             "EFA", #iShares EAFE
             "IGE", #iShares North American Natural Resources
             "EPP", #iShares Pacific Ex Japan
             "LQD", #iShares Investment Grade Corporate Bonds
             "SHY", #iShares 1-3 year TBonds
             "IEF", #iShares 7-10 year TBonds
             "TLT" #iShares 20+ year Bonds
)

There is nothing particularly unique about it. However, I structured the vector so as to be able to comment with the description of each ETF next to its ticker string for the purposes of clarity.

From there, the file gets the symbols from Yahoo. The extra code around the command simply suppresses any output to the screen and avoids repeated downloads. Here’s the code that does this:

#SPDR ETFs first, iShares ETFs afterwards
if(!"XLB" %in% ls()) { 
  suppressMessages(getSymbols(symbols, from=from, to=to, src="yahoo", adjust=TRUE))  
}

I can force the data-gathering process to rerun by removing XLB from my current working environment. If XLB is still there, the download is skipped altogether, which speeds up the backtest.
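
The guard pattern itself can be exercised without touching the network. In this base R sketch, dataEnv stands in for the working environment and fakeFetch is a hypothetical placeholder for the getSymbols call that only counts how often it runs.

```r
dataEnv <- new.env()   # stand-in for the working environment
fetchCount <- 0

# Hypothetical placeholder for the getSymbols call; it only records that it ran.
fakeFetch <- function() {
  fetchCount <<- fetchCount + 1
  assign("XLB", data.frame(Close = 16.09), envir = dataEnv)
}

# Same test as in demoData.R, pointed at the stand-in environment.
loadIfMissing <- function() {
  if (!"XLB" %in% ls(dataEnv)) fakeFetch()
}

loadIfMissing()              # XLB absent: fetch runs
loadIfMissing()              # XLB present: fetch skipped
rm("XLB", envir = dataEnv)   # force a refresh
loadIfMissing()              # fetch runs again
```

After these calls the placeholder fetch has run twice, not three times, since the second call found XLB already present.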

Lastly, the backtest needs the instrument specifications. This is the line of code to do so:

stock(symbols, currency="USD", multiplier=1)

Although it looks fairly trivial at the moment, once a backtest starts dealing with futures, contract multipliers, and other instrument-specific properties, this line becomes far less trivial than it looks.
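
To see why the multiplier matters, consider how it scales the dollar value of a position. The sketch below is plain arithmetic in base R, not a call into the instrument machinery; positionValue is a helper defined only for this illustration, and the futures price and multiplier of 50 are invented numbers.

```r
# Dollar value of a position: quantity x price x contract multiplier.
positionValue <- function(quantity, price, multiplier = 1) {
  quantity * price * multiplier
}

# An ETF, as in this demo, has a multiplier of 1.
etfValue <- positionValue(quantity = 100, price = 16.09)

# A hypothetical futures contract with a multiplier of 50.
futuresValue <- positionValue(quantity = 1, price = 2000, multiplier = 50)
```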

Moving back to the main scripts, here is the rest of the initialization boilerplate:

#trade sizing and initial equity settings
tradeSize <- 100000
initEq <- tradeSize*length(symbols)

strategy.st <- portfolio.st <- account.st <- "DollarVsATRos"
rm.strat(strategy.st)
initPortf(portfolio.st, symbols=symbols, initDate=initDate, currency='USD')
initAcct(account.st, portfolios=portfolio.st, initDate=initDate, currency='USD',initEq=initEq)
initOrders(portfolio.st, initDate=initDate)
strategy(strategy.st, store=TRUE)

The tradeSize and initEq variables are necessary in order to compute returns at the end of a backtest. Furthermore, tradeSize is necessary for the osDollarATR order-sizing function.
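
As a rough illustration of how initEq feeds into returns, the arithmetic looks like this. The count of 30 matches the symbols vector shown earlier; hypotheticalPnL is an invented figure, and dividing it by initEq is a simplified stand-in for the return calculations the analytics perform.

```r
tradeSize <- 100000
numSymbols <- 30             # length of the symbols vector in demoData.R
initEq <- tradeSize * numSymbols

hypotheticalPnL <- 150000    # invented figure, for illustration only
returnOnEquity <- hypotheticalPnL / initEq
```

With these numbers, initEq comes to 3,000,000 and the hypothetical profit corresponds to a 5% return on initial equity.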

Next, I name the strategy, portfolio, and account, all with the same name. The x <- y <- z <- "xyz" format is a multi-assignment syntax that should be used sparingly, in cases such as initializing multiple xts objects of the same length. It also means that all three variables will be (at least initially) identical to one another.
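
A short base R check shows what the chained assignment does and does not imply. The replacement name someOtherAccount is hypothetical, as are the helper variables allSame and stillOriginal.

```r
strategy.st <- portfolio.st <- account.st <- "DollarVsATRos"

# Immediately after the chained assignment, all three hold the same value.
allSame <- identical(strategy.st, portfolio.st) &&
  identical(portfolio.st, account.st)

# Reassigning one name later does not touch the others.
account.st <- "someOtherAccount"  # hypothetical name
stillOriginal <- identical(strategy.st, "DollarVsATRos")
```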

Next, the removal of the strategy is necessary for rerunning the strategy. If the strategy object exists, and a user attempts to rerun the demo, the demo will crash. Always make sure to remove the strategy.

Next, we have the three initialization steps. As the account takes in the names of the pre-existing portfolios, and orders must be initialized after the portfolio, the order I choose to initialize is portfolio, account, and then orders.

To initialize the portfolio, one needs a name for the portfolio, a vector of character strings representing the symbols (passed in as the symbols argument), an initial date (initDate), and a currency. This currency was defined earlier in the demoData.R file.

The account initialization replaces the symbols argument with a portfolios argument, and adds an initEq argument, from which to compute returns.

Lastly, the orders initialization needs only a portfolio to reference, and a date from which to begin transactions (initDate, which is earlier than the beginning of the data).

Finally, we initialize the strategy with the line:

strategy(strategy.st, store=TRUE)

This is the object into which we put all our indicators, signals, and rules. Without this line, none of the indicators, signals, and rules will know what strategy object to look for, and the demo will crash.

This concludes the initial boilerplate walkthrough. Next: parameters and indicators.

Thanks for reading.
