Co-integration and Pairs Trading

July 29, 2018
By

(This article was first published on S+/R – Yet Another Blog in Statistical Computing, and kindly contributed to R-bloggers)

The co-integration is an important statistical concept behind the statistical arbitrage strategy named “Pairs Trading”. While projecting a stock price with time series models is by all means difficult, it is technically feasible to find a pair of (or even a portfolio of) stocks sharing the common trend such that a linear combination of two series is stationary, which is so-called co-integration. The underlying logic of Pairs Trading is to monitor movements of co-integrated stocks and to look for trading opportunities when the divergence presents. Under the mean-reversion assumption, the stock price would tend to move back to the long-term equilibrium. As a result, the spread between two co-integrated stock prices would eventually converge. Furthermore, given the stationarity of the spread between co-integrated stocks, it becomes possible to forecast such spread with time series models.

Below shows a R utility function helping to identify pairwise co-integrations based upon the Johansen Test out of a arbitrary number of stock prices provided in a list of tickers.

For instance, based on a starting date on 2010/01/01 and a list of tickers for major US banks, we are able to identify 23 pairs of co-integrated stock prices out of 78 pairwise combinations. It is interesting to see that stock prices of two regional players, e.g. Fifth Third and M&T, are highly co-integrated, as visualized in the chart below.

pkgs <- list("quantmod", "doParallel", "foreach", "urca")
lapply(pkgs, require, character.only = T)
registerDoParallel(cores = 4)

jtest <- function(t1, t2) {
  start <- sd
  getSymbols(t1, from = start)
  getSymbols(t2, from = start)
  j <- summary(ca.jo(cbind(get(t1)[, 6], get(t2)[, 6])))
  r <- data.frame(stock1 = t1, stock2 = t2, stat = [email protected][2])
  r[, c("pct10", "pct5", "pct1")] <- [email protected][2, ]
  return(r)
}

pair <- function(lst) {
  d2 <- data.frame(t(combn(lst, 2)))
  stat <- foreach(i = 1:nrow(d2), .combine = rbind) %dopar% jtest(as.character(d2[i, 1]), as.character(d2[i, 2]))
  stat <- stat[order(-stat$stat), ]
  # THE PIECE GENERATING * CAN'T BE DISPLAYED PROPERLY IN WORDPRESS 
  rownames(stat) <- NULL
  return(stat)
}

sd <- "2010-01-01"
tickers <- c("FITB", "BBT", "MTB", "STI", "PNC", "HBAN", "CMA", "USB", "KEY", "JPM", "C", "BAC", "WFC")
pair(tickers)

   stock1 stock2      stat pct10 pct5  pct1 coint
1     STI    JPM 27.207462 12.91 14.9 19.19  ***
2    FITB    MTB 21.514142 12.91 14.9 19.19  ***
3     MTB    KEY 20.760885 12.91 14.9 19.19  ***
4    HBAN    KEY 19.247719 12.91 14.9 19.19  ***
5       C    BAC 18.573168 12.91 14.9 19.19   **
6    HBAN    JPM 18.019051 12.91 14.9 19.19   **
7    FITB    BAC 17.490536 12.91 14.9 19.19   **
8     PNC   HBAN 16.959451 12.91 14.9 19.19   **
9    FITB    BBT 16.727097 12.91 14.9 19.19   **
10    MTB   HBAN 15.852456 12.91 14.9 19.19   **
11    PNC    JPM 15.822610 12.91 14.9 19.19   **
12    CMA    BAC 15.685086 12.91 14.9 19.19   **
13   HBAN    BAC 15.446149 12.91 14.9 19.19   **
14    BBT    MTB 15.256334 12.91 14.9 19.19   **
15    MTB    JPM 15.178646 12.91 14.9 19.19   **
16    BBT   HBAN 14.808770 12.91 14.9 19.19    *
17    KEY    BAC 14.576440 12.91 14.9 19.19    *
18   FITB    JPM 14.272424 12.91 14.9 19.19    *
19    STI    BAC 14.253971 12.91 14.9 19.19    *
20   FITB    PNC 14.215647 12.91 14.9 19.19    *
21    MTB    BAC 13.891615 12.91 14.9 19.19    *
22    MTB    PNC 13.668863 12.91 14.9 19.19    *
23    KEY    JPM 12.952239 12.91 14.9 19.19    *

Screenshot from 2018-07-29 16-55-27

Screenshot from 2018-07-29 15-09-11

To leave a comment for the author, please follow the link and comment on their blog: S+/R – Yet Another Blog in Statistical Computing.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)