Accelerating R code: Computing Implied Volatilities Orders of Magnitude Faster

October 25, 2012
(This article was first published on Thinking inside the box , and kindly contributed to R-bloggers)

This blog, together with Romain’s, is one of the main homes of stories about how
Rcpp can help with getting code to run faster in the context of the
R system for statistical programming and analysis. By making it easier to bring existing C or C++ code
to R, or equally to extend R with new C++ code, Rcpp can help in getting stuff done.
And it is often fairly straightforward to do so.

In this context, I have a nice new example. And for once, it is work-related. I generally cannot share too much of what we do there as this is, well,
proprietary, but this example I can. The other day, I was constructing (large) time series of implied volatilities. Implied volatilities
can be thought of as the complement to an option's price: given a price (and all other observables, which can be taken as fixed), we compute an
implied volatility (typically via the standard Black-Scholes model).
Given a changed implied volatility, we infer a new price — see this
for more details. In essence, it opens the door to all sorts of arbitrage and relative value pricing adventures.
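For reference (this is not spelled out in the post itself), the standard Black-Scholes price of a European call, taking the cost of carry equal to the risk-free rate r, is:

```latex
C(\sigma) = S\,\Phi(d_1) - X e^{-rT}\,\Phi(d_2),
\qquad d_1 = \frac{\ln(S/X) + (r + \sigma^2/2)\,T}{\sigma\sqrt{T}},
\qquad d_2 = d_1 - \sigma\sqrt{T}
```

where Phi is the standard normal distribution function. The implied volatility is simply the sigma for which C(sigma) equals the observed price.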

Now, we observe prices fairly frequently to create somewhat sizeable time series of option prices. And each price corresponds to one matching implied
volatility, and for each such price we have to solve a small and straightforward root-finding problem: compute the implied volatility matching the
price. This is usually done with an iterative root finder.

The problem comes from the fact that (i) we have to do this over and over for large data sets, and (ii) each solve involves a number of callbacks
from the (generic) solver to the (standard) option pricer.

So our first approach was to just call the corresponding function GBSVolatility from the
fOptions package from the trusted
Rmetrics project by Diethelm Wuertz et al. This worked fine, but even with the usual tricks of splitting over
multiple cores/machines, it simply took too long for the resolution and data amount we desired. One of the problems is that this function (which
uses the proper uniroot root finder in R) is not inefficient per se, but simply makes too many function calls back to the option pricer, as
can be seen from a quick glance at the code. The helper function .fGBSVolatility gets called time and time again:

R> GBSVolatility
function (price, TypeFlag = c("c", "p"), S, X, Time, r, b, tol = .Machine$double.eps,
    maxiter = 10000)
{
    TypeFlag = TypeFlag
    volatility = uniroot(.fGBSVolatility, interval = c(-10, 10),
        price = price, TypeFlag = TypeFlag, S = S, X = X, Time = Time,
        r = r, b = b, tol = tol, maxiter = maxiter)$root
    volatility
}
<environment: namespace:fOptions>
R>
R> .fGBSVolatility
function (x, price, TypeFlag, S, X, Time, r, b, ...)
{
    GBS = GBSOption(TypeFlag = TypeFlag, S = S, X = X, Time = Time,
        r = r, b = b, sigma = x)@price
    price - GBS
}
<environment: namespace:fOptions>

So the next idea was to try the corresponding function from my RQuantLib package which
brings (parts of) QuantLib to R. That was already lots faster. Now, QuantLib is pretty big and so
is RQuantLib, and we felt it may not make sense to install it on a number of machines just for this simple problem. So one evening this week I
noodled around for an hour or two and combined (i) a basic Black/Scholes calculation and (ii) a standard univariate zero finder (both of which can be
found or described in numerous places) to minimize the difference between the observed price and the price given an implied volatility. With about
one hundred lines in C++, I had something which felt fast enough. So today I hooked this into R via a two-line wrapper in a quickly-created package using
Rcpp.

I had one more advantage here. For our time series problem, the majority of the parameters (strike, time to maturity, rate, …) are fixed, so we can
structure the problem to be vectorised right from the start. I cannot share the code or more details of my new implementation. However, both
GBSVolatility and EuropeanOptionImpliedVolatility are on CRAN (and as I happen to maintain these for
Debian, also just one sudo apt-get install r-cran-foptions r-cran-rquantlib away if you’re
on Debian or Ubuntu). And writing the other solver is really not that
involved.

Anyway, here is the result, courtesy of a quick run via the rbenchmark package. We create a vector of length
500; the implied volatility computation will be performed at each point (and yes, our time series are much longer indeed). This is replicated 100
times (as is the default for rbenchmark) for each of the three approaches:

$ r xxxxR/packages/xxxxOptions/demo/timing.R
     test replications elapsed  relative user.self sys.self user.child sys.child
3 zzz(X)           100   0.038     1.000     0.040    0.000          0         0
2 RQL(X)           100   3.657    96.237     3.596    0.060          0         0
1 fOp(X)           100 448.060 11791.053   446.644    1.436          0         0
$

The new local solution is denoted by zzz(X). It is already orders of magnitude faster than the RQL(X) function using
RQuantLib (which is, I presume, due to my custom solution internalising the loop). And
the new approach is a laughable amount faster than the basic approach (shown as fOp) via
fOptions. For one hundred replications of solving implied volatilities for
all elements of a vector of size 500, the slow solution takes about 7.5 minutes — while the fast solution takes 38 milliseconds. Which comes to a
relative gain of over 11,000.

So sitting down with your C++ compiler to craft a quick one-hundred lines, combining two well-known and tested methods, can reap sizeable benefits. And Rcpp makes it trivial to call this from R.
