Vectorized R vs Rcpp
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In my previous post, I tried to show, that Rcpp is 1000 faster than pure R and that generated the fuss in the comments. Being lazy, I didn’t vectorize R code and at the end I was comparing apples vs oranges.
To fix that problem, I built a new script, where I’m trying to compare apples against apples. First piece of code named “ifelse R” uses R “ifelse” function to vectorize code. Second piece of code is fully vectorized code written in R, third – pure C++ code and the last one is C++, where Rcpp ”ifelse” function is used.
| name | seconds |
|---|---|
| ifelse R | 27.50 |
| vectorized R | 10.40 |
| pure C++ | 0.44 |
| vectorized C++ | 2.24 |
Here we go – vectorization truly helps, but pure C++ code still 23 times faster. Of course you pay the price when writing it in C++.
I found a bit strange, that vectorized C++ code doesn’t perform that well…
You can get the code from github or review it below:
#Author Dzidorius Martinaitis
#Date 2012-02-01
#Description http://www.investuotojas.eu/2012/02/01/vectorized-r-vs-rcpp
bid = runif(50000000,5,9)
ask = runif(50000000,5,9)
close = runif(50000000,5,9)
x=data.frame(bid=bid,ask=ask,last_price=close)
rez=0
########### ifelse R #################
answ=as.vector(system.time(
{
rez = ifelse(x$last_price>0,ifelse(x[, "bid"] > x[, "last_price"], x[, "bid"], ifelse((x[, "ask"] > 0) & (x[, "ask"] < x[, "last_price"]), x[, "ask"], x[, "last_price"])), 0.5*(x[, "ask"] + x[,"bid"]))
})[1])
########### end ifelse R #################
########### vectorized R #################
answ=append(answ,system.time(
{
lgt0 = x$last_price > 0
bgtl = x$bid > x$last_price
agt0 = x$ask > 0
altl = x$ask > x$last_price
rez = x$last_price
rez[lgt0 & agt0 & altl] = x$ask[lgt0 & agt0 & altl]
rez[lgt0 & bgtl] = x$bid[lgt0 & bgtl]
rez[!lgt0] = (x$ask[!lgt0]+x$bid[!lgt0])/2
}
)[1])
########### end vectorized R #################
#C++ code starts here
library(inline)
library(Rcpp)
########### pure C++ #################
code='
NumericVector bid(bid_);NumericVector ask(ask_);NumericVector close(close_);NumericVector ret(ask_);
int bid_size = bid.size();
for(int i =0;i<bid_size;i++)
{
if(close[i]>0)
{
if(bid[i]>close[i])
{
ret[i] = bid[i];
}
else if(ask[i]>0 && ask[i]<close[i])
{
ret[i] = ask[i];//
}
else
{
ret[i] = close[i];//
}
}
else
{
ret[i]=(bid[i]+ask[i])/2;
}
}
return ret;
'
getLastPrice <- cxxfunction(signature( bid_ = "numeric",ask_ = "numeric",close_="numeric"),body=code,plugin="Rcpp")
rez=0
answ=append(answ,system.time(
{
rez=getLastPrice(as.numeric(x$bid),as.numeric(x$ask),as.numeric(x$last_price))
})[1])
########### end pure C++ #################
#summary(rez)
########### vectorized C++ #################
code='
NumericVector bid(bid_);NumericVector ask(ask_);NumericVector close(close_);NumericVector ret(ask_);
int bid_size = bid.size();
ret=ifelse(close>0,ifelse(bid >close, bid, ifelse(ask > 0,ifelse(ask < close,ask, close),close)), 0.5*(ask + bid));
return ret;
'
getLastPrice <- cxxfunction(signature( bid_ = "numeric",ask_ = "numeric",close_="numeric"),body=code,plugin="Rcpp")
rez=0
answ=append(answ,system.time(
{
rez=getLastPrice(as.numeric(x$bid),as.numeric(x$ask),as.numeric(x$last_price))
}
)[1])
########### end vectorized C++ #################
#summary(rez)
names(answ)=c('ifelse R','vectorized R','pure C++','vectorized C++')
library(ggplot2)
a=data.frame(ind=1:4,val=answ)
ggplot(a,aes(ind,val))+geom_point(legend=F)+geom_text(aes(label=names(answ),hjust=c(-0.2,-0.2,-0.2,0.8),vjust=c(0,0,0,-1)),size=4) |
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
