# Much more efficient bubble sort in R using the Rcpp and inline packages

May 14, 2013
By

[This article was first published on NumberTheory » R stuff, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Recently I wrote a blogpost showing the implementation of a simple bubble sort algorithm in pure R code. The downside of that implementation was that is was awfully slow. And by slow, I mean really slow, as in “a 100 element vector takes 7 seconds to sort”-slow. One of the major opportunities for a speed is to start using a compiled language. I chose to use C++ as this is really easy to integrate into R using the `Rcpp` package. In addition to the `Rcpp` package I use the `inline` package which allows one to use C++ code and R code in a seamless fashion. The following code creates an R function `bubble_sort_cpp`:

```require(inline)  ## for cxxfunction()

src = 'Rcpp::NumericVector vec = Rcpp::NumericVector(vec_in);
double tmp = 0;
int no_swaps;
while(true) {
no_swaps = 0;
for (int i = 0; i < vec.size()-1; ++i) {
if(vec[i] > vec[i+1]) {
no_swaps++;
tmp = vec[i];
vec[i] = vec[i+1];
vec[i+1] = tmp;
};
};
if(no_swaps == 0) break;
};
return(vec);'
bubble_sort_cpp = cxxfunction(signature(vec_in = "numeric"), body=src, plugin="Rcpp")```

Quite amazing how easy it is to integrate R code and C++ code. `inline` compiles and links the C++ code on-the-fly, creating an R function that delivers the functionality. Of course the most important question is now how fast this is. I use the `microbenchmark` package to run the bubble sort I implemented in pure R (here), the bubble sort implemented in C++ (see above), and the standard R sorting algorithm:

```library(microbenchmark)
vector_size = 100
print(microbenchmark(bubble_sort(sample(1:vector_size)),
bubble_sort_cpp(sample(1:vector_size)),
sort(sample(1:vector_size))))
expr       min         lq     median
bubble_sort(sample(1:vector_size)) 67397.546 74358.9495 78143.0710
bubble_sort_cpp(sample(1:vector_size))    44.895    55.9340    60.4930
sort(sample(1:vector_size))    44.173    48.1315    62.3785
uq        max neval
81285.0215 105626.483   100
63.7715     74.643   100
67.2375    138.069   100```

These results speak for itself, the C++ version is more than 1300 times faster when looking at the median speed, even faster than the built-in `sort` function. These differences will only get more pronounced when the size of the vector grows.

To leave a comment for the author, please follow the link and comment on their blog: NumberTheory » R stuff.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...