Much more efficient bubble sort in R using the Rcpp and inline packages

[This article was first published on NumberTheory » R stuff, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Recently I wrote a blogpost showing the implementation of a simple bubble sort algorithm in pure R code. The downside of that implementation was that is was awfully slow. And by slow, I mean really slow, as in “a 100 element vector takes 7 seconds to sort”-slow. One of the major opportunities for a speed is to start using a compiled language. I chose to use C++ as this is really easy to integrate into R using the Rcpp package. In addition to the Rcpp package I use the inline package which allows one to use C++ code and R code in a seamless fashion. The following code creates an R function bubble_sort_cpp:

require(inline)  ## for cxxfunction()                                                       
                                                                                            
src = 'Rcpp::NumericVector vec = Rcpp::NumericVector(vec_in);                               
       double tmp = 0;                                                                      
       int no_swaps;                                                                        
       while(true) {                                                                        
           no_swaps = 0;                                                                    
           for (int i = 0; i < vec.size()-1; ++i) {                                         
               if(vec[i] > vec[i+1]) {                                                      
                   no_swaps++;                                                              
                   tmp = vec[i];                                                            
                   vec[i] = vec[i+1];                                                       
                   vec[i+1] = tmp;                                                          
               };                                                                           
           };                                                                               
           if(no_swaps == 0) break;                                                         
       };                                                                                   
       return(vec);'                                                                        
bubble_sort_cpp = cxxfunction(signature(vec_in = "numeric"), body=src, plugin="Rcpp")

Quite amazing how easy it is to integrate R code and C++ code. inline compiles and links the C++ code on-the-fly, creating an R function that delivers the functionality. Of course the most important question is now how fast this is. I use the microbenchmark package to run the bubble sort I implemented in pure R (here), the bubble sort implemented in C++ (see above), and the standard R sorting algorithm:

library(microbenchmark) 
vector_size = 100                                                                           
print(microbenchmark(bubble_sort(sample(1:vector_size)),                                    
               bubble_sort_cpp(sample(1:vector_size)),                                      
               sort(sample(1:vector_size))))  
                                   expr       min         lq     median
     bubble_sort(sample(1:vector_size)) 67397.546 74358.9495 78143.0710
 bubble_sort_cpp(sample(1:vector_size))    44.895    55.9340    60.4930
            sort(sample(1:vector_size))    44.173    48.1315    62.3785
         uq        max neval
 81285.0215 105626.483   100 
    63.7715     74.643   100 
    67.2375    138.069   100

These results speak for itself, the C++ version is more than 1300 times faster when looking at the median speed, even faster than the built-in sort function. These differences will only get more pronounced when the size of the vector grows.

To leave a comment for the author, please follow the link and comment on their blog: NumberTheory » R stuff.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)