Bubble sort implemented in pure R

May 10, 2013
By

(This article was first published on NumberTheory » R stuff, and kindly contributed to R-bloggers)

Please note that this is programming I purely did for the learning experience. The pure R bubble sort implemented in this post is veeeeery slow for two reasons:

  1. Interpreted code with lots of iteration is very slow.
  2. Bubble sort is one of the slowest sorting algorithms (O(N^2))

The bubble sort sorting algorithm works by iterating over the unsorted vector and comparing pairs of numbers. Let’s say the first point pair is c(61, 3), here the numbers need to be swapped as the 3 should be earlier in the sorted vector. The following function returns TRUE if the numbers should be swapped, and returns FALSE otherwise:

larger = function(pair) {
   if(pair[1] > pair[2]) return(TRUE) else return(FALSE)
}

This function is used by the following function:

swap_if_larger = function(pair) {
    if(larger(pair)) {
        return(rev(pair)) 
    } else {
        return(pair)
    }
}

which returns the swapped version of the pair if appropriate, or the original pair if the order is ok. For each point pair (element1-element2, element2-element3, etc) swap_if_larger is called:

swap_pass = function(vec) { 
    for(i in seq(1, length(vec)-1)) {
        vec[i:(i+1)] = swap_if_larger(vec[i:(i+1)])
    }
    return(vec)
}

One pass of this function performs a comparison on all pairs, swapping if necessary. To fully sort the vector, we need to perform multiple passes until no swaps are needed anymore. I chose to implement this using recursion:

bubble_sort = function(vec) {
    new_vec = swap_pass(vec)
    if(isTRUE(all.equal(vec, new_vec))) { 
        return(new_vec) 
    } else {
        return(bubble_sort(new_vec))
    }
}

The function starts by perform a swapping pass over the vector. If the new vector is equal to the old vector, no swaps where needed, i.e. the vector is already sorted. The function than returns the vector. Alternatively, if the vectors are different, the vector is not yet fully sorted, and we need to perform more passes. This is accomplished by recursively calling bubble_sort again on the vector. An example of the function in action:

> test_vec = round(runif(100, 0, 100))
> bubble_sort(test_vec)
  [1]  1  1  6  6  9 10 10 10 13 14 14 15 19 19 20 21 23 24 24 24 26 26 26 26 27
 [26] 28 28 30 31 32 34 35 35 36 36 37 39 39 40 40 40 41 41 41 41 43 43 43 45 46
 [51] 47 51 56 56 57 57 57 58 58 59 61 61 62 63 64 65 68 68 69 70 71 71 72 73 74
 [76] 75 75 75 78 79 82 82 84 85 88 88 89 90 91 91 91 92 92 92 92 93 93 96 96 99
>

The full sorting process is nicely illustrated by the following animated gif (linked from wikipedia):

enter image description here

This implementation is horribly slow:

> system.time(bubble_sort(test_vec))
   user  system elapsed
  0.076   0.000   0.077
> system.time(sort(test_vec))
   user  system elapsed
  0.001   0.000   0.001

Probably implementing the relatively slow bubble sort in a compiled language pose a dramatic increase in speed. Maybe a nice first testcase for Rcpp

To leave a comment for the author, please follow the link and comment on his blog: NumberTheory » R stuff.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.