**NumberTheory » R stuff**, and kindly contributed to R-bloggers)

Please note that this is programming I purely did for the learning experience. The pure R bubble sort implemented in this post is veeeeery slow for two reasons:

- Interpreted code with lots of iteration is very slow.
- Bubble sort is one of the slowest sorting algorithms (
`O(N^2)`

)

The bubble sort sorting algorithm works by iterating over the unsorted vector and comparing pairs of numbers. Let’s say the first point pair is `c(61, 3)`

, here the numbers need to be swapped as the 3 should be earlier in the sorted vector. The following function returns `TRUE`

if the numbers should be swapped, and returns `FALSE`

otherwise:

larger = function(pair) { if(pair[1] > pair[2]) return(TRUE) else return(FALSE) }

This function is used by the following function:

swap_if_larger = function(pair) { if(larger(pair)) { return(rev(pair)) } else { return(pair) } }

which returns the swapped version of the pair if appropriate, or the original pair if the order is ok. For each point pair (element1-element2, element2-element3, etc) `swap_if_larger`

is called:

swap_pass = function(vec) { for(i in seq(1, length(vec)-1)) { vec[i:(i+1)] = swap_if_larger(vec[i:(i+1)]) } return(vec) }

One pass of this function performs a comparison on all pairs, swapping if necessary. To fully sort the vector, we need to perform multiple passes until no swaps are needed anymore. I chose to implement this using recursion:

bubble_sort = function(vec) { new_vec = swap_pass(vec) if(isTRUE(all.equal(vec, new_vec))) { return(new_vec) } else { return(bubble_sort(new_vec)) } }

The function starts by perform a swapping pass over the vector. If the new vector is equal to the old vector, no swaps where needed, i.e. the vector is already sorted. The function than returns the vector. Alternatively, if the vectors are different, the vector is not yet fully sorted, and we need to perform more passes. This is accomplished by recursively calling `bubble_sort`

again on the vector. An example of the function in action:

> test_vec = round(runif(100, 0, 100)) > bubble_sort(test_vec) [1] 1 1 6 6 9 10 10 10 13 14 14 15 19 19 20 21 23 24 24 24 26 26 26 26 27 [26] 28 28 30 31 32 34 35 35 36 36 37 39 39 40 40 40 41 41 41 41 43 43 43 45 46 [51] 47 51 56 56 57 57 57 58 58 59 61 61 62 63 64 65 68 68 69 70 71 71 72 73 74 [76] 75 75 75 78 79 82 82 84 85 88 88 89 90 91 91 91 92 92 92 92 93 93 96 96 99 >

The full sorting process is nicely illustrated by the following animated gif (linked from wikipedia):

This implementation is horribly slow:

> system.time(bubble_sort(test_vec)) user system elapsed 0.076 0.000 0.077 > system.time(sort(test_vec)) user system elapsed 0.001 0.000 0.001

Probably implementing the relatively slow bubble sort in a compiled language pose a dramatic increase in speed. Maybe a nice first testcase for `Rcpp`

…

**leave a comment**for the author, please follow the link and comment on their blog:

**NumberTheory » R stuff**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...