**Jkunst - R category**, and kindly contributed to R-bloggers)

Have you read Visualizing Algorithms by Mike Bostock? It's a *pure gold post*.

In that post Mike show a *static* representation of a sort algorith and obvious it will fun to replicate that image

with ggplot. So here we go.

We need some sorts algorihms. In this link you can

see some algorithms.

We start with Insertion Sort:

```
insertion_sort_steps <- function(x = sample(1:15)){
msteps <- matrix(data = x, ncol = length(x))
for (i in 2:length(x)) {
j <- i
while ((j > 1) && (x[j] < x[j - 1])) {
temp <- x[j]
x[j] <- x[j - 1]
x[j - 1] <- temp
j <- j - 1
msteps <- rbind(msteps, as.vector(x))
}
}
msteps
}
```

Now to test it and see what the function do:

```
set.seed(12345)
x <- sample(seq(4))
x
```

```
## [1] 3 4 2 1
```

```
msteps <- insertion_sort_steps(x)
as.data.frame(msteps)
```

V1 | V2 | V3 | V4 |
---|---|---|---|

3 | 4 | 2 | 1 |

3 | 2 | 4 | 1 |

2 | 3 | 4 | 1 |

2 | 3 | 1 | 4 |

2 | 1 | 3 | 4 |

1 | 2 | 3 | 4 |

Every *row* is a step in sort the algorithm (a partial sort). This matrix is a hard to plot so

we need a nicer structure. We can transform the matrix to a *data_frame*

with the information of every *position* of every *element* in each *step*.

```
sort_matix_to_df <- function(msteps){
df <- as.data.frame(msteps, row.names = NULL)
names(df) <- seq(ncol(msteps))
df_steps <- df %>%
tbl_df() %>%
mutate(step = seq(nrow(.))) %>%
gather(position, element, -step) %>%
arrange(step)
df_steps
}
```

And we apply this function to the previous *steps matrix*.

```
df_steps <- sort_matix_to_df(msteps)
head(df_steps, 10)
```

step | position | element |
---|---|---|

1 | 1 | 3 |

1 | 2 | 4 |

1 | 3 | 2 |

1 | 4 | 1 |

2 | 1 | 3 |

2 | 2 | 2 |

2 | 3 | 4 |

2 | 4 | 1 |

3 | 1 | 2 |

3 | 2 | 3 |

The next step will be plot the data frame.

```
plot_sort <- function(df_steps, size = 5, color.low = "#D1F0E1", color.high = "#524BB4"){
ggplot(df_steps,
aes(step, position, group = element, color = element, label = element)) +
geom_path(size = size, alpha = 1, lineend = "round") +
scale_colour_gradient(low = color.low, high = color.high) +
coord_flip() +
scale_x_reverse() +
theme(legend.position = "none")
}
```

Now compare this:

```
as.data.frame(msteps)
```

V1 | V2 | V3 | V4 |
---|---|---|---|

3 | 4 | 2 | 1 |

3 | 2 | 4 | 1 |

2 | 3 | 4 | 1 |

2 | 3 | 1 | 4 |

2 | 1 | 3 | 4 |

1 | 2 | 3 | 4 |

With:

```
plot_sort(df_steps, size = 6) + geom_text(color = "white", size = 4)
```

It works, so we can now scroll!

```
sample(seq(70)) %>%
insertion_sort_steps() %>%
sort_matix_to_df() %>%
plot_sort(size = 2.2)
```

Now try with other sort algorithms:

Bubble sort:

```
bubble_sort_steps <- function(x = sample(1:15)){
msteps <- matrix(data = x, ncol = length(x))
for (i in 1:(length(x) - 1)) {
for (j in 1:(length(x) - 1)) {
if (x[j] > x[j + 1]) {
temp <- x[j]
x[j] <- x[j + 1]
x[j + 1] <- temp
}
msteps <- rbind(msteps, as.vector(x))
}
}
msteps
}
```

Selection sort:

```
selection_sort_steps <- function(x = sample(1:15)){
msteps <- matrix(data = x, ncol = length(x))
for (i in 1:(length(x) - 1)) {
smallsub <- i
for (j in (i + 1):(length(x) - 0)) { # Is not '- 1' like website
if (x[j] < x[smallsub]) {
smallsub <- j
}
}
temp <- x[i]
x[i] <- x[smallsub]
x[smallsub] <- temp
msteps <- rbind(msteps, as.vector(x))
}
msteps
}
```

And test with a longer vector:

```
n <- 50
x <- sample(seq(n))
big_df <- rbind(
x %>% selection_sort_steps() %>% sort_matix_to_df() %>% mutate(sort = "Selection Sort"),
x %>% insertion_sort_steps() %>% sort_matix_to_df() %>% mutate(sort = "Insertion Sort"),
x %>% bubble_sort_steps() %>% sort_matix_to_df() %>% mutate(sort = "Bubble Sort")
)
head(big_df)
```

step | position | element | sort |
---|---|---|---|

1 | 1 | 3 | Selection Sort |

1 | 2 | 31 | Selection Sort |

1 | 3 | 47 | Selection Sort |

1 | 4 | 49 | Selection Sort |

1 | 5 | 24 | Selection Sort |

1 | 6 | 7 | Selection Sort |

```
big_df %>%
group_by(sort) %>%
summarise(steps = n())
```

sort | steps |
---|---|

Bubble Sort | 120100 |

Insertion Sort | 30700 |

Selection Sort | 2500 |

```
ggplot(big_df,
aes(step, position, group = element, color = element, label = element)) +
geom_path(size = 0.8, alpha = 1, lineend = "round") +
scale_colour_gradient(low = "#c21500", high = "#ffc500") + # http://uigradients.com/#Kyoto
facet_wrap(~sort, scales = "free_x", ncol = 1) +
theme(legend.position = "none",
strip.background = element_rect(fill = "transparent", linetype = 0),
strip.text = element_text(size = 8))
```

Or we can plot vertical using the viridis package:

```
ggplot(big_df,
aes(position, step, group = element, color = element, label = element)) +
geom_path(size = 1, alpha = 1, lineend = "round") +
scale_colour_gradientn(colours = viridis_pal()(n)) +
facet_wrap(~sort, scales = "free_y", nrow = 1) +
scale_y_reverse() +
theme(legend.position = "none",
strip.background = element_rect(fill = "transparent", linetype = 0),
strip.text = element_text(size = 8))
```

And that's it. If you write/implement another sort algorithm in this way let me know to view it ;).

Some bonus content :D.

References:

- http://bost.ocks.org/mike/algorithms/
- http://faculty.cs.niu.edu/~hutchins/csci230/sorting.htm
- http://corte.si/posts/code/visualisingsorting/
- http://uigradients.com/#Kyoto
- http://algs4.cs.princeton.edu/21elementary/

**leave a comment**for the author, please follow the link and comment on their blog:

**Jkunst - R category**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...