Taking the advice of David Robinson, I’ve decided to start a blog and write about data science, not only to create a portfolio of my work but also as a repository I can check back on when I scratch my head and think “now how did I do that?”
A nice post I saw on twitter about how to reverse the order of a ggplot2 legend got me thinking about a graph I’d been making at work recently.
The default way ggplot2 orders the axis of a scatter plot for a character variable is not very intuitive.
# Load the packages used in this post
library(dplyr)
library(ggplot2)
# Load the data
starwars <- dplyr::starwars
# Subset data by species
starwars <- starwars %>%
  filter(species == "Human")
# Count the number of films each character appears in
number_films <- sapply(starwars$films, length)
# Add the number of films to the dataset as a new column
starwars$number_films <- number_films
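# Scatter plot with the names on the y axis (x = height and colour = number_films are assumed here)
ggplot(starwars, aes(x = height, y = name, colour = number_films, size = number_films)) +
  geom_point() +
  scale_colour_gradient(low = "red", high = "blue")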
By default, ggplot2 plots the names Z to A from top to bottom.
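The reason is easier to see outside of a plot. Here is a minimal sketch in base R, using a made-up vector of three names (`example_names` is just for illustration, not the full dataset). As far as I understand it, a character variable on a discrete axis is ordered the same way factor levels are, alphabetically, and the first value is drawn nearest the origin, which on a y axis means the bottom.

```r
# Hypothetical example names, not the full starwars data
example_names <- c("Luke Skywalker", "Anakin Skywalker", "Leia Organa")

# factor() sorts the unique values alphabetically to build its levels
default_levels <- levels(factor(example_names))
print(default_levels)

# The first level sits at the bottom of a y axis, so reading the axis
# from top to bottom gives the levels in reverse order
top_to_bottom <- rev(default_levels)
print(top_to_bottom)
```

So the alphabetical order is there, it just starts from the bottom of the axis rather than the top.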
Good data visualization should take into account how people process information. Gabriela Plucinska’s website (data in the spotlight), which is dedicated to improving data presentation, says that “we are automatically trying to read information from top to bottom”, so it is strange that ggplot2 plots this way by default.
To plot A to Z from top to bottom, one can convert the names to a factor and set its levels to the reverse of the default levels.
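# x = height and colour = number_films are assumed here
ggplot(starwars, aes(x = height, y = factor(name,
    levels = rev(levels(factor(name)))), colour = number_films, size = number_films)) +
  geom_point() + scale_colour_gradient(low = "red", high = "blue")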
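To check what the `rev(levels(factor(name)))` part is doing on its own, here is a small base R sketch with a made-up vector of names (`example_names` and `reversed` are illustrative names, not part of the original code). Only the order of the levels changes; the values themselves stay where they are.

```r
# Hypothetical example names, not the full starwars data
example_names <- c("Luke Skywalker", "Anakin Skywalker", "Leia Organa")

# Build the factor with its alphabetical levels reversed
reversed <- factor(example_names, levels = rev(levels(factor(example_names))))

# The levels now run Z to A, so the first level (drawn at the bottom
# of a y axis) is the last name alphabetically
print(levels(reversed))

# The underlying values are unchanged
print(as.character(reversed))
```

Because the last name alphabetically is now the first level, it lands at the bottom of the axis and the names read A to Z from the top.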
Alternatively, if the above code looks a bit messy, you could make it more readable by reversing the levels of the factor outside of ggplot2 first:
starwars$name <- forcats::fct_rev(factor(starwars$name))
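As a quick sanity check, the sketch below compares `forcats::fct_rev()` with the base R approach on a made-up vector of names. It assumes the forcats package is installed, and `example_names`, `with_forcats` and `with_base` are illustrative names only.

```r
library(forcats)

# Hypothetical example names, not the full starwars data
example_names <- c("Luke Skywalker", "Anakin Skywalker", "Leia Organa")

# Reverse the levels with forcats
with_forcats <- fct_rev(factor(example_names))

# Reverse the levels with base R
with_base <- factor(example_names, levels = rev(levels(factor(example_names))))

# Both approaches should give the same level order
print(identical(levels(with_forcats), levels(with_base)))
```

Either version can then be passed straight to the y aesthetic, which keeps the ggplot2 call itself short.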
Thanks to Mara Averick for tweeting the blog from which I got the inspiration to write my first blog post (and who’s always posting great content) and to #R4ds community members for helping me figure this out.