Pretty histograms with ggplot2

[This article was first published on blogR, and kindly contributed to R-bloggers.]

@drsimonj here to make pretty histograms with ggplot2!

In this post you’ll learn how to create histograms like this:

init-example-1.jpg

 The data

Let’s simulate data for a continuous variable x in a data frame d:

set.seed(070510)
d <- data.frame(x = rnorm(2000))

head(d)
#>            x
#> 1  1.3681661
#> 2 -0.0452337
#> 3  0.0290572
#> 4 -0.8717429
#> 5  0.9565475
#> 6 -0.5521690

 Basic Histogram

Create the basic ggplot2 histogram via:

library(ggplot2)

ggplot(d, aes(x)) +
    geom_histogram()

basic-1.jpg
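
Running this with no arguments makes ggplot2 print a message that it is using the default of 30 bins. As a small sketch (the values 50 and 0.25 below are arbitrary choices for illustration, not settings from this post), you can pick the number of bins or the bin width yourself:

```r
library(ggplot2)

set.seed(070510)
d <- data.frame(x = rnorm(2000))

# Choose the number of bins explicitly (the default is 30)
p_bins <- ggplot(d, aes(x)) +
  geom_histogram(bins = 50)

# Or fix the width of each bin, in units of x
p_width <- ggplot(d, aes(x)) +
  geom_histogram(binwidth = 0.25)

print(p_bins)
```

The rest of this post sticks with the default bins, so the choice here is purely a matter of taste.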

 Adding Colour

Time to jazz it up with colour! The method I’ll present was motivated by my answer to this StackOverflow question.

We can add colour by exploiting the way that ggplot2 stacks colour for different groups. Specifically, we fill the bars with the same variable (x) but cut into multiple categories:

ggplot(d, aes(x, fill = cut(x, 100))) +
    geom_histogram()

color1-1.jpg

What the…

Oh, ggplot2 has added a legend entry for each of the 100 groups created by cut! Get rid of the legend with show.legend = FALSE:

ggplot(d, aes(x, fill = cut(x, 100))) +
    geom_histogram(show.legend = FALSE)

color2-1.jpg
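
If you are curious what cut(x, 100) is handing to the fill aesthetic, here is a quick sketch that inspects it outside of ggplot2. The object name groups is just an illustrative choice:

```r
set.seed(070510)
d <- data.frame(x = rnorm(2000))

# cut() divides the range of x into 100 equal-width intervals
# and returns a factor saying which interval each value falls in
groups <- cut(d$x, 100)

nlevels(groups)          # number of groups available to fill
head(levels(groups), 3)  # interval labels, as they appeared in the legend
head(table(groups))      # how many values fall in the first few intervals
```

Each level of this factor gets its own fill colour, which is why the legend had 100 entries.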

Not a bad starting point, but say we want to tweak the colours.

For a continuous colour gradient, a simple solution is to add scale_fill_discrete (which uses ggplot2's hue scale by default) and play with the range of hues via the h argument. To get your colours right, get familiar with the hue scale.

For example, here we’ll tweak the colours to range from blue to red:

ggplot(d, aes(x, fill = cut(x, 100))) +
  geom_histogram(show.legend = FALSE) +
  scale_fill_discrete(h = c(240, 10))

color3-1.jpg

Seems a little dark. Raise the chroma (colour intensity) and luminance (lightness) with c and l:

ggplot(d, aes(x, fill = cut(x, 100))) +
  geom_histogram(show.legend = FALSE) +
  scale_fill_discrete(h = c(240, 10), c = 120, l = 70)

color4-1.jpg
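
To preview the colours a given h, c and l setting will produce without drawing a histogram, one option is the hue_pal() function from the scales package, which is installed alongside ggplot2. This is a rough sketch, on the assumption that hue_pal() with the same arguments mirrors what the fill scale generates; the object names are illustrative:

```r
library(scales)

# Build a palette function with the same hue range, chroma and luminance
# as the scale above, then ask it for 100 colours (one per cut group)
pal  <- hue_pal(h = c(240, 10), c = 120, l = 70)
cols <- pal(100)

head(cols)        # hex codes at the blue end
tail(cols)        # hex codes at the red end
show_col(cols, labels = FALSE)  # draw the colours as a grid of swatches
```

This makes it quicker to try different hue ranges before committing to a plot.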

 Final touches

The final touches are to set the theme and add axis labels and a title:

ggplot(d, aes(x, fill = cut(x, 100))) +
  geom_histogram(show.legend = FALSE) +
  scale_fill_discrete(h = c(240, 10), c = 120, l = 70) +
  theme_minimal() +
  labs(x = "Variable X", y = "n") +
  ggtitle("Histogram of X")

touches-1.jpg

Now have fun tweaking the colours!

p <- ggplot(d, aes(x, fill = cut(x, 100))) +
  geom_histogram(show.legend = FALSE) +
  theme_minimal() +
  labs(x = "Variable X", y = "n") +
  ggtitle("Histogram of X")

p + scale_fill_discrete(h = c(180, 360), c = 150, l = 80)

tweak-1.jpg

p + scale_fill_discrete(h = c(90, 210), c = 30, l = 50)

tweak-2.jpg
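
If you find yourself trying many combinations, the steps above can be wrapped in a small helper. This is only a sketch: pretty_hist() and its argument names are hypothetical, it assumes the data frame has a column called x, and it calls scale_fill_hue() directly and sets bins = 30 (the default) to silence the bins message:

```r
library(ggplot2)

set.seed(070510)
d <- data.frame(x = rnorm(2000))

# Hypothetical helper: coloured histogram of column x in `data`
pretty_hist <- function(data, hue = c(240, 10), chroma = 120,
                        luminance = 70, n_colours = 100) {
  ggplot(data, aes(x, fill = cut(x, n_colours))) +
    geom_histogram(bins = 30, show.legend = FALSE) +
    scale_fill_hue(h = hue, c = chroma, l = luminance) +
    theme_minimal() +
    labs(x = "Variable X", y = "n") +
    ggtitle("Histogram of X")
}

# Same settings as the two tweaks above
p1 <- pretty_hist(d, hue = c(180, 360), chroma = 150, luminance = 80)
p2 <- pretty_hist(d, hue = c(90, 210), chroma = 30, luminance = 50)

print(p1)
```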

 Sign off

Thanks for reading and I hope this was useful for you.

For updates of recent blog posts, follow @drsimonj on Twitter, or email me at [email protected] to get in touch.

If you’d like the code that produced this blog post, check out the blogR GitHub repository.
