ggdist: Make a Raincloud Plot to Visualize Distribution in ggplot2
The ggdist package is a ggplot2 extension made for visualizing distributions and uncertainty. We’ll see how ggdist can be used to make a raincloud plot.
What is a Raincloud Plot?
A raincloud plot is a visualization that adds a half-density to a distribution plot. It gets its name because the half-density looks like a “raincloud”. The raincloud (half-density) plot enhances the traditional boxplot by highlighting multiple modalities (an indicator that distinct groups may exist). The boxplot does not show where the data are clustered, but the raincloud plot does!
Raincloud Plot (the one we’ll make in this tutorial)
We’ll go through a short tutorial to get you up and running with ggdist to make a raincloud plot.
Raincloud Plots with ggdist [Tutorial]
This tutorial showcases the awesome power of ggdist for visualizing distributions.
Tutorial Credits
This tutorial wouldn’t be possible without another tutorial, Visualizing Distributions with Raincloud Plots by Cédric Scherer. Cédric is truly a ggplot2 master. Follow Cédric Scherer on Twitter to learn more about his excellent visualization work.
Onto the tutorial.
Load the Libraries and Data
First, run this code to:
 Load Libraries: Load ggdist, tidyquant, and tidyverse.
 Import Data: We’re using the mpg dataset that comes with ggplot2.
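A minimal sketch of this setup step, assuming the three packages are already installed. Nothing needs to be downloaded for the data, because mpg is attached together with ggplot2 when tidyverse loads.

```r
# Load the packages named above (assumes they are already installed)
library(ggdist)     # stat_halfeye(), stat_dots()
library(tidyquant)  # theme_tq(), scale_fill_tq()
library(tidyverse)  # ggplot2, dplyr and the pipe

# mpg ships with ggplot2; take a quick look at the columns we will use
mpg %>%
  select(manufacturer, model, cyl, hwy) %>%
  head()
```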
Raincloud Plot: Using ggplot
Next, we’ll make a raincloud plot that highlights the distribution of Vehicle Fuel Economy (MPG) by Engine Size (Number of Cylinders). It helps if you have ggplot2 visualization experience. If you are interested in learning ggplot2 in depth, check out our R for Business Analysis Course (DS4B 101-R), which contains over 30 hours of video lessons on learning R for data analysis.
Make the ggplot2 canvas
The first step is to make the ggplot2 canvas. We:
 Prep the Data: Using filter() to isolate the most common (frequent) vehicle engine sizes.
 Map the columns: Using ggplot(), we map the cyl and hwy columns. We also convert the numeric cyl column to a discrete one with factor().
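The two steps above can be sketched as follows. Keeping the 4-, 6-, and 8-cylinder engines is an assumption about which sizes count as "most common" (5-cylinder cars are rare in mpg), and the fill mapping is also an assumption, added so each engine size gets its own color in later layers.

```r
library(dplyr)
library(ggplot2)

# Prep the data: keep the most common engine sizes
# (assumption: 4, 6 and 8 cylinders)
mpg_common <- mpg %>%
  filter(cyl %in% c(4, 6, 8))

# Map the columns: factor() turns numeric cyl into a discrete variable.
# The fill mapping is an assumption used to color each group.
p_canvas <- ggplot(
  mpg_common,
  aes(x = factor(cyl), y = hwy, fill = factor(cyl))
)

p_canvas  # no layers yet, so only the axes are drawn
```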
This produces a blank plot, which is the first layer. You can see that the x-axis is labeled “factor(cyl)” and the y-axis is labeled “hwy”, indicating that the data has been mapped to the visualization.
Add the Rainclouds with stat_halfeye()
Next, we add our first geometry layer using ggdist::stat_halfeye(). This produces a Half Eye visualization, which contains a half-density (the slab) and a point interval. We remove the point interval by setting .width = 0 and point_colour = NA. The half-density remains.
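A sketch of this layer, rebuilt from the canvas so it runs on its own. The .width and point_colour settings come from the text; the adjust and justification values are illustrative assumptions, as are the choice of cylinders and the fill mapping.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

p_cloud <- mpg %>%
  filter(cyl %in% c(4, 6, 8)) %>%               # assumed engine sizes
  ggplot(aes(x = factor(cyl), y = hwy, fill = factor(cyl))) +
  ggdist::stat_halfeye(
    adjust = 0.5,          # assumed: halve the smoothing bandwidth
    justification = -0.2,  # assumed: nudge the slab off the category position
    .width = 0,            # zero-width interval, so no interval line is drawn
    point_colour = NA      # hide the point summary
  )

p_cloud
```

The small negative justification leaves room beside each half-density for the boxplot that is added next.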
And here’s the output. We can see the half-density distributions for fuel economy (hwy) by engine size (cyl).
Add the Boxplot with geom_boxplot()
Next, add the second geometry layer using ggplot2::geom_boxplot(). This produces a narrow boxplot. We reduce the width and adjust the opacity.
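A sketch with the boxplot layered on the half-density. The specific width and alpha values are assumptions, and hiding the outlier points is an extra assumption made because the raw observations are drawn as dots in a later layer.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

p_box <- mpg %>%
  filter(cyl %in% c(4, 6, 8)) %>%               # assumed engine sizes
  ggplot(aes(x = factor(cyl), y = hwy, fill = factor(cyl))) +
  ggdist::stat_halfeye(
    adjust = 0.5,
    justification = -0.2,
    .width = 0,
    point_colour = NA
  ) +
  geom_boxplot(
    width = 0.12,        # assumed: narrow box so it sits beside the slab
    outlier.color = NA,  # assumed: hide outlier points
    alpha = 0.5          # assumed: semi-transparent fill
  )

p_box
```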
And here’s the output. We now have a boxplot and half-density. We can see how the distributions vary compared to the median and interquartile range.
Add the Dot Plots with stat_dots()
Next, add the third geometry layer using ggdist::stat_dots(). This produces a half-dotplot, which is similar to a histogram in that it indicates the number of samples (number of dots) in each bin. We select side = "left" to indicate we want it on the left-hand side.
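A sketch with all three geometries in place. Only side = "left" is stated in the text; the justification and binwidth values are assumptions chosen so the dots sit clear of the boxplot.

```r
library(dplyr)
library(ggplot2)
library(ggdist)

p_dots <- mpg %>%
  filter(cyl %in% c(4, 6, 8)) %>%               # assumed engine sizes
  ggplot(aes(x = factor(cyl), y = hwy, fill = factor(cyl))) +
  ggdist::stat_halfeye(
    adjust = 0.5,
    justification = -0.2,
    .width = 0,
    point_colour = NA
  ) +
  geom_boxplot(width = 0.12, outlier.color = NA, alpha = 0.5) +
  ggdist::stat_dots(
    side = "left",        # draw the dots on the left-hand side
    justification = 1.1,  # assumed: push the dots away from the boxplot
    binwidth = 0.25       # assumed: bin width in hwy units
  )

p_dots
```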
And here’s the output. We now have the three main geometries completed.
Making the plot look professional
We can clean up our plot with a professional-looking theme using tidyquant::theme_tq(). We’ll also rotate it with coord_flip() to give it the raincloud appearance.
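A sketch of the finished plot. theme_tq() and coord_flip() come from the text; the scale_fill_tq() palette and every label in labs() are assumptions added for illustration.

```r
library(dplyr)
library(ggplot2)
library(ggdist)
library(tidyquant)

p_final <- mpg %>%
  filter(cyl %in% c(4, 6, 8)) %>%               # assumed engine sizes
  ggplot(aes(x = factor(cyl), y = hwy, fill = factor(cyl))) +
  ggdist::stat_halfeye(
    adjust = 0.5,
    justification = -0.2,
    .width = 0,
    point_colour = NA
  ) +
  geom_boxplot(width = 0.12, outlier.color = NA, alpha = 0.5) +
  ggdist::stat_dots(side = "left", justification = 1.1, binwidth = 0.25) +
  tidyquant::scale_fill_tq() +   # assumed: tidyquant fill palette
  tidyquant::theme_tq() +
  labs(                          # assumed labels
    title = "Raincloud Plot",
    x = "Engine Size (No. of Cylinders)",
    y = "Highway Fuel Economy (MPG)",
    fill = "Cylinders"
  ) +
  coord_flip()                   # rotate so the dots fall below the clouds

p_final
```

Because coord_flip() swaps the axes at draw time, the labels stay attached to the x and y aesthetics: the engine size label ends up on the vertical axis.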
We’ve just finalized our plot. We can clearly see that the distribution for 6-cylinder vehicles is bimodal, something you can’t tell from an ordinary boxplot. We should investigate why there are so many dots for 6-cylinder vehicles with low highway fuel economy. We’ll save that for another R-Tip.
Summary
We learned how to make raincloud plots with ggdist. But there’s a lot more to visualization.
It’s critical to learn how to visualize with ggplot2, which is the premier framework for data visualization in R.
If you’d like to learn ggplot2, data visualization, and data science for business with R, then read on.
