Faceted Graphs with cdata and ggplot2

October 21, 2018

(This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers)

In between client work, John and I have been busy working on our book, Practical Data Science with R, 2nd Edition. To demonstrate a toy example for the section I’m working on, I needed scatter plots of the petal and sepal dimensions of the iris data, like so:

[Figure: scatter plots of the iris petal and sepal dimensions]

I wanted a plot for petal dimensions and sepal dimensions, but I also felt that two plots took up too much space. So, I thought, why not make a faceted graph that shows both:

[Figure: a single faceted graph with one panel for petal dimensions and one for sepal dimensions]

Except — which columns do I plot and what do I facet on?

head(iris)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1          5.1         3.5          1.4         0.2  setosa
## 2          4.9         3.0          1.4         0.2  setosa
## 3          4.7         3.2          1.3         0.2  setosa
## 4          4.6         3.1          1.5         0.2  setosa
## 5          5.0         3.6          1.4         0.2  setosa
## 6          5.4         3.9          1.7         0.4  setosa

Here’s one way to create the plot I want, using the cdata package along with ggplot2.

First, load the packages and data:

library("ggplot2")
library("cdata")

iris <- data.frame(iris)

Now define the data-shaping transform, or control table. The control table is basically a picture that sketches out the final data shape that I want. I want to specify the x and y columns of the plot (call these the value columns of the data frame) and the column that I am faceting by (call this the key column of the data frame). And I also need to specify how the key and value columns relate to the existing columns of the original data frame.

Here’s what the control table looks like:

flower_part | Length       | Width
Petal       | Petal.Length | Petal.Width
Sepal       | Sepal.Length | Sepal.Width

The control table specifies that the new data frame will have the columns flower_part, Length and Width. Every row of iris will produce two rows in the new data frame: one with a flower_part value of Petal, and another with a flower_part value of Sepal. The Petal row will take the Petal.Length and Petal.Width values in the Length and Width columns respectively. Similarly for the Sepal row.
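To make that mapping concrete, here is a minimal base-R sketch of the same reshaping done by hand. It is only an illustration of what the control table describes, not how cdata works internally, and the names petal, sepal, and iris_long are hypothetical ones introduced for this example.

```r
# Base-R sketch of the reshaping described by the control table.
# `petal`, `sepal`, and `iris_long` are illustrative names, not from the article.
n <- nrow(iris)

# One block of rows per flower part, with the measurements renamed
# to the shared Length and Width columns.
petal <- data.frame(
  Species     = iris$Species,
  flower_part = "Petal",
  Length      = iris$Petal.Length,
  Width       = iris$Petal.Width)

sepal <- data.frame(
  Species     = iris$Species,
  flower_part = "Sepal",
  Length      = iris$Sepal.Length,
  Width       = iris$Sepal.Width)

# Stack the two blocks, then interleave them so that the Petal and Sepal
# rows that came from the same original iris row sit next to each other.
iris_long <- rbind(petal, sepal)
iris_long <- iris_long[order(rep(seq_len(n), times = 2)), ]
rownames(iris_long) <- NULL

head(iris_long)
```

Writing this out by hand is manageable for two flower parts, but it repeats the same pattern for every block; the control table states the same mapping once, as data.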

Here I create the control table in R, using the convenience function wrapr::build_frame() to create the controlTable data frame in a legible way.

(controlTable <- wrapr::build_frame(
   "flower_part", "Length"      , "Width"       |
   "Petal"      , "Petal.Length", "Petal.Width" |
   "Sepal"      , "Sepal.Length", "Sepal.Width" ))
##   flower_part       Length       Width
## 1       Petal Petal.Length Petal.Width
## 2       Sepal Sepal.Length Sepal.Width

Now I apply the transform to iris using the function rowrecs_to_blocks(). I also want to carry along the Species column so I can color the scatterplot points by species.

iris_aug <- rowrecs_to_blocks(
  iris,
  controlTable,
  columnsToCopy = c("Species"))

head(iris_aug)
##   Species flower_part Length Width
## 1  setosa       Petal    1.4   0.2
## 2  setosa       Sepal    5.1   3.5
## 3  setosa       Petal    1.4   0.2
## 4  setosa       Sepal    4.9   3.0
## 5  setosa       Petal    1.3   0.2
## 6  setosa       Sepal    4.7   3.2
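As a quick sanity check on the transform, the sketch below rebuilds the reshaped data and confirms that every row of iris produced exactly one Petal row and one Sepal row. It assumes the cdata and wrapr packages are installed; the checks themselves are my addition and are not part of the original workflow.

```r
library("cdata")

# Same control table as in the article.
controlTable <- wrapr::build_frame(
   "flower_part", "Length"      , "Width"       |
   "Petal"      , "Petal.Length", "Petal.Width" |
   "Sepal"      , "Sepal.Length", "Sepal.Width" )

iris_aug <- rowrecs_to_blocks(
  data.frame(iris),
  controlTable,
  columnsToCopy = c("Species"))

# Each original row should contribute two rows to the result.
stopifnot(nrow(iris_aug) == 2 * nrow(iris))

# The two flower parts should appear equally often.
part_counts <- table(iris_aug$flower_part)
stopifnot(all(part_counts == nrow(iris)))
part_counts
```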

And now I can create the plot!

ggplot(iris_aug, aes(x=Length, y=Width)) +
  geom_point(aes(color=Species, shape=Species)) +
  facet_wrap(~flower_part, labeller = label_both, scales = "free") +
  ggtitle("Iris dimensions") +
  scale_color_brewer(palette = "Dark2")

[Figure: "Iris dimensions", Width against Length, faceted by flower_part, with points colored and shaped by Species]

In the next post, I will show how to use cdata and ggplot2 to create a scatterplot matrix.
