# Scatterplot with marginal boxplots

February 3, 2013

(This article was first published on Sandy Muspratt's R Blog, and kindly contributed to R-bloggers)

### Using R and ggplot2 to draw a scatterplot with the two marginal boxplots

Drawing a scatterplot with marginal boxplots (or marginal histograms or marginal density plots) has always been a bit tricky (well, for me anyway). The approach I take here is, first, to draw the three separate plots using ggplot2:

• the scatterplot;
• the horizontal boxplot to appear in the top margin;
• the vertical boxplot to appear in the right margin;

then second, to set widths and heights of the spaces used for axis and tick mark labels, and to combine the three plots using functions from the gtable package. The difficulty has been to ensure that the tick mark labels in the scatterplot panel and in the top marginal boxplot panel take up the same space. Functions from the gtable package make this a reasonably straightforward process.

To draw the following chart, I borrowed and modified code from here and here. The final code and data are available on GitHub.

### Drawing the plot

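This example uses the `mtcars` dataframe, available in base R. For convenience, the file mtcars marginal boxplots.R on GitHub contains all the code. First, load the ggplot2 and gtable packages and the mtcars dataframe. The grid package, which ships with R, is loaded as well because the later steps call its unit and drawing functions directly.

```r
library(ggplot2)
library(gtable)
library(grid)    # unit.pmax(), grid.newpage(), grid.draw(), grid.rect(), gpar()
data(mtcars)
```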

#### Draw the scatterplot

The plot margins are adjusted so that the spaces between the panels are reduced. Also, there is an ever-so-slight mismatch of the gridlines across the panels. The way to fix it is to remove the offset on each axis (`expand=c(0,0)`), then select an offset of your choice (`expand_limits(...)`). There are similar adjustments made to the marginal plots.

```r
p1 <- ggplot(mtcars, aes(mpg, hp)) +
    geom_point() +
    scale_x_continuous(expand = c(0, 0)) +
    scale_y_continuous(expand = c(0, 0)) +
    expand_limits(y = c(min(mtcars$hp) - 0.1 * diff(range(mtcars$hp)),
       max(mtcars$hp) + 0.1 * diff(range(mtcars$hp)))) +
    expand_limits(x = c(min(mtcars$mpg) - 0.1 * diff(range(mtcars$mpg)),
       max(mtcars$mpg) + 0.1 * diff(range(mtcars$mpg)))) +
    theme(plot.margin = unit(c(0.2, 0.2, 0.5, 0.5), "lines"))
```
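The padding expression is written out in full for each axis. As a minimal sketch, assuming a helper called `padded_range()` (a hypothetical name; it is not part of ggplot2 or of the original code), the same limits could be computed once in base R:

```r
# Hypothetical helper (not in the original code): extend the range of x
# by a fraction of its span on each side, as the expand_limits() calls do.
padded_range <- function(x, frac = 0.1) {
  r <- range(x, na.rm = TRUE)
  pad <- frac * diff(r)
  c(r[1] - pad, r[2] + pad)
}

data(mtcars)
mpg_limits <- padded_range(mtcars$mpg)  # limits for the mpg axis
hp_limits  <- padded_range(mtcars$hp)   # limits for the hp axis
```

With these in hand, `expand_limits(x = mpg_limits, y = hp_limits)` should give the same axis limits as the two longer calls, and the same vectors can be reused in the marginal plots.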

#### Draw the marginal boxplots

Note that the margins and axis offsets are adjusted to match those in the scatterplot. Also, the tick mark labels and axis titles for the x-axis and the y-axis are removed.

```r
# Horizontal marginal boxplot - to appear at the top of the chart
p2 <- ggplot(mtcars, aes(x = factor(1), y = mpg)) +
    geom_boxplot(outlier.colour = NA) +
    geom_jitter(position = position_jitter(width = 0.05)) +
    scale_y_continuous(expand = c(0, 0)) +
    expand_limits(y = c(min(mtcars$mpg) - 0.1 * diff(range(mtcars$mpg)),
       max(mtcars$mpg) + 0.1 * diff(range(mtcars$mpg)))) +
    coord_flip() +
    theme(axis.text = element_blank(),
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    plot.margin = unit(c(1, 0.2, -0.5, 0.5), "lines"))

# Vertical marginal boxplot - to appear at the right of the chart
p3 <- ggplot(mtcars, aes(x = factor(1), y = hp)) +
    geom_boxplot(outlier.colour = NA) +
    geom_jitter(position = position_jitter(width = 0.05)) +
    scale_y_continuous(expand = c(0, 0)) +
    expand_limits(y = c(min(mtcars$hp) - 0.1 * diff(range(mtcars$hp)),
       max(mtcars$hp) + 0.1 * diff(range(mtcars$hp)))) +
    theme(axis.text = element_blank(),
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    plot.margin = unit(c(0.2, 1, 0.5, -0.5), "lines"))
```
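To see how the margins line up, it may help to lay the three `plot.margin` vectors side by side. The sketch below uses only base R; the object names `sides` and `margins` are made up for illustration. ggplot2 reads `plot.margin` in the order top, right, bottom, left.

```r
# plot.margin values (in "lines") copied from the three plots above,
# in ggplot2's order: top, right, bottom, left.
sides <- c("top", "right", "bottom", "left")
margins <- rbind(
  p1 = c(0.2, 0.2,  0.5,  0.5),   # scatterplot
  p2 = c(1,   0.2, -0.5,  0.5),   # top marginal boxplot
  p3 = c(0.2, 1,    0.5, -0.5)    # right marginal boxplot
)
colnames(margins) <- sides
print(margins)

# The top boxplot shares its left and right margins with the scatterplot,
# and the right boxplot shares its top and bottom margins with it.
top_matches   <- all(margins["p2", c("left", "right")] == margins["p1", c("left", "right")])
right_matches <- all(margins["p3", c("top", "bottom")] == margins["p1", c("top", "bottom")])
```

The only negative values are on the side of each marginal plot that faces the scatterplot (bottom for `p2`, left for `p3`), which is what pulls the panels closer together.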

#### Get the gtables for the three plots

```r
gt1 <- ggplot_gtable(ggplot_build(p1))
gt2 <- ggplot_gtable(ggplot_build(p2))
gt3 <- ggplot_gtable(ggplot_build(p3))
```

#### Set the maximum widths and heights for x-axis and y-axis titles and text

The gtables store information required to draw the plots, including the widths of the spaces occupied by the y-axis titles and tick mark labels. The code gets the maximum widths of these spaces for the scatterplot and the horizontal marginal boxplot (gt1 and gt2), then sets that maximum as the width in the two gtables. So that there are no problems with the vertical alignment of the scatterplot and the vertical marginal boxplot, the heights are similarly set for gt1 and gt3.

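```r
# Columns 2:3 and rows 4:5 held the axis titles and tick mark labels in the
# ggplot2 layout at the time of writing; the positions may differ in other
# versions, so check the gtable layout before relying on these indices.

# Get maximum widths and heights
maxWidth <- unit.pmax(gt1$widths[2:3], gt2$widths[2:3])
maxHeight <- unit.pmax(gt1$heights[4:5], gt3$heights[4:5])

# Set the maximums in the gtables for gt1, gt2 and gt3
gt1$widths[2:3] <- as.list(maxWidth)
gt2$widths[2:3] <- as.list(maxWidth)

gt1$heights[4:5] <- as.list(maxHeight)
gt3$heights[4:5] <- as.list(maxHeight)
```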
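For readers unfamiliar with `unit.pmax()`, here is a small stand-alone sketch of what it does, using only the grid package. The widths are made-up numbers in centimetres, not values taken from the gtables; the real gtables hold units that are resolved only at drawing time.

```r
library(grid)

# Two made-up sets of column widths (axis title, tick labels), in cm.
widths_a <- unit(c(0.5, 1.2), "cm")
widths_b <- unit(c(0.8, 0.9), "cm")

# Element-wise maximum of the two unit vectors, analogous to base R's pmax().
max_widths <- unit.pmax(widths_a, widths_b)

# Convert to plain numbers for inspection. pdf(NULL) opens a device that
# writes no file, so the conversion has a device to work against.
pdf(NULL)
max_cm <- convertWidth(max_widths, "cm", valueOnly = TRUE)
invisible(dev.off())
```

Assigning the same maximum to both gtables is what makes the panels start at the same horizontal position, whichever plot has the wider labels.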

#### Combine the scatterplot with the two marginal boxplots

The following code creates a new gtable (gt), inserts the modified gt1, gt2 and gt3 into the new gtable, then renders the plot according to the information stored in the new gtable. Finally, a box is drawn around the combined plot.

```r
# Create a new gtable
gt <- gtable(widths = unit(c(7, 1), "null"), heights = unit(c(1, 7), "null"))

# Insert gt1, gt2 and gt3 into the new gtable
gt <- gtable_add_grob(gt, gt1, 2, 1)
gt <- gtable_add_grob(gt, gt2, 1, 1)
gt <- gtable_add_grob(gt, gt3, 2, 2)

# And render the plot
grid.newpage()
grid.draw(gt)

grid.rect(x = 0.5, y = 0.5, height = 0.995, width = 0.995, default.units = "npc",
    gp = gpar(col = "black", fill = NA, lwd = 1))
```
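The `"null"` units passed to `gtable()` are relative: the available space is shared out in proportion to the numbers given. As a rough base R illustration (the variable names are made up), the 7:1 split used above works out as follows:

```r
# Relative column widths and row heights from the gtable() call above.
col_widths  <- c(scatterplot = 7, right_margin = 1)
row_heights <- c(top_margin = 1, scatterplot = 7)

# Share of the drawing area that each column and row receives.
col_share <- col_widths / sum(col_widths)
row_share <- row_heights / sum(row_heights)

print(col_share)
print(row_share)
```

Changing the numbers, for example to `c(7, 2)`, gives the marginal plots a larger share without touching any other part of the code.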

Similar logic applies to the drawing of marginal density plots. The code shown below is also available in the file mtcars marginal density plots.R on GitHub.

```r
# Main scatterplot
p1 <- ggplot(mtcars, aes(mpg, hp)) +
    geom_point() +
    scale_x_continuous(expand = c(0, 0)) +
    scale_y_continuous(expand = c(0, 0)) +
    expand_limits(y = c(min(mtcars$hp) - 0.1 * diff(range(mtcars$hp)),
       max(mtcars$hp) + 0.1 * diff(range(mtcars$hp)))) +
    expand_limits(x = c(min(mtcars$mpg) - 0.1 * diff(range(mtcars$mpg)),
       max(mtcars$mpg) + 0.1 * diff(range(mtcars$mpg)))) +
    theme(plot.margin = unit(c(0.2, 0.2, 0.5, 0.5), "lines"))

# Horizontal marginal density plot - to appear at the top of the chart
p2 <- ggplot(mtcars, aes(x = mpg)) +
    geom_density() +
    scale_x_continuous(expand = c(0, 0)) +
    expand_limits(x = c(min(mtcars$mpg) - 0.1 * diff(range(mtcars$mpg)),
     max(mtcars$mpg) + 0.1 * diff(range(mtcars$mpg)))) +
    theme(axis.text = element_blank(),
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    plot.margin = unit(c(1, 0.2, -0.5, 0.5), "lines"))

# Vertical marginal density plot - to appear at the right of the chart
p3 <- ggplot(mtcars, aes(x = hp)) +
    geom_density() +
    scale_x_continuous(expand = c(0, 0)) +
    expand_limits(x = c(min(mtcars$hp) - 0.1 * diff(range(mtcars$hp)),
     max(mtcars$hp) + 0.1 * diff(range(mtcars$hp)))) +
    coord_flip() +
    theme(axis.text = element_blank(),
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    plot.margin = unit(c(0.2, 1, 0.5, -0.5), "lines"))

# Get the gtables
gt1 <- ggplot_gtable(ggplot_build(p1))
gt2 <- ggplot_gtable(ggplot_build(p2))
gt3 <- ggplot_gtable(ggplot_build(p3))

# Get maximum widths and heights for x-axis and y-axis title and text
maxWidth <- unit.pmax(gt1$widths[2:3], gt2$widths[2:3])
maxHeight <- unit.pmax(gt1$heights[4:5], gt3$heights[4:5])

# Set the maximums in the gtables for gt1, gt2 and gt3
gt1$widths[2:3] <- as.list(maxWidth)
gt2$widths[2:3] <- as.list(maxWidth)

gt1$heights[4:5] <- as.list(maxHeight)
gt3$heights[4:5] <- as.list(maxHeight)

# Combine the scatterplot with the two marginal density plots
# Create a new gtable
gt <- gtable(widths = unit(c(7, 2), "null"), heights = unit(c(2, 7), "null"))

# Insert gt1, gt2 and gt3 into the new gtable
gt <- gtable_add_grob(gt, gt1, 2, 1)
gt <- gtable_add_grob(gt, gt2, 1, 1)
gt <- gtable_add_grob(gt, gt3, 2, 2)

# And render the plot
grid.newpage()
grid.draw(gt)

grid.rect(x = 0.5, y = 0.5, height = 0.995, width = 0.995, default.units = "npc",
    gp = gpar(col = "black", fill = NA, lwd = 1))
```
