(This article was first published on **Sandy Muspratt's R Blog**, and kindly contributed to R-bloggers)

### Using R and ggplot2 to draw a scatterplot with the two marginal boxplots

Drawing a scatterplot with the marginal boxplots (or marginal histograms or marginal density plots) has always been a bit tricky (well for me anyway). The approach I take here is, first, to draw the three separate plots using ggplot2:

- the scatterplot;
- the horizontal boxplot to appear in the top margin;
- the vertical boxplot to appear in the right margin;

then second, to set widths and heights of the spaces used for axis and tick mark labels, and to combine the three plots using functions from the gtable package. The difficulty has been to ensure that the tick mark labels in the scatterplot panel and in the top marginal boxplot panel take up the same space. Functions from the gtable package make this a reasonably straightforward process.

To draw the following chart, I borrowed and modified code from here and here. The final code and data are available on GitHub.

### Drawing the plot

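This example uses the `mtcars` dataframe, available in base R. For convenience, the file mtcars marginal boxplots.R on GitHub contains all the code. First, load the ggplot2, gtable and grid packages (grid supplies `unit.pmax()` and the `grid.*()` drawing functions used later) and the mtcars dataframe.

```r
library(ggplot2)
library(gtable)
library(grid)    # unit.pmax(), grid.newpage(), grid.draw(), grid.rect(), gpar()
data(mtcars)
```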

#### Draw the scatterplot

The plot margins are adjusted so that the spaces between the panels are reduced. Also, there is an ever-so-slight mismatch of the gridlines across the panels. The way to fix it is to remove the offset on each axis (`expand = c(0, 0)`), then select an offset of your choice (`expand_limits(...)`). There are similar adjustments made to the marginal plots.

```r
p1 <- ggplot(mtcars, aes(mpg, hp)) +
  geom_point() +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  expand_limits(y = c(min(mtcars$hp) - 0.1 * diff(range(mtcars$hp)),
                      max(mtcars$hp) + 0.1 * diff(range(mtcars$hp)))) +
  expand_limits(x = c(min(mtcars$mpg) - 0.1 * diff(range(mtcars$mpg)),
                      max(mtcars$mpg) + 0.1 * diff(range(mtcars$mpg)))) +
  theme(plot.margin = unit(c(0.2, 0.2, 0.5, 0.5), "lines"))
```
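The same range arithmetic (the minimum less 10% of the span, the maximum plus 10% of the span) is repeated for every axis in this post. As a minimal sketch, it could be wrapped in a small helper. The name `pad_range` and its `frac` argument are my own and do not appear in the original code.

```r
# Hypothetical helper (not part of the original post): widen the range of a
# numeric vector by a fraction of its span on both sides.
pad_range <- function(x, frac = 0.1) {
  r <- range(x, na.rm = TRUE)
  r + c(-1, 1) * frac * diff(r)
}

# Padded limits for the two variables used in the scatterplot
pad_range(mtcars$mpg)
pad_range(mtcars$hp)
```

With such a helper, a call like `expand_limits(x = pad_range(mtcars$mpg), y = pad_range(mtcars$hp))` would give the same limits as the longer expressions above.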

#### Draw the marginal boxplots

Note that the margins and axis offsets are adjusted to match those in the scatterplot. Also, the tick mark labels and axis titles for the x-axis and the y-axis are removed.

```r
# Horizontal marginal boxplot - to appear at the top of the chart
p2 <- ggplot(mtcars, aes(x = factor(1), y = mpg)) +
  geom_boxplot(outlier.colour = NA) +
  geom_jitter(position = position_jitter(width = 0.05)) +
  scale_y_continuous(expand = c(0, 0)) +
  expand_limits(y = c(min(mtcars$mpg) - 0.1 * diff(range(mtcars$mpg)),
                      max(mtcars$mpg) + 0.1 * diff(range(mtcars$mpg)))) +
  coord_flip() +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        plot.margin = unit(c(1, 0.2, -0.5, 0.5), "lines"))

# Vertical marginal boxplot - to appear at the right of the chart
p3 <- ggplot(mtcars, aes(x = factor(1), y = hp)) +
  geom_boxplot(outlier.colour = NA) +
  geom_jitter(position = position_jitter(width = 0.05)) +
  scale_y_continuous(expand = c(0, 0)) +
  expand_limits(y = c(min(mtcars$hp) - 0.1 * diff(range(mtcars$hp)),
                      max(mtcars$hp) + 0.1 * diff(range(mtcars$hp)))) +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        plot.margin = unit(c(0.2, 1, 0.5, -0.5), "lines"))
```

#### Get the gtables for the three plots

```r
gt1 <- ggplot_gtable(ggplot_build(p1))
gt2 <- ggplot_gtable(ggplot_build(p2))
gt3 <- ggplot_gtable(ggplot_build(p3))
```
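To see what a gtable holds, it can help to look at its layout table. The sketch below builds a throwaway plot (the object names `p`, `gt` and `axis_col` are mine) and looks up the column holding the left axis by name. I am assuming the layout contains an entry called `axis-l`. The position of that column can differ between ggplot2 versions, so checking it this way before relying on a fixed index seems prudent.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(mpg, hp)) + geom_point()

# ggplotGrob() is ggplot2's shorthand for ggplot_gtable(ggplot_build(p))
gt <- ggplotGrob(p)

# Each row of the layout table names a grob and the cells it occupies
head(gt$layout[, c("name", "t", "l", "b", "r")])

# Column index of the left axis, looked up by name rather than hard-coded
axis_col <- gt$layout$l[gt$layout$name == "axis-l"]
axis_col

# Width of the space reserved for the y-axis tick mark labels
gt$widths[axis_col]
```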

#### Set the maximum widths and heights for x-axis and y-axis titles and text

The gtables store information required to draw the plots, including the widths of the spaces occupied by the y-axis titles and tick mark labels. The code gets the maximum widths of these spaces for the scatterplot and the horizontal marginal boxplot (gt1 and gt2), then sets that maximum as the width in the two gtables. So that there are no problems with the vertical alignment of the scatterplot and the vertical marginal boxplot, the heights are similarly set for gt1 and gt3.

```r
# Get maximum widths and heights
maxWidth <- unit.pmax(gt1$widths[2:3], gt2$widths[2:3])
maxHeight <- unit.pmax(gt1$heights[4:5], gt3$heights[4:5])

# Set the maximums in the gtables for gt1, gt2 and gt3
gt1$widths[2:3] <- as.list(maxWidth)
gt2$widths[2:3] <- as.list(maxWidth)
gt1$heights[4:5] <- as.list(maxHeight)
gt3$heights[4:5] <- as.list(maxHeight)
```
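For readers unfamiliar with `unit.pmax()`, here is a small stand-alone illustration using made-up widths in centimetres rather than the widths taken from the plots. It shows only that the function compares two unit vectors position by position and keeps the larger value at each position.

```r
library(grid)

# Two illustrative sets of widths (values invented for this example)
a <- unit(c(1, 3), "cm")
b <- unit(c(2, 1), "cm")

# Parallel maximum: element 1 of a against element 1 of b, and so on
m <- unit.pmax(a, b)
m

# Resolve the result to plain numbers in centimetres
convertWidth(m, "cm", valueOnly = TRUE)
```

Applied to the gtables, the same comparison picks, for each of the two label columns, whichever plot needs more room.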

#### Combine the scatterplot with the two marginal boxplots

The following code creates a new gtable (gt), inserts the modified gt1, gt2 and gt3 into the new gtable, then renders the plot according to the information stored in the new gtable. Finally, a box is drawn around the combined plot.

```r
# Create a new gtable: 2 columns (7:1) by 2 rows (1:7)
gt <- gtable(widths = unit(c(7, 1), "null"), heights = unit(c(1, 7), "null"))

# Insert gt1, gt2 and gt3 into the new gtable (arguments are row, then column)
gt <- gtable_add_grob(gt, gt1, 2, 1)
gt <- gtable_add_grob(gt, gt2, 1, 1)
gt <- gtable_add_grob(gt, gt3, 2, 2)

# And render the plot
grid.newpage()
grid.draw(gt)

# Draw a box around the combined plot
grid.rect(x = 0.5, y = 0.5, height = 0.995, width = 0.995, default.units = "npc",
          gp = gpar(col = "black", fill = NA, lwd = 1))
```
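The placement logic may be easier to follow with plain rectangles standing in for the three plots. This is only a sketch: the object name `layout_demo`, the helper `box()` and the grob names are invented for illustration.

```r
library(grid)
library(gtable)

# 2 x 2 layout: a wide left column and a tall bottom row, as in the post
layout_demo <- gtable(widths = unit(c(7, 1), "null"),
                      heights = unit(c(1, 7), "null"))

# Placeholder grobs stand in for the three plots (illustrative only)
box <- function(fill) rectGrob(gp = gpar(fill = fill, col = "white"))

# t is the row and l is the column of the cell that receives the grob
layout_demo <- gtable_add_grob(layout_demo, box("grey40"), t = 2, l = 1, name = "scatter")
layout_demo <- gtable_add_grob(layout_demo, box("grey60"), t = 1, l = 1, name = "top")
layout_demo <- gtable_add_grob(layout_demo, box("grey80"), t = 2, l = 2, name = "right")

# The layout table records which cell each grob occupies
layout_demo$layout[, c("name", "t", "l")]

grid.newpage()
grid.draw(layout_demo)
```

The top-right cell (row 1, column 2) is left empty, which is why the combined chart has a blank corner above the right-hand marginal plot.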

Similar logic applies to the drawing of marginal density plots. The code shown below is also available in the file mtcars marginal density plots.R on GitHub.

```r
# Main scatterplot
p1 <- ggplot(mtcars, aes(mpg, hp)) +
  geom_point() +
  scale_x_continuous(expand = c(0, 0)) +
  scale_y_continuous(expand = c(0, 0)) +
  expand_limits(y = c(min(mtcars$hp) - 0.1 * diff(range(mtcars$hp)),
                      max(mtcars$hp) + 0.1 * diff(range(mtcars$hp)))) +
  expand_limits(x = c(min(mtcars$mpg) - 0.1 * diff(range(mtcars$mpg)),
                      max(mtcars$mpg) + 0.1 * diff(range(mtcars$mpg)))) +
  theme(plot.margin = unit(c(0.2, 0.2, 0.5, 0.5), "lines"))

# Horizontal marginal density plot - to appear at the top of the chart
p2 <- ggplot(mtcars, aes(x = mpg)) +
  geom_density() +
  scale_x_continuous(expand = c(0, 0)) +
  expand_limits(x = c(min(mtcars$mpg) - 0.1 * diff(range(mtcars$mpg)),
                      max(mtcars$mpg) + 0.1 * diff(range(mtcars$mpg)))) +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        plot.margin = unit(c(1, 0.2, -0.5, 0.5), "lines"))

# Vertical marginal density plot - to appear at the right of the chart
p3 <- ggplot(mtcars, aes(x = hp)) +
  geom_density() +
  scale_x_continuous(expand = c(0, 0)) +
  expand_limits(x = c(min(mtcars$hp) - 0.1 * diff(range(mtcars$hp)),
                      max(mtcars$hp) + 0.1 * diff(range(mtcars$hp)))) +
  coord_flip() +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        plot.margin = unit(c(0.2, 1, 0.5, -0.5), "lines"))

# Get the gtables
gt1 <- ggplot_gtable(ggplot_build(p1))
gt2 <- ggplot_gtable(ggplot_build(p2))
gt3 <- ggplot_gtable(ggplot_build(p3))

# Get maximum widths and heights for x-axis and y-axis title and text
maxWidth <- unit.pmax(gt1$widths[2:3], gt2$widths[2:3])
maxHeight <- unit.pmax(gt1$heights[4:5], gt3$heights[4:5])

# Set the maximums in the gtables for gt1, gt2 and gt3
gt1$widths[2:3] <- as.list(maxWidth)
gt2$widths[2:3] <- as.list(maxWidth)
gt1$heights[4:5] <- as.list(maxHeight)
gt3$heights[4:5] <- as.list(maxHeight)

# Combine the scatterplot with the two marginal density plots

# Create a new gtable: the marginal panels get more room (7:2) than for boxplots
gt <- gtable(widths = unit(c(7, 2), "null"), heights = unit(c(2, 7), "null"))

# Insert gt1, gt2 and gt3 into the new gtable
gt <- gtable_add_grob(gt, gt1, 2, 1)
gt <- gtable_add_grob(gt, gt2, 1, 1)
gt <- gtable_add_grob(gt, gt3, 2, 2)

# And render the plot
grid.newpage()
grid.draw(gt)

# Draw a box around the combined plot
grid.rect(x = 0.5, y = 0.5, height = 0.995, width = 0.995, default.units = "npc",
          gp = gpar(col = "black", fill = NA, lwd = 1))
```
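Marginal histograms, mentioned at the start of the post, are not worked through here, but presumably follow the same pattern. As an untested-against-the-original sketch, the top marginal plot might be swapped for something like the following. The object names `mpg_lim` and `p2_hist` and the choice of `bins = 10` are mine.

```r
library(ggplot2)
library(grid)  # for unit()

# Padded x limits, computed the same way as in the post
mpg_lim <- range(mtcars$mpg) + c(-1, 1) * 0.1 * diff(range(mtcars$mpg))

# Sketch of a top marginal histogram; bins = 10 is an arbitrary choice
p2_hist <- ggplot(mtcars, aes(x = mpg)) +
  geom_histogram(bins = 10, colour = "black", fill = NA) +
  scale_x_continuous(expand = c(0, 0)) +
  expand_limits(x = mpg_lim) +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        axis.ticks = element_blank(),
        plot.margin = unit(c(1, 0.2, -0.5, 0.5), "lines"))
```

The right-hand histogram would take `aes(x = hp)` plus `coord_flip()`, mirroring the density version, and the gtable steps would stay the same.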
