# Split violin plots

**Ecology in silico**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Violin plots are useful for comparing distributions. When data are grouped by a factor with two levels (e.g. males and females), you can split the violins in half to see the difference between groups. Consider a 2 x 2 factorial experiment: treatments A and B are crossed with groups 1 and 2, with N=1000.

1 2 3 4 5 6 7 8 9 | # Simulate data n.each <- 1000 A1 <- rnorm(n.each, 2, 1) A2 <- rnorm(n.each, 1.5, 2) B1 <- rnorm(n.each, 4, 1.5) B2 <- rnorm(n.each, 0, 1) values <- c(A1, A2, B1, B2) treatment <- rep(c("A", "B"), each=n.each*2) group <- rep(c(1, 2, 1, 2), each=n.each) |

Boxplots are often used:

1 2 | par(bty="n") boxplot(values ~ group*treatment, main="Box plot", col=rep(c("purple", "lightblue"), 2)) |

This gives us a rough comparison of the distribution in each group, but sometimes it’s nice to visualize the kernel density estimates instead.

I recently ran into this issue and tweaked the vioplot() function from
the vioplot
package by Daniel Adler to make split violin plots.
With vioplot2(), the `side`

argument specifies whether to plot the density on “both”, the “left”, or
the “right” side.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | require(vioplot) require(devtools) require(digest) source_gist("https://gist.github.com/mbjoseph/5852613") plot(x=NULL, y=NULL, xlim = c(0.5, 2.5), ylim=c(min(values), max(values)), type="n", ann=FALSE, axes=F) axis(1, at=c(1, 2), labels=c("A", "B")) axis(2) for (i in unique(treatment)) { for (j in unique(group)){ vioplot2(values[which(treatment == i & group == j)], at = ifelse(i == "A", 1, 2), side = ifelse(j == 1, "left", "right"), col = ifelse(j == 1, "purple", "lightblue"), add = T) } } title("Violin plot", xlab="Treatment") legend("bottomright", fill = c("purple", "lightblue"), legend = c("Group 1", "Group 2"), box.lty=0) |

Last but not least, Peter Kampstra’s beanplot package uses beanplot() to make split density plots, but 1) plots a rug rather than a quantile box, 2) includes a line for the overall mean or median, and 3) makes it easier to change the kernel function.

1 2 3 4 5 6 7 8 9 | require(beanplot) beanplot(values ~ group*treatment, ll = 0.04, main = "Bean plot", side = "both", xlab="Treatment", col = list("purple", c("lightblue", "black")), axes=F) axis(1, at=c(1, 2), labels=c("A", "B")) axis(2) legend("bottomright", fill = c("purple", "lightblue"), legend = c("Group 1", "Group 2"), box.lty=0) |

There are more ways than one to skin a cat, and what one uses will probably come to personal preference.

**leave a comment**for the author, please follow the link and comment on their blog:

**Ecology in silico**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.