# Vertical Histogram

February 26, 2014

(This article was first published on uu kk, and kindly contributed to R-bloggers)

While munging data for my current project, I needed to compare visually two modes within the same dataset. I was using a simple scatterplot and setting the alpha (transparency) in the hope that the over-plotting would indicate which was the major mode. Unfortunately, the sheer size of the data overwhelmed this approach.

I wanted to use only a single image, and it was important to keep the scatterplot because it shows other features of the data. I started looking for a way to combine a histogram (rotated 90 degrees) with the scatterplot to help describe the density within the plot. A quick search for how to do this in R turned up nothing, so I decided to implement my own version of such a plot.

Certainly, there are other ways to describe the features that I am trying to present here, but in this particular case the following code worked out nicely. Hopefully it proves useful to others as well.

```r
plot.vertical.hist <- function(data, breaks=500) {
    agg <- aggregate(data$Y, by=list(xs=data$X), FUN=mean)
    hs <- hist(agg$x / 10000, breaks=breaks, plot=FALSE)

    old.par <- par(no.readonly=TRUE)
    mar.default <- par('mar')
    mar.left <- mar.default
    mar.right <- mar.default
    mar.left[4] <- 0
    mar.right[2] <- 0

    # Main plot
    par (fig=c(0,0.8,0,1.0), mar=mar.left)
    plot (agg$xs, agg$x / 10000,
          xlab="X", ylab="Y",
          main="Vertical Histogram Side Plot",
          pch=19, col=rgb(0.5,0.5,0.5,alpha=0.5))
    grid ()

    # Vertical histogram of the same data
    par (fig=c(0.8,1.0,0.0,1.0), mar=mar.right, new=TRUE)
    plot (NA, type='n', axes=FALSE, yaxt='n',
          xlab='Frequency', ylab=NA, main=NA,
          xlim=c(0,max(hs$counts)),
          ylim=c(1,length(hs$counts)))
    axis (1)
    arrows(rep(0,length(hs$counts)),1:length(hs$counts),
           hs$counts,1:length(hs$counts),
           length=0,angle=0)

    par(old.par)
    invisible ()
}
```
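To try the function without the original dataset, something like the following may help. It is only a sketch: the data frame is synthetic and its sizes, means, and spreads are made up for illustration. The one thing taken from the function itself is its input contract, namely columns named `X` and `Y`, with `Y` on a scale where dividing by 10000 makes sense.

```r
# Synthetic stand-in for the real data (an assumption, not the author's data):
# unique X values and a Y with one major and one minor mode.
set.seed(42)
data <- data.frame(
    X = 1:5000,
    Y = c(rnorm(3500, mean = 20000, sd = 3000),   # major mode
          rnorm(1500, mean = 60000, sd = 3000))   # minor mode
)

# The same two preparation steps the function runs before plotting:
agg <- aggregate(data$Y, by = list(xs = data$X), FUN = mean)  # mean Y per X
hs  <- hist(agg$x / 10000, breaks = 50, plot = FALSE)         # bin the means

# Draw the plot only if plot.vertical.hist has been defined in this session
if (exists("plot.vertical.hist")) plot.vertical.hist(data, breaks = 50)
```

Because `aggregate` is called with a bare vector, the averaged column comes back named `x` and the grouping column `xs`, which is why the function refers to `agg$x` and `agg$xs`.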

Results look similar to the following:

Initially, I experimented with rug and barplot(…, horiz=TRUE). Unfortunately, rug didn’t seem to be available on the left or right side, and in any case it would suffer from the same over-plotting problem as the alpha settings. With barplot, I was unable to get the bars aligned with the scatterplot.
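On the alignment question, one possible approach (a sketch, not what the code in this post does) is to skip barplot entirely and draw the bins with `rect` in data units, giving both panels the same `ylim` and setting `yaxs = "i"` so that R does not pad the axis range. The function name `plot.aligned.hist` and its arguments are hypothetical.

```r
# Hypothetical helper (not from the original post): scatterplot with a
# right-hand histogram whose bins share the scatterplot's y scale.
plot.aligned.hist <- function(x, y, breaks = 50) {
    hs <- hist(y, breaks = breaks, plot = FALSE)
    ylim <- range(hs$breaks)            # one y range used by both panels

    old.par <- par(no.readonly = TRUE)
    on.exit(par(old.par))               # restore settings even on error
    mar <- par("mar")

    # Main scatterplot; yaxs = "i" stops R from padding the y range
    par(fig = c(0, 0.8, 0, 1), mar = replace(mar, 4, 0))
    plot(x, y, ylim = ylim, yaxs = "i", xlab = "X", ylab = "Y",
         pch = 19, col = rgb(0.5, 0.5, 0.5, alpha = 0.5))

    # Side panel: one rectangle per bin, positioned by the bin edges
    par(fig = c(0.8, 1, 0, 1), mar = replace(mar, 2, 0), new = TRUE)
    plot(NA, type = "n", axes = FALSE, xlab = "Frequency", ylab = NA,
         xlim = c(0, max(hs$counts)), ylim = ylim, yaxs = "i")
    axis(1)
    n <- length(hs$counts)
    rect(0, hs$breaks[-(n + 1)], hs$counts, hs$breaks[-1],
         col = "grey", border = NA)

    invisible(hs)
}
```

Since both panels use the same vertical extent in `fig`, the same top and bottom margins, and the same `ylim`, each bar should sit at the height of the points it counts. Calling it as `plot.aligned.hist(x, y)` with two numeric vectors of equal length is enough to try it out.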
