Adding Measures of Central Tendency to Histograms in R

October 4, 2012

(This article was first published on Mollie's Research Blog, and kindly contributed to R-bloggers)

Building on the basic histogram with a density plot, we can add measures of central tendency (in this case, mean and median) and a legend.

Like last time, we’ll use the beaver data from the datasets package.

hist(beaver1$temp, # histogram
     col = "peachpuff", # fill color of the bars
     border = "black", # outline color of the bars
     prob = TRUE, # show densities instead of frequencies
     xlim = c(36, 38.5),
     ylim = c(0, 3),
     xlab = "Temperature",
     main = "Beaver #1")
lines(density(beaver1$temp), # density plot
      lwd = 2, # thickness of line
      col = "chocolate3")

Next we’ll add a line for the mean:

abline(v = mean(beaver1$temp), # vertical line at the mean
       col = "royalblue",
       lwd = 2)

And a line for the median:

abline(v = median(beaver1$temp), # vertical line at the median
       col = "red",
       lwd = 2)

Finally, we can add a legend so it is easy to tell which line is which.

legend(x = "topright", # location of legend within plot area
       legend = c("Density plot", "Mean", "Median"),
       col = c("chocolate3", "royalblue", "red"),
       lwd = c(2, 2, 2))
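
As an optional variation, the legend labels can carry the computed values so the reader does not have to estimate them from the axis. This is a minimal sketch, not part of the original code: the variable names temp, m, med, and labels are assumptions chosen for illustration, and the values are rounded to two decimals with sprintf().

```r
# Sketch: same plot as above, with the mean and median printed in the legend.
# temp, m, med, and labels are illustrative names, not from the original post.
temp <- datasets::beaver1$temp
m    <- mean(temp)
med  <- median(temp)

# Build legend labels that include the rounded values
labels <- c("Density plot",
            sprintf("Mean (%.2f)", m),
            sprintf("Median (%.2f)", med))

hist(temp, col = "peachpuff", border = "black", prob = TRUE,
     xlim = c(36, 38.5), ylim = c(0, 3),
     xlab = "Temperature", main = "Beaver #1")
lines(density(temp), lwd = 2, col = "chocolate3")
abline(v = m, col = "royalblue", lwd = 2)
abline(v = med, col = "red", lwd = 2)
legend(x = "topright", legend = labels,
       col = c("chocolate3", "royalblue", "red"), lwd = 2)
```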

All of this together gives us the following graphic:

In this example, the mean and median are very close, as we can see by using mean() and median().

> mean(beaver1$temp)
[1] 36.86219
> median(beaver1$temp)
[1] 36.87
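
A note on names: base R's mode() reports how an object is stored, not the statistical mode, so it is not a third measure of central tendency here. The sketch below shows the difference and one possible way to find the most frequent value. stat_mode is a hypothetical helper written for this example, not a base R function, and it returns every value tied for the highest count.

```r
# stat_mode is a hypothetical helper, not part of base R.
stat_mode <- function(x) {
  counts <- table(x)                            # frequency of each distinct value
  top <- names(counts)[counts == max(counts)]   # all values tied for most frequent
  as.numeric(top)
}

temps <- datasets::beaver1$temp

mode(temps)                  # storage mode of the vector: "numeric"
stat_mode(temps)             # most frequent temperature(s)
mean(temps) - median(temps)  # signed gap between mean and median
```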

As in the previous post, we can graph beaver1 and beaver2 together by adding a layout() call and changing the x and y limits. The full code for this is available in a gist.
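
For readers without the gist at hand, here is a minimal sketch of the side-by-side idea. It is not the gist's code: plot_temp is a hypothetical helper, the shared x limits are derived from the data with range(), and the y limits of c(0, 3) are simply carried over from the single plot as an assumption.

```r
# Sketch only; plot_temp is a hypothetical helper, not the code from the gist.
plot_temp <- function(temp, title, xlim, ylim) {
  hist(temp, col = "peachpuff", border = "black", prob = TRUE,
       xlim = xlim, ylim = ylim, xlab = "Temperature", main = title)
  lines(density(temp), lwd = 2, col = "chocolate3")   # density curve
  abline(v = mean(temp), col = "royalblue", lwd = 2)  # mean
  abline(v = median(temp), col = "red", lwd = 2)      # median
  legend(x = "topright", legend = c("Density plot", "Mean", "Median"),
         col = c("chocolate3", "royalblue", "red"), lwd = 2)
}

b1 <- datasets::beaver1$temp
b2 <- datasets::beaver2$temp

xlim <- range(c(b1, b2))  # shared x limits so the two panels are comparable
ylim <- c(0, 3)           # assumed; adjust if a density curve is clipped

layout(matrix(1:2, nrow = 1))  # one row, two panels
plot_temp(b1, "Beaver #1", xlim, ylim)
plot_temp(b2, "Beaver #2", xlim, ylim)
layout(1)                      # back to a single panel
```

Using the same limits in both panels keeps the two distributions visually comparable, which is the reason for changing the x and y limits in the first place.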

Here’s the output from that code:
