R Tutorial: Add confidence intervals to dotchart

May 15, 2011

(This article was first published on Maximize Productivity with Industrial Engineer and Operations Research Tools, and kindly contributed to R-bloggers)

Recently I was working on a data visualization project. I wanted to visualize summary statistics for each category of the data. Specifically, I wanted to show the mean of each category together with an interval that conveys how widely the data are dispersed.

R is my tool of choice for data visualization. My audience was a general one, so I didn't want to use boxplots or other density-style plots. I wanted a simple mean with a roughly 95% interval around it, which I approximated as the mean plus or minus 2 standard deviations. (Strictly speaking, that band describes the spread of the observations, assuming roughly normal data. A confidence interval for the mean itself would use the standard error, sd/sqrt(n), and would be narrower.) My method of choice was the dotchart function. Yet that function only shows the data points and not the dispersion of the data, so I needed to layer in the intervals.
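The "2 standard deviations" rule of thumb and a 95% confidence interval for the mean are two different quantities, and it can help to see them side by side. The following is a minimal sketch on the "mtcars" data, not part of the original script. The t-based interval is one common textbook choice and is an assumption here, and the names grp, spread, se, tval and ci are only illustrative.

```r
# Group mpg by number of cylinders (mtcars ships with R)
grp <- split(mtcars$mpg, mtcars$cyl)
n   <- sapply(grp, length)   # observations per group
m   <- sapply(grp, mean)     # group means
s   <- sapply(grp, sd)       # group standard deviations

# Spread of the observations: mean +/- 2 standard deviations
spread <- data.frame(LL = m - 2 * s, UL = m + 2 * s)

# Confidence interval for the mean: mean +/- t quantile * standard error
se   <- s / sqrt(n)
tval <- qt(0.975, df = n - 1)
ci   <- data.frame(LL = m - tval * se, UL = m + tval * se)

print(round(cbind(n = n, mean = m, sd = s), 2))
print(round(spread, 2))
print(round(ci, 2))
```

For these groups the interval for the mean is narrower than the mean +/- 2 sd band, because the standard error shrinks with the square root of the group size.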

The great thing about R is that plots are built up in layers. A high-level function such as dotchart creates the plot, and I can then add to it as I see fit. This holds for most of the base plotting functions in R. I knew that I could use the lines function to add lines to an existing plot. This method worked great for my simple plot and adds another tool to my R toolbox.
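As a rough illustration of that layering idea, the sketch below starts a plot with one high-level call and then adds to it with low-level calls. The values in est, lower and upper are made up purely for illustration, and draw_layers is a hypothetical helper name. It uses segments, a vectorized relative of lines, so no loop is needed.

```r
# Hypothetical summary values, used only to illustrate layering
est   <- c(a = 10, b = 14, c = 12)
lower <- est - c(2, 3, 1)
upper <- est + c(2, 3, 1)

draw_layers <- function() {
  # High-level call: starts a new plot and sets up the coordinates
  dotchart(est, labels = names(est), xlim = range(lower, upper), pch = 19)
  # Low-level calls: draw on top of the plot that already exists.
  # For an ungrouped dotchart, item i sits at y = i.
  y <- seq_along(est)
  segments(x0 = lower, y0 = y, x1 = upper, y1 = y)
  abline(v = mean(est), lty = 2)   # dashed reference line at the overall mean
  invisible(y)                     # return the y positions that were used
}

# Only draw automatically in an interactive session
if (interactive()) draw_layers()
```

Each low-level call only adds ink to the current plot, so the order of the calls determines what is drawn on top of what.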

Here is the example R script for a dotchart with confidence intervals. It uses the "mtcars" dataset that ships with every R installation.


### Create data frame with mean and std dev of mpg for each cylinder count
x <- data.frame(mean=tapply(mtcars$mpg, mtcars$cyl, mean), sd=tapply(mtcars$mpg, mtcars$cyl, sd))

### Add lower and upper limits of the intervals (mean +/- 2 std dev)
x$LL <- x$mean-2*x$sd
x$UL <- x$mean+2*x$sd

### Plot dotchart with the intervals layered on top

title <- "MPG by Num. of Cylinders with Approx. 95% Intervals (Mean +/- 2 SD)"

### Round the x-axis limits outward to the nearest 10 so every interval fits
dotchart(x$mean, labels=rownames(x), color="blue", xlim=c(floor(min(x$LL)/10)*10, ceiling(max(x$UL)/10)*10), main=title, xlab="Miles per gallon")

### Draw one horizontal line per category; dotchart places row i at y = i
for (i in seq_len(nrow(x))){
    lines(x=c(x$LL[i],x$UL[i]), y=c(i,i))
}
grid()






And here is the example of the finished product.
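If the same chart is needed for other data, the steps of the script can be wrapped in a function. The sketch below is one possible way to do that, not the author's code. dotchart_intervals is a hypothetical helper name, k is an assumed parameter for the number of standard deviations, and arrows with angle = 90 is used in place of lines so that each interval gets flat end caps.

```r
# dotchart_intervals() is a hypothetical helper, not a base R function
dotchart_intervals <- function(values, groups, k = 2, ...) {
  # Summarise each group: mean, standard deviation, mean +/- k * sd
  m <- tapply(values, groups, mean)
  s <- tapply(values, groups, sd)
  out <- data.frame(mean = as.vector(m), sd = as.vector(s),
                    row.names = names(m))
  out$LL <- out$mean - k * out$sd
  out$UL <- out$mean + k * out$sd

  # Dots for the group means, labelled by group
  dotchart(out$mean, labels = rownames(out), pch = 19,
           xlim = range(out$LL, out$UL), ...)

  # Interval bars: angle = 90 makes flat caps, code = 3 caps both ends
  y <- seq_len(nrow(out))
  arrows(out$LL, y, out$UL, y, angle = 90, code = 3, length = 0.05)

  invisible(out)   # return the summary table for further use
}

# Only draw automatically in an interactive session
if (interactive()) {
  dotchart_intervals(mtcars$mpg, mtcars$cyl,
                     xlab = "Miles per gallon",
                     main = "MPG by number of cylinders")
}
```

Extra arguments such as main and xlab are passed straight through to dotchart, and the returned data frame holds the same mean, sd, LL and UL columns as the script above.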
