Using R: correlation heatmap, take 2

March 3, 2014
By

(This article was first published on There is grandeur in this view of life » R, and kindly contributed to R-bloggers)

Apparently, this turned out to be my most popular post ever.  Of course there are lots of things to say about the heatmap (or quilt, tile, guilt plot etc), but what I wrote was literally just a quick celebratory post to commemorate that I’d finally grasped how to combine reshape2 and ggplot2 to quickly make this colourful picture of a correlation matrix.

However, I realised there is one more thing that is really needed, even if just for the first quick plot one makes for oneself: a better scale. The default scale is not the best for correlations, which range from -1 to 1, because it’s hard to tell where zero is. We use the airquality dataset for illustration as it actually has some negative correlations. In ggplot2, it’s very easy to get a scale that has a midpoint and a different colour in each direction. It’s called scale_colour_gradient2, and we just need to add it. I also set the limits to -1 and 1, which doesn’t change the colour but fills out the legend for completeness. Done!

data <- airquality[,1:4]
library(ggplot2)
library(reshape2)
qplot(x=Var1, y=Var2, data=melt(cor(data, use="p")), fill=value, geom="tile") +
   scale_fill_gradient2(limits=c(-1, 1))

correlation_heatmap2


Postat i:computer stuff, data analysis, english Tagged: #blogg100, ggplot2, R, reshape2

To leave a comment for the author, please follow the link and comment on his blog: There is grandeur in this view of life » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.