ggplot2: Quick Heatmap Plotting

January 25, 2010

(This article was first published on Learning R, and kindly contributed to R-bloggers)

A post on the FlowingData blog demonstrated how to quickly make a heatmap using R base graphics.

This post shows how to achieve a very similar result using ggplot2.


Data Import

FlowingData used last season’s NBA basketball statistics, and the CSV file with the data can be downloaded directly from its website.

> nba <- read.csv("")

The players are ordered by points scored, and the Name variable is converted to a factor so that the rows of the plot are sorted properly.

> nba$Name <- with(nba, reorder(Name, PTS))
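To see what reorder does here, the following is a minimal sketch on a made-up three-player data frame (the names and point values are hypothetical, not taken from the NBA file). It only illustrates that the factor levels end up sorted by the second argument, which is what later determines the row order on the plot.

```r
# Toy data: hypothetical players and points, for illustration only
toy <- data.frame(Name = factor(c("A", "B", "C")),
                  PTS  = c(20.1, 30.2, 25.3))

# Reorder the factor levels of Name by the value of PTS (ascending)
toy$Name <- with(toy, reorder(Name, PTS))

# The levels now follow points scored, not alphabetical order
print(levels(toy$Name))
```

Because ggplot2 draws discrete axes in the order of the factor levels, the player with the fewest points should end up at the bottom of the y-axis and the top scorer at the top.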

Whilst FlowingData uses the heatmap function from the stats package, which requires the plotted values to be in matrix format, ggplot2 operates on data frames. For ease of processing, the data frame is converted from wide format to long format.

The game statistics have very different ranges, so to make them comparable all the individual statistics are rescaled.

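> library(ggplot2)
> library(reshape2)  # provides melt()
> library(plyr)      # provides ddply()
> library(scales)    # provides rescale()
> nba.m <- melt(nba)
> nba.m <- ddply(nba.m, .(variable), transform,
+     rescale = rescale(value))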
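For readers who want to see what these two steps amount to, here is a rough base R sketch on a hypothetical toy data frame (the values are invented). It assumes that rescale is used with its default target range of 0 to 1, and the helper name rescale01 is made up for this illustration; it is not how melt or ddply are implemented.

```r
# Toy wide-format data: hypothetical values, one column per statistic
toy <- data.frame(Name = c("A", "B", "C"),
                  PTS  = c(20, 30, 25),
                  AST  = c(5, 7, 9))

# Wide -> long: one row per (Name, variable) pair
toy.m <- data.frame(Name     = rep(toy$Name, times = 2),
                    variable = rep(c("PTS", "AST"), each = nrow(toy)),
                    value    = c(toy$PTS, toy$AST))

# Min-max rescaling to [0, 1]; rescale01 is a hypothetical helper
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))

# Apply the rescaling separately within each variable
toy.m$rescale <- ave(toy.m$value, toy.m$variable, FUN = rescale01)

print(toy.m)
```

Each statistic now runs from 0 (lowest value in that column) to 1 (highest), so columns with very different units can share one colour scale.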


There is no specific heatmap plotting function in ggplot2, but combining geom_tile with a smooth gradient fill does the job very well.

> (p <- ggplot(nba.m, aes(variable, Name)) + geom_tile(aes(fill = rescale),
+     colour = "white") + scale_fill_gradient(low = "white",
+     high = "steelblue"))

A few finishing touches to the formatting, and the heatmap plot is ready for presentation.

> base_size <- 9
> p + theme_grey(base_size = base_size) + labs(x = "",
+     y = "") + scale_x_discrete(expand = c(0, 0)) +
+     scale_y_discrete(expand = c(0, 0)) + theme(legend.position = "none",
+     axis.ticks = element_blank(), axis.text.x = element_text(size = base_size *
+         0.8, angle = 330, hjust = 0, colour = "grey50"))

Rescaling Update

In preparing the data for the above plot all the variables were rescaled so that they were between 0 and 1.

Jim rightly pointed out in the comments (and I did not initially get it) that the heatmap function uses a different scaling method and therefore the plots are not identical. Below is an updated version of the heatmap which looks much more similar to the original.

> nba.s <- ddply(nba.m, .(variable), transform,
+     rescale = scale(value))
> last_plot() %+% nba.s
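The difference between the two scaling methods can be sketched in base R on a small made-up vector (the numbers are hypothetical). The min-max version maps values onto the range 0 to 1, whereas scale centres each variable on its mean and divides by its standard deviation.

```r
# Hypothetical values for a single statistic
x <- c(20, 30, 25)

# Min-max rescaling: result lies between 0 and 1
minmax <- (x - min(x)) / (max(x) - min(x))

# Standardisation with scale(): it returns a one-column matrix,
# so as.vector() is used here to get a plain numeric vector
zscore <- as.vector(scale(x))

# The same standardisation written out by hand
manual <- (x - mean(x)) / sd(x)

print(minmax)
print(zscore)
```

With standardised values the fill scale is no longer bounded by 0 and 1, and values below the column mean are negative, which is why the two plots differ in appearance.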
