Legendary Plots

March 12, 2011

(This article was first published on 4D Pie Charts » R, and kindly contributed to R-bloggers)

I was recently pointed in the direction of a thermal comfort model by the engineering company Arup (p27–28 of this pdf). Figure 3 at the top of p28 caught my attention.

Arup thermal comfort model, figure 3

It’s mostly a nice graph; there’s not too much junk in it. One thing that struck me was that there is an awful lot of information in the legend, and that I found it impossible to retain all that information while switching between the plot and the legend.

The best way to improve this plot then is to find a way to simplify the legend. Upon closer inspection, it seems that there is a lot of information that is repeated. For example, there are only two temperature combinations, and three levels of direct solar energy. Humidity and diffused solar energy are kept the same in all cases. That makes it really easy for us: our five legend options are

Outdoor temp (deg C)    Direct solar energy (W/m^2)
32                      700
32                      150
32                      500
29                      500
29                      150

Elsewhere we can explain that the mezzanine and platform temperatures are always 2 and 4 degrees higher than outdoors respectively, that the humidity is always 50%, and that the diffused solar energy is always 100 W/m^2.
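
To check that nothing is lost by shrinking the legend, here is a minimal sketch that rebuilds the full set of conditions from the five-row table and the fixed rules above. The column names are my own invention for illustration; they do not come from the Arup report.

```r
# The five cases that remain in the simplified legend.
legend_cases <- data.frame(
  outdoor_temp = c(32, 32, 32, 29, 29),
  direct_solar = c(700, 150, 500, 500, 150)
)

# Everything else follows from fixed rules, so it can be derived
# rather than repeated in every legend entry.
full_legend <- transform(
  legend_cases,
  mezzanine_temp = outdoor_temp + 2,  # always 2 degrees above outdoors
  platform_temp  = outdoor_temp + 4,  # always 4 degrees above outdoors
  humidity_pct   = 50,                # constant across all cases
  diffuse_solar  = 100                # constant across all cases
)
full_legend
```

Only the first two columns vary independently, which is why they are the only ones that need to appear in the legend.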

Living in Buxton, one of the coldest, rainiest towns in the UK, it amuses me to see that their “low” outdoor temperature is 29°C.

The other thing to note is that we have two variables mapped to the hue. For just five cases, this is just about acceptable, but it isn’t the best option and it won’t scale to many more categories. It’s generally considered best practice to work in HCL colour space when mapping variables to colours. I would be tempted to map temperature to hue; whether you pick red as hot and blue as cold or the other way around depends upon how many astronomers you have in your target audience. Then I’d map luminance (lightness) to solar energy: more sunlight = lighter line.

I don’t have the values to exactly recreate the dataset, but here are some made up numbers with the new legend. Notice the combined outdoor temp/direct solar energy variable.

time_points <- 0:27
n_time_points <- length(time_points)
n_cases <- 5
comfort_data <- data.frame(
  time = rep.int(time_points, n_cases),
  comfort = jitter(rep(-2:2, each = n_time_points)),
  outdoor.temperature = rep(
    c(32, 29),
    times = c(3 * n_time_points, 2 * n_time_points)
  ),
  direct.solar.energy = rep(
    c(700, 150, 500, 500, 150),
    each = n_time_points
  )
)
comfort_data$combined <- with(comfort_data,
  factor(paste(outdoor.temperature, direct.solar.energy, sep = ", "))
)

We manually pick the colours to use in HCL space (using str_detect to examine the factor levels).

library(stringr)
cols <- hcl(
  h = with(comfort_data, ifelse(str_detect(levels(combined), "29"), 0, 240)),
  c = 100,
  l = with(comfort_data,
    ifelse(str_detect(levels(combined), "150"), 20,
    ifelse(str_detect(levels(combined), "500"), 50, 80))
  )
)
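
An unnamed vector passed to scale_colour_manual is matched to the factor levels by position, so it can be worth checking which colour lands on which level. The following is a sketch of one way to do that in base R, assuming the five levels sort as shown; the helper names (combined_levels, luminance_by_solar, named_cols) are made up for this example.

```r
# The levels of the combined factor, in the sorted order factor() would use.
combined_levels <- sort(c("32, 700", "32, 150", "32, 500", "29, 500", "29, 150"))

# Split each level back into its temperature and solar energy parts.
parts <- strsplit(combined_levels, ", ", fixed = TRUE)
temperature <- vapply(parts, `[`, character(1), 1)
solar <- vapply(parts, `[`, character(1), 2)

# Same mapping as in the post: hue from temperature, luminance from solar energy.
luminance_by_solar <- c("150" = 20, "500" = 50, "700" = 80)
named_cols <- hcl(
  h = ifelse(temperature == "29", 0, 240),
  c = 100,
  l = unname(luminance_by_solar[solar])
)

# Naming the colours makes the level-to-colour mapping explicit.
names(named_cols) <- combined_levels
named_cols
```

A named vector like this can also be passed as the values argument of scale_colour_manual, in which case the colours are matched to levels by name rather than by position.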

Drawing the plot is very straightforward: it’s just a line plot.

library(ggplot2)
p <- ggplot(comfort_data, aes(time, comfort, colour = combined)) +
  geom_line(size = 2) +
  scale_colour_manual(
    name = expression(paste(
      "Outdoor temp (", degree, C, "), Direct solar (", W/m^2, ")"
    )),
    values = cols) +
  xlab("Time (minutes)") +
  ylab("Comfort")
p

My version of the plot, with an improved legend

Sensible people should stop here and write the additional detail in the figure caption. There is currently no sensible way of writing annotations outside of the plot area (annotate only works inside panels). The following hack was devised by Baptiste Auguie; read this forum thread for other variations.

library(gridExtra)
caption <- tableGrob(
  matrix(
    expression(
      paste(
        "Mezzanine temp is 2", degree, C, " warmer than outdoor temp"
      ),
      paste(
        "Platform temp is 4", degree, C, " warmer than outdoor temp"
      ),
      paste("Humidity is always 50%"),
      paste(
        "Diffused solar energy is always 100", W/m^2
      )
    )
  ),
  parse = TRUE,
  theme = theme.list(
    gpar.corefill = gpar(fill = NA, col = NA),
    core.just = "center"
  )
)
grid.arrange(p, sub = caption)

The additional information is included in the plot's subcaption
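
For readers on a later ggplot2 release (I believe 2.2.0 or newer; treat the version as an assumption), a plain-text caption can be attached with labs(caption = ...) and no extra packages. This is only a sketch with stand-in data and invented object names (demo_data, caption_text, p_captioned), and it uses plain "deg C" text rather than plotmath symbols. The gridExtra arguments used above may also have changed in later versions of that package.

```r
library(ggplot2)

# Made-up data standing in for comfort_data: two flat lines.
demo_data <- data.frame(
  time = rep(0:27, times = 2),
  comfort = rep(c(-1, 1), each = 28),
  combined = factor(rep(c("29, 150", "32, 700"), each = 28))
)

# One caption string, with a line break between each fixed condition.
caption_text <- paste(
  "Mezzanine temp is 2 deg C warmer than outdoor temp",
  "Platform temp is 4 deg C warmer than outdoor temp",
  "Humidity is always 50%",
  "Diffused solar energy is always 100 W/m^2",
  sep = "\n"
)

p_captioned <- ggplot(demo_data, aes(time, comfort, colour = combined)) +
  geom_line() +
  labs(caption = caption_text) +
  # Centre the caption under the plot, like the table-based version.
  theme(plot.caption = element_text(hjust = 0.5))
p_captioned
```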


Tagged: dataviz, ggplot2, keys, legends, r
