Mimic Excel’s Conditional Formatting in R

[This article was first published on triKnowBits, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.


The DT package is an interface between R and the JavaScript DataTables library (RStudio DT documentation). In Example 3 (at this page) they show how to heatmap-format a table. This post modifies the example to
  1. format each column individually
  2. shade in green rather than red
  3. use base R syntax rather than piping
  4. omit the extra accoutrements of the displayed table (from the answer to this stackoverflow post), except
  5. include a title.
Here we generate data similar to that in Example 3, but with average values growing by column
set.seed(12345)
df = as.data.frame(
  cbind(round(rnorm(10, mean = 0), 3), 
  round(rnorm(10, mean = 4), 3), 
  round(rnorm(10, mean = 8), 3), 
  round(rnorm(10, mean = 16), 3), 
  round(rnorm(10, mean = 32), 3), 
  sample(0:1, 10, TRUE)))
Using the code in the example — modified to green — the darker values naturally appear in columns V4 and V5.


But that’s not what we want.

For each column to have it’s own scale, simply apply RStudio’s algorithm to each column of df in a loop. The trick to notice is that formatStyle wants a datatable object as its first argument, and produces a datatable object as its result. Therefore, start off with a plain-Jane datatable and successively format each column, saving the result each time. Almost like building a ggplot. At the end, view the final result.
# Start with a (relatively) plain, unformatted datatable object
dt <- DT::datatable(df, 
                    options = list(dom = 't', ordering = FALSE),
                    caption = "Example 3 By Column")
# Loop through the columns formatting according to that column's distribution
for (j in seq_along(df)) {
  # Create breaks for shading column values high to low
  brks <- stats::quantile(x <- df[[j]], probs = seq(.05, .95, .05), na.rm = TRUE)
  # Create shades of green for backgrounds
  y <- round(seq(255, 40, length.out = length(brks) + 1), 0)
  clrs <- paste0("rgb(", y, ", 255,", y, ")")
  # Format cells in j-th column
  dt <- DT::formatStyle(dt, j, backgroundColor = DT::styleInterval(brks, clrs))
}
dt

Actuaries in the crowd might recognize the image at the top of the post as the table of link ratios from the GenIns dataset in the ChainLadder package. There do not appear to be any distinctive trends in the ratios by age.




To leave a comment for the author, please follow the link and comment on their blog: triKnowBits.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)