Longitudinal heat plots
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
During our research on the effect of prednisone consumption during pregency on health outcomes of the baby (Palmsten K, Rolland M, Hebert MF, et al., Patterns of prednisone use during pregnancy in women with rheumatoid arthritis: Daily and cumulative dose. Pharmacoepidemiol Drug Saf. 2018 Apr;27(4):430-438. https://www.ncbi.nlm.nih.gov/pubmed/29488292) we developed a custom plot to visualize for each patient their daily and cumulative consumption of prednisone during pregenancy. Since the publication these plots have raised some interest so here is the code used to produce them.
Data needs to be in the following format:
1 line per patient
1 column for the patient ID (named id) and then 1 column per unit of time (here days) reporting the measure for that day
To illustrate the type of data we dealt with I first generate a random dataset containing 25 patients followed for n days (n different for each patient, randomly chosen between 50 and 200) with a daily consumption value randomly selected between 0 and a maximum dose (randomly determined for each patient between 10 and 50 ).
Then we compute the cumulated consumption ie sum of all previous days.
# initial parameters for simulating data n_indiv <- 25 min_days <- 50 max_days <- 200 min_dose <- 10 max_dose <- 50 # list of ids id_list <- str_c("i", 1:n_indiv) # intializing empty table my_data <- as.data.frame(matrix(NA, n_indiv, (max_days+1))) colnames(my_data) <- c("id", str_c("d", 1:max_days)) my_data$id <- id_list # daily simulated data set.seed(113) for(i in 1:nrow(my_data)){ # n days follow up n_days <- round(runif(1, min_days, max_days)) # maximum dose dose <- round(runif(1, min_dose, max_dose)) # random daily value my_data[i,2:ncol(my_data)] <- c(runif(n_days, 0, max_dose), rep(NA,(max_days-n_days))) } # cumulative simulated data my_cum_data <- my_data for(i in 3:ncol(my_cum_data)){ my_cum_data[[i]] <- my_cum_data[[i]] + my_cum_data[[i-1]] }
Our plots use the legend.col function found here: https://aurelienmadouasse.wordpress.com/2012/01/13/legend-for-a-continuous-color-scale-in-r/
Here is the longitudinal heat plot function:
# Color legend on the top long_heat_plot <- function(my_data, cutoff, xmax) { # my_data: longitudinal data with one line per individual, 1st column with id, and then one column per unit of time # cutoff: cutoff value for color plot, all values above cutoff are same color # xmax: x axis max value n_lines <- nrow(my_data) line_count <- 1 # color scale COLS <- rev(heat.colors(cutoff)) # plotting area par(oma=c(1,1,4,1), mar=c(2, 2, 2, 4), xpd=TRUE) # plot init plot(1,1,xlim=c(0,xmax), ylim=c(1, n_lines), pch='.', ylab='Individual',xlab='Time unit',yaxt='n', cex.axis = 0.8) # plot line for each woman one at a time for (i in 1 : n_lines) { # get id id1 <- my_data$id[i] # get trajectory data maxed at max_val id_traj <- my_data[my_data$id == id1, 2:ncol(my_data)] # get last day END <- max(which(!is.na(id_traj))) # plot dotted line x1 <- 1:xmax y1 <- rep(i,xmax) lines(x1, y1, lty=3) for (j in 1 : (ncol(my_data) - 1)) { # trim traj to max val val <- min(id_traj[j], cutoff) # plot traj points(j, i, col=COLS[val], pch=20, cex=1) } # add limit line points((END+1), i, pch="|", cex=0.9) } # add legend legend.col(col = COLS, lev = 1:cutoff) mtext(side=3, line = 3, 'unit of measurement') opar <- par() text_size <- opar$cex.main }
Then we generate the corresponding plots:
long_heat_plot(my_data, 50, 200)
long_heat_plot(my_cum_data, 5000, 200)
And here is what we did in our study:
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.