Was doing a little presentation to our research group and had to explain the difficulties of ‘collapsing’ longitudinal data into a single measure when the Y var is quite variable. For the particular Y var of interest, it represents burden of disease, so a high Y var for a long time is indicative of high risk, compared to a low value for a similar time. Hence you have issues using with the mean, or the AUC. There’s a lot more to it than that, but that’s the gist of the point of this graph. Sharing the code cause it might be useful to someone else at some point.
I wrote this post in RStudio using the R Markdown language and then knitr to turn in into markdown (.md), and then pandoc to turn it into html. The original file is available here on github.
system(“pandoc -s ggplot_post_text_example.md -o ggplot_post_text_example.html”)
Set up dummy data
library(ggplot2)
# Set up the data and text separately
dat <- data.frame(frame = c(rep("A", 6), rep("B", 2), rep("C", 11), rep("D",
11)), y = c(rep(10, 6), rep(10, 2), rep(5, 11), seq(5, 10, 0.5)), x = c(seq(13,
18, 1), seq(17, 18, 1), seq(8, 18, 1), seq(8, 18, 1)))
txt <- data.frame(label = c("Mean - 10", "AUC - 50", "Mean - 10", "AUC - 1",
"Mean - 5", "AUC - 50", "Mean - 7.5", "AUC - 75"), x = rep(17.5, 8), y = rep(c(13.5,
12), 4), frame = c(rep("A", 2), rep("B", 2), rep("C", 2), rep("D", 2)))
And here’s the plot
ggplot(data = dat, aes(x = x, ymax = y, ymin = 0)) + geom_ribbon(data = dat) +
facet_wrap(~frame) + scale_y_continuous(limits = c(-0.1, 14)) + scale_x_continuous(limits = c(5,
20)) + labs(y = "Y var", x = "X var") + geom_text(data = txt, aes(y = y,
label = label))
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series,ecdf, trading) and more...


Zero Inflated Models and Generalized Linear Mixed Models with R.
Zuur, Saveliev, Ieno (2012).