Last time we discussed the two formats of longitudinal data and visualised the individual growth trajectories using an imaginary randomised controlled trial data-set. But can we estimate the overall trajectory of the outcome over time and see whether it is increasing, decreasing, or stable? Yes, of course: we can estimate that with multilevel growth models (aka hierarchical linear models or mixed models).
0. Generate a longitudinal dataset and convert it into long format
Let’s start by building the long-format data-set, as we did in my previous post.
library(MASS)
dat.tx.a <- mvrnorm(n=250, mu=c(30, 20, 28),
                    Sigma=matrix(c(25.0, 17.5, 12.3,
                                   17.5, 25.0, 17.5,
                                   12.3, 17.5, 25.0), nrow=3, byrow=TRUE))
dat.tx.b <- mvrnorm(n=250, mu=c(30, 20, 22),
                    Sigma=matrix(c(25.0, 17.5, 12.3,
                                   17.5, 25.0, 17.5,
                                   12.3, 17.5, 25.0), nrow=3, byrow=TRUE))
dat <- data.frame(rbind(dat.tx.a, dat.tx.b))
names(dat) <- c('measure.1', 'measure.2', 'measure.3')
dat <- data.frame(subject.id=factor(1:500), tx=rep(c('A', 'B'), each=250), dat)
rm(dat.tx.a, dat.tx.b)
dat <- reshape(dat, varying=c('measure.1', 'measure.2', 'measure.3'),
               idvar='subject.id', direction='long')
1. Multilevel growth models
There are many R packages to help you do multilevel analysis, but I find
lme4 to be one of the best because of its simplicity and its ability to fit generalised models (e.g. for binary and count outcomes). A popular alternative is
nlme, which should give similar results for continuous outcomes (with a normal/Gaussian distribution). So let’s start by analysing the overall trend of the depression score.
# install.packages('lme4')
library(lme4)
m <- lmer(measure ~ time + (1 | subject.id), data=dat)
You’ll find the model-fitting statement very familiar if you have done linear regression before. Basically it’s just the
lm() formula with an additional random-effect part, i.e.
(1 | subject.id) in this case. Random effects are very useful because they help capture variance without sacrificing too many degrees of freedom (i.e. without estimating too many nuisance parameters). For example, if we instead used linear regression to capture the interpersonal differences, we’d need 499 (500 − 1) dummy variables for
subject.id, which is probably not what we want. After all, comparing the depression scores of participants A and B is not the research question we’re interested in.
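To see what that dummy-variable alternative would look like, here is an illustrative sketch (assuming the `dat` object built above; this model is shown only for comparison, not recommended):

```r
# Illustrative only: capturing interpersonal differences with fixed
# effects instead of a random intercept. Because subject.id is a factor
# with 500 levels, lm() expands it into 499 dummy variables.
m.fixed <- lm(measure ~ time + subject.id, data=dat)

# Intercept + time + 499 subject dummies = 501 estimated coefficients,
# versus just 2 fixed effects and 1 variance component in the lmer() model.
length(coef(m.fixed))
```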
There are two types of random effects: random intercepts and random slopes. In this particular example we only used a random intercept, meaning that the average depression score is allowed to vary by person, but the depression trajectory (the slope) is assumed to be homogeneous across people. If you have theoretical knowledge suggesting the depression trajectory may differ from person to person, you may want to try a random-slope model using the formula
measure ~ time + (time | subject.id).
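A sketch of the random-slope version (assuming `dat` and `m` from above; note that with only three time points the random-slope variance can be hard to estimate and may trigger convergence warnings):

```r
# Random intercept AND random slope for time: each participant gets
# their own baseline and their own trajectory.
m.slope <- lmer(measure ~ time + (time | subject.id), data=dat)

# Likelihood-ratio test: does allowing person-specific slopes improve
# the fit? (anova() refits both models with ML before comparing.)
anova(m, m.slope)
```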
A theoretical side note: one of the most problematic limitations of linear models is that they assume the error terms are independent of each other. This may hold for simple random sampling in cross-sectional data, but probably not for multilevel samples (e.g. data from a cluster random sampling scheme) or longitudinal data. Failing to capture the interpersonal differences results in an inflated Type I error rate; in other words, significant findings from the linear regression could simply be wrong. Therefore, it’s often good to try a multilevel regression if you suspect the observations are not independent.
Here are the results of the multilevel model, printed with
summary(m)

Linear mixed model fit by REML ['lmerMod']
Formula: measure ~ time + (1 | subject.id)
   Data: dat

REML criterion at convergence: 9639.6

Scaled residuals:
     Min       1Q   Median       3Q      Max
-2.45027 -0.70596  0.00832  0.65951  2.78759

Random effects:
 Groups     Name        Variance Std.Dev.
 subject.id (Intercept)  9.289   3.048
 Residual               28.860   5.372
Number of obs: 1500, groups:  subject.id, 500

Fixed effects:
            Estimate Std. Error t value
(Intercept)  29.8508     0.3915   76.25
time         -2.2420     0.1699  -13.20

Correlation of Fixed Effects:
     (Intr)
time -0.868
The REML criterion is the value of the objective function at convergence. Normally you don’t need to worry about it too much, because if there is a potential convergence problem in the estimation,
lmer() will give you warnings.
The Random effects section of the results describes the variance structure of the data. There are two sources of variance in this model: the residual (the usual one, as in linear models) and the interpersonal difference (i.e.
subject.id). One common way to quantify the strength of the interpersonal difference is the intraclass correlation coefficient (ICC). It can be computed directly from the multilevel model: 9.289 ÷ (9.289 + 28.860) = 0.243, which means 24.3% of the variation in depression scores can be explained by interpersonal differences.
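Rather than copying the variance components by hand, the ICC can be extracted from the fitted model itself; a small sketch, assuming the `m` object fitted above:

```r
# as.data.frame(VarCorr(m)) gives one row per variance component,
# with the variances in the `vcov` column and the group in `grp`.
vc <- as.data.frame(VarCorr(m))
v.subject  <- vc$vcov[vc$grp == 'subject.id']
v.residual <- vc$vcov[vc$grp == 'Residual']

# ICC = between-person variance / total variance
icc <- v.subject / (v.subject + v.residual)
icc
```

With the summary output shown above, this reproduces the hand calculation of about 0.243.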
Let’s move to the
Fixed effects section. Hmmm… where are the p-values? Well, although SAS and some other statistical packages do provide p-values for fixed effects in multilevel models, how to calculate them is not a point of consensus among statisticians. To put it simply, the degrees of freedom associated with these t-statistics are not well understood; without the degrees of freedom, we don’t know the exact distribution of the t-statistics and thus we can’t compute the p-values. SAS and other programs work around this with approximations that the developer of the
lme4 package is not comfortable with. As a result, the
lme4 package intentionally does not report p-values in its results.
Acknowledging all these limitations, we can in fact get approximate p-values with another package,
lmerTest, which builds on top of lme4.
1.1. Multilevel growth models with approximate p-values
The code is exactly the same, except we’re now loading the lmerTest package:
# install.packages('lmerTest')
# Please note the explanation and limitations:
# https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html
library(lmerTest)
m <- lmer(measure ~ time + (1 | subject.id), data=dat)
The results are very similar too, but now we get approximate degrees of freedom and p-values. So we can be confident in saying that the average participant in the RCT has a decreasing depression score over time, by about 2.24 points per time point.
summary(m)

Linear mixed model fit by REML
t-tests use Satterthwaite approximations to degrees of freedom [merModLmerTest]
Formula: measure ~ time + (1 | subject.id)
   Data: dat

REML criterion at convergence: 9639.6

Scaled residuals:
     Min       1Q   Median       3Q      Max
-2.45027 -0.70596  0.00832  0.65951  2.78759

Random effects:
 Groups     Name        Variance Std.Dev.
 subject.id (Intercept)  9.289   3.048
 Residual               28.860   5.372
Number of obs: 1500, groups:  subject.id, 500

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)
(Intercept)   29.8508     0.3915 1449.4000   76.25   <2e-16 ***
time          -2.2420     0.1699  999.0000  -13.20   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
     (Intr)
time -0.868
1.2. Calculating 95% CI and PI
Sometimes we want to plot the average values on top of the individual trajectories. To show the uncertainty associated with the averages, we’d need to use the fitted model to calculate the fitted values, the 95% confidence intervals (CIs), and the 95% prediction intervals (PIs).
# See for details: http://glmm.wikidot.com/faq
dat.new <- data.frame(time=1:3)
dat.new$measure <- predict(m, dat.new, re.form=NA)
m.mat <- model.matrix(terms(m), dat.new)
# [1] extracts the random-intercept variance as a scalar
dat.new$var <- diag(m.mat %*% vcov(m) %*% t(m.mat)) +
               VarCorr(m)$subject.id[1]
dat.new$pvar <- dat.new$var + sigma(m)^2
dat.new$ci.lb <- with(dat.new, measure - 1.96*sqrt(var))
dat.new$ci.ub <- with(dat.new, measure + 1.96*sqrt(var))
dat.new$pi.lb <- with(dat.new, measure - 1.96*sqrt(pvar))
dat.new$pi.ub <- with(dat.new, measure + 1.96*sqrt(pvar))
The first statement specifies the time points at which we want the average values, which are simply times 1, 2, and 3 in our case. The second uses the
predict() function to get the average values from the model, ignoring the conditional random effects (
re.form=NA). The next two statements calculate the variances of the average values: basically the matrix cross-product plus the variance of the random intercept. The pvar statement then calculates the variance of a single new observation, which is the variance of the average value plus the residual variance. The last four statements are just the standard calculation of 95% CIs and PIs assuming a normal distribution. And here is the data-set for plotting the figure:
dat.new

  time  measure      var     pvar    ci.lb    ci.ub     pi.lb    pi.ub
1    1 27.72421 10.85669 43.04054 21.26611 34.18231 14.865574 40.58285
2    2 25.22342 10.82451 43.00835 18.77490 31.67194 12.369592 38.07725
3    3 22.72263 10.85669 43.04054 16.26453 29.18073  9.863993 35.58127
2. Plot the average values
Finally, let’s plot the averages with their 95% CIs and PIs. Notice that the PIs are much wider than the CIs: we are much more confident in predicting the average than in predicting a single observation.
# install.packages('ggplot2')
library(ggplot2)
ggplot(data=dat.new, aes(x=time, y=measure)) +
  geom_line(data=dat, alpha=.02, aes(group=subject.id)) +
  geom_errorbar(width=.02, colour='red',
                aes(x=time-.02, ymax=ci.ub, ymin=ci.lb)) +
  geom_line(colour='red', linetype='dashed', aes(x=time-.02)) +
  geom_point(size=3.5, colour='red', fill='white', aes(x=time-.02)) +
  geom_errorbar(width=.02, colour='blue',
                aes(x=time+.02, ymax=pi.ub, ymin=pi.lb)) +
  geom_line(colour='blue', linetype='dashed', aes(x=time+.02)) +
  geom_point(size=3.5, colour='blue', fill='white', aes(x=time+.02)) +
  theme_bw()
If you’re as data-sensitive as me, you have probably noticed that the fitted values don’t quite fit the actual data. This happens because the model is not well specified. There are at least two ways to specify the model better; do you know what they are? We’ll talk about them in upcoming posts. Stay tuned.