Analysing longitudinal data: Multilevel growth models (I)

Posted on August 22, 2015 by Frederick Ho in R bloggers | 0 Comments

[This article was first published on DataScience+, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Last time we have discussed the two formats of longitudinal data and visualised the individual growth trajectories using an imaginary randomised controlled trial data-set. But could we estimate the overall trajectory of the outcomes along time and see if it’s increasing, decreasing, or stable? Yes, of course, we could estimate that in multilevel growth models (aka hierarchical models or mixed models).

0. Generate a longitudinal dataset and convert it into long format

Let’s start by getting the long data-set as we have done in my previous post.

library(MASS)
dat.tx.a <- mvrnorm(n=250, mu=c(30, 20, 28), 
                    Sigma=matrix(c(25.0, 17.5, 12.3, 
                                   17.5, 25.0, 17.5, 
                                   12.3, 17.5, 25.0), nrow=3, byrow=TRUE))
dat.tx.b <- mvrnorm(n=250, mu=c(30, 20, 22), 
                    Sigma=matrix(c(25.0, 17.5, 12.3, 
                                   17.5, 25.0, 17.5, 
                                   12.3, 17.5, 25.0), nrow=3, byrow=TRUE))
dat <- data.frame(rbind(dat.tx.a, dat.tx.b))
names(dat) = c('measure.1', 'measure.2', 'measure.3')
dat <- data.frame(subject.id=factor(1:500), tx=rep(c('A', 'B'), each=250), dat)
rm(dat.tx.a, dat.tx.b)
dat <- reshape(dat, varying=c('measure.1', 'measure.2', 'measure.3'), 
               idvar='subject.id', direction='long')

1. Multilevel growth models

There’re many R packages to help your to do multilevel analysis but I found lme4 to be one of the best because of its simplicity and ability to fit generalised models (e.g. for binary and count outcomes). A popular alternative would be nlme, which should provide similar results for continuous outcomes (with normal/Gaussian distribution). So let’s start analysing the overall trend of the depression score.

# install.packages('lme4')
library(lme4)
m <- lmer(measure ~ time + (1 | subject.id), data=dat)

You’d be very familiar with the model fitting statement if you have done a linear regression before. Basically it’s just the lm() function with the additional random effect part in the formula, i.e. (1 | subject.id) in this case. Random effects are very useful because it helps to capture the variance without sacrificing too many degrees of freedom (i.e. estimating too many nuisance variables). For example, if we used a linear regression instead to capture the interpersonal difference, we’d need to make 499 (500-1) dummy variables for subject.id, which is probably not what we want. After all, comparing the depression scores of participant A and B is not the research question that we’re interested in.
There’re two types of random effects: random intercepts and random slopes. In this particular example, we only used the random intercept, meaning that the average depression score over time is varied by person but the depression trajectory (the slope) is assumed to be heterogeneous. If you have theoretical knowledge that suggest the depression trajectory may be different from person to person, you may want to try out the random slope model using the formula measure ~ (time | subject.id).

A theoretical side note: One of the most problematic limitations in linear models is it assumes the error terms to be independent of each other. This could be true for simple random sampling in cross-sectional data but probably not in multilevel samples (e.g. data from cluster random sampling scheme) and longitudinal data. The failure in capturing the interpersonal differences would result in inflated Type I error. In other word, every significant findings you get in the linear regression could be just wrong. Therefore, it’s often good to try multilevel regressions if you suspect the data-set is not independently distributed.

Here is the results of the multilevel model using summary() function.

summary(m)
Linear mixed model fit by REML ['lmerMod']
Formula: measure ~ time + (1 | subject.id)
   Data: dat

REML criterion at convergence: 9639.6

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.45027 -0.70596  0.00832  0.65951  2.78759 

Random effects:
 Groups     Name        Variance Std.Dev.
 subject.id (Intercept)  9.289   3.048   
 Residual               28.860   5.372   
Number of obs: 1500, groups:  subject.id, 500

Fixed effects:
            Estimate Std. Error t value
(Intercept)  29.8508     0.3915   76.25
time         -2.2420     0.1699  -13.20

Correlation of Fixed Effects:
     (Intr)
time -0.868

The REML criterion is an indicator for estimation convergence. Normally speaking you don’t need to be too worried about this because if there’s potential convergence problem in estimation, the lmer() will give you some warnings.

The Random effects section of the results state the variance structure of the data. There are two sources of variance in this model: the residual (the usual one as in linear models) and the interpersonal difference (i.e. subject.id). One common way to quantify the strength of interpersonal difference is intraclass correlation coefficient (ICC). It is possible to compute ICC from the multilevel model and it is just 9.289 ÷ (9.289 + 28.860) = 0.243, which means 24.3% of the variation in depression score could be explained by interpersonal difference.

Let’s move to the Fixed effects section. Hmmm… Where are the p-values? Well, although SAS and other statistical software do provide p-values for fixed effects in multilevel models, their calculations are not a consensus among statisticians. To put it simple, the degrees of freedom associated with these t-statistics are not well understood and without the degrees of freedom, we don’t know the exact distribution of the t-statistics and thus we don’t know the p-values. SAS and other programs have a workaround using approximation, which the developer of the lme4 package doesn’t feel very comfortable with. As a result, the lmer package intentionally not reporting the p-values in the results.

Acknowledging all the limitations, we could in fact get approximated p-values with another package lmerTest which builds on top of lme4.

1.1. Multilevel growth models with approximate p-values

The codes are exactly the same except we’re now using the lmerTest package.

# install.packages('lme4')
# Please note the explanation and limitations: 
# https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html
library(lmerTest)
m <- lmer(measure ~ time + (1 | subject.id), data=dat)

The results are very similar too, but now we got the approximate degrees of freedom and p-values. So we’re now confident to say that the average participant of the RCT are now having decreasing depression score over time. The decrement is about 2.24 points for one time point.

summary(m)
Linear mixed model fit by REML t-tests 
use Satterthwaite approximations to 
degrees of freedom [merModLmerTest]
Formula: measure ~ time + (1 | subject.id)
   Data: dat

REML criterion at convergence: 9639.6

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-2.45027 -0.70596  0.00832  0.65951  2.78759 

Random effects:
 Groups     Name        Variance Std.Dev.
 subject.id (Intercept)  9.289   3.048   
 Residual               28.860   5.372   
Number of obs: 1500, groups:  subject.id, 500

Fixed effects:
             Estimate Std. Error        df t value Pr(>|t|)    
(Intercept)   29.8508     0.3915 1449.4000   76.25   <2e-16 ***
time          -2.2420     0.1699  999.0000  -13.20   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Correlation of Fixed Effects:
     (Intr)
time -0.868

1.2. Calculating 95% CI and PI

Sometimes we want to plot the average value on top of the individual trajectories. To show the uncertainty associated with the averages, we’d need to use the fitted model to calculate the fitted values, the 95% confidence intervals (CI), the 95% prediction intervals (PI).

# See for details: http://glmm.wikidot.com/faq
dat.new <- data.frame(time=1:3)
dat.new$measure <- predict(m, dat.new, re.form=NA)
m.mat <- model.matrix(terms(m), dat.new)
dat.new$var <- diag(m.mat %*% vcov(m) %*% t(m.mat)) + VarCorr(m)$subject.id[1]
dat.new$pvar <- dat.new$var + sigma(m)^2
dat.new$ci.lb <- with(dat.new, measure - 1.96*sqrt(var))
dat.new$ci.ub <- with(dat.new, measure + 1.96*sqrt(var))
dat.new$pi.lb <- with(dat.new, measure - 1.96*sqrt(pvar))
dat.new$pi.ub <- with(dat.new, measure + 1.96*sqrt(pvar))

The first line of the code specify the time points that we want the average values, which are simply time 1, 2, and 3 in our case. The second line use predict() function to get the average values from the model ignoring the conditional random effects (re.form=NA). Lines 3 and 4 calculate the variances of the average values. It’s basically the matrix cross-products plus the variance of the random intercept. Line 5 calculates the variances of a single observation, which is the variance of the average values plus the residual variance. Lines 5 to 8 are just the standard calculation of 95% CIs and PIs assuming normal distribution. And here is the data-set for plotting the figure:

dat.new
time  measure      var     pvar    ci.lb    ci.ub     pi.lb    pi.ub
   1 27.72421 10.85669 43.04054 21.26611 34.18231 14.865574 40.58285
   2 25.22342 10.82451 43.00835 18.77490 31.67194 12.369592 38.07725
   3 22.72263 10.85669 43.04054 16.26453 29.18073  9.863993 35.58127

2. Plot the average values

Finally, let’s plot the averages with 95% CIs and PIs. Notice that the PIs are much wider than the CIs. That means we’re much more confident in predicting the average than a single value.

ggplot(data=dat.new, aes(x=time, y=measure)) + 
  geom_line(data=dat, alpha=.02, aes(group=subject.id)) + 
  geom_errorbar(width=.02, colour='red', 
                aes(x=time-.02, ymax=ci.ub, ymin=ci.lb)) +
  geom_line(colour='red', linetype='dashed', aes(x=time-.02)) + 
  geom_point(size=3.5, colour='red', fill='white', aes(x=time-.02)) +   
  geom_errorbar(width=.02, colour='blue', 
                aes(x=time+.02, ymax=pi.ub, ymin=pi.lb)) +
  geom_line(colour='blue', linetype='dashed', aes(x=time+.02)) + 
  geom_point(size=3.5, colour='blue', fill='white', aes(x=time+.02)) + 
  theme_bw()

Here the plot:

If you’re as data-sensitive as me, you should have probably noticed the fitted values aren’t quite fit to the actual data. This happens because the model is not specified well. There’re at least two ways to specify the models better, do you know what are they? We’ll talk about it in the next posts. Stay tuned.