Mixed models exercise 2. Repeated measurements

September 1, 2013
By

(This article was first published on Wiekvoet, and kindly contributed to R-bloggers)

Continuing my exploration of mixed models, I now understand what is happening in the second SAS(R)/STAT example for proc mixed (page 5007 of the SAS/STAT 12.3 Manual). It is all about correlation between the time-points within subjects. The data as such is simple, size measurements of children at ages 8, 10, 12 and 14. The subject of analysis is growth. There is correlation between the data at different ages. Then there are the usual subjects; Random subjects in intercept, fixed effects for gender and age effect and the interaction age*gender.
To summarize the results. With lme I get the fixed structure of the first two models, but the R structure has too high values. With MCMCglm I only was able to create the first model. Here the R structure was quite different, but equivalent to yet another lme model. For interpretation purposes, the fixed model parts are the same. The difference between the two lme models escapes me for now.  

Data

Data from SAS/STAT guide. To have the base levels as the SAS analysis there is a relevel for Gender, other than that it is straightforward reading data and a reshape().
r1 <- read.table(textConnection('
1 F 21.0 20.0 21.5 23.0
2 F 21.0 21.5 24.0 25.5
3 F 20.5 24.0 24.5 26.0
4 F 23.5 24.5 25.0 26.5
5 F 21.5 23.0 22.5 23.5
6 F 20.0 21.0 21.0 22.5
7 F 21.5 22.5 23.0 25.0
8 F 23.0 23.0 23.5 24.0
9 F 20.0 21.0 22.0 21.5
10 F 16.5 19.0 19.0 19.5
11 F 24.5 25.0 28.0 28.0
12 M 26.0 25.0 29.0 31.0
13 M 21.5 22.5 23.0 26.5
14 M 23.0 22.5 24.0 27.5
15 M 25.5 27.5 26.5 27.0
16 M 20.0 23.5 22.5 26.0
17 M 24.5 25.5 27.0 28.5
18 M 22.0 22.0 24.5 26.5
19 M 24.0 21.5 24.5 25.5
20 M 23.0 20.5 31.0 26.0
21 M 27.5 28.0 31.0 31.5
22 M 23.0 23.0 23.5 25.0
23 M 21.5 23.5 24.0 28.0
24 M 17.0 24.5 26.0 29.5
25 M 22.5 25.5 25.5 26.0
26 M 23.0 24.5 26.0 30.0
27 M 22.0 21.5 23.5 25.0
'),col.names=c('Person','Gender','Age8','Age10','Age12','Age14'),
colClasses=c('factor','factor',rep('numeric',4)))
rm <- reshape(r1,direction='long',
    varying=list(c('Age8','Age10','Age12','Age14')),
    timevar='Age',idvar=c('Person','Gender'),
    v.names='y',
    times=c(8,10,12,14))
rm$Gender <- relevel(rm$Gender,ref='M')
rm$fage=factor(rm$Age)

nlme

Ignoring correlation, the analysis is simple. However, this is not the analysis which is presented in the STAT Users Guide.
lNull <- lme(y ~  Age * Gender,data=rm, random= ~ 1 | Person,
    method='REML')
summary(lNull)
Linear mixed-effects model fit by REML
 Data: rm 
       AIC      BIC    logLik
  445.7572 461.6236 -216.8786

Random effects:
 Formula: ~1 | Person
        (Intercept) Residual
StdDev:    1.816214 1.386382

Fixed effects: y ~ Age * Gender 
                Value Std.Error DF   t-value p-value
(Intercept) 16.340625 0.9813122 79 16.651810  0.0000
Age          0.784375 0.0775011 79 10.120823  0.0000
GenderF      1.032102 1.5374208 25  0.671321  0.5082
Age:GenderF -0.304830 0.1214209 79 -2.510520  0.0141
 Correlation: 
            (Intr) Age    GendrF
Age         -0.869              
GenderF     -0.638  0.555       
Age:GenderF  0.555 -0.638 -0.869

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-3.59804400 -0.45461690  0.01578365  0.50244658  3.68620792 

Number of Observations: 108
Number of Groups: 27 

First Model

The first model in the guide should be general symmetric in R structure. While I first modeled this in the correlation term (see below), I ended up building this in the random term. The current model has fixed effects exactly like PROC MIXED, associated test very close, but the R matrix is twice as large.
lSymm <- lme(y ~  Age * Gender,
    data=rm, random= list(Person =pdSymm(~ fage-1)),method='ML')
summary(lSymm)
Linear mixed-effects model fit by maximum likelihood
 Data: rm 
      AIC     BIC    logLik
  449.477 489.709 -209.7385

Random effects:
 Formula: ~fage - 1 | Person
 Structure: General positive-definite
         StdDev    Corr               
fage8    2.1650807 fage8 fage10 fage12
fage10   1.8698474 0.603              
fage12   2.3554563 0.708 0.617        
fage14   2.0460614 0.569 0.800  0.793 
Residual 0.6569738                    

Fixed effects: y ~ Age * Gender 
                Value Std.Error DF   t-value p-value
(Intercept) 15.842297 0.9534264 79 16.616171  0.0000
Age          0.826803 0.0806212 79 10.255400  0.0000
GenderF      1.583072 1.4937321 25  1.059810  0.2994
Age:GenderF -0.350438 0.1263091 79 -2.774446  0.0069
 Correlation: 
            (Intr) Age    GendrF
Age         -0.875              
GenderF     -0.638  0.559       
Age:GenderF  0.559 -0.638 -0.875

Standardized Within-Group Residuals:
        Min          Q1         Med          Q3         Max 
-1.34109378 -0.30114043  0.03182357  0.26360983  1.69966216 

Number of Observations: 108
Number of Groups: 27 
as.matrix(lSymm$modelStruct$reStruct)
$Person
           fage8   fage10    fage12   fage14
fage8  10.860557 5.655273  8.365118 5.843739
fage10  5.655273 8.100582  6.296156 7.095082
fage12  8.365118 6.296156 12.854465 8.858492
fage14  5.843739 7.095082  8.858492 9.699319

Second model

The second model is compound symmetry. Again, the statements are easy, but the R values twice as large. Fixed values match the PROC MIXED example output.
lCSymm <- lme(y ~  Age * Gender,
    data=rm, random= list(Person =pdCompSymm(~ fage-1)),method='ML')
summary(lCSymm)
Linear mixed-effects model fit by maximum likelihood
 Data: rm 
       AIC     BIC    logLik
  442.6391 461.414 -214.3195

Random effects:
 Formula: ~fage - 1 | Person
 Structure: Compound Symmetry
         StdDev    Corr             
fage8    2.1024306                  
fage10   2.1024306 0.686            
fage12   2.1024306 0.686 0.686      
fage14   2.1024306 0.686 0.686 0.686
Residual 0.6963793                  

Fixed effects: y ~ Age * Gender 
                Value Std.Error DF   t-value p-value
(Intercept) 16.340625 0.9814310 79 16.649795  0.0000
Age          0.784375 0.0779963 79 10.056564  0.0000
GenderF      1.032102 1.5376069 25  0.671239  0.5082
Age:GenderF -0.304830 0.1221968 79 -2.494580  0.0147
 Correlation: 
            (Intr) Age    GendrF
Age         -0.874              
GenderF     -0.638  0.558       
Age:GenderF  0.558 -0.638 -0.874

Standardized Within-Group Residuals:
         Min           Q1          Med           Q3          Max 
-1.854861527 -0.235701022  0.007918635  0.265357548  1.898850367 

Number of Observations: 108
Number of Groups: 27 
as.matrix(lCSymm$modelStruct$reStruct)
$Person
          fage8   fage10   fage12   fage14
fage8  9.114895 6.249301 6.249301 6.249301
fage10 6.249301 9.114895 6.249301 6.249301
fage12 6.249301 6.249301 9.114895 6.249301
fage14 6.249301 6.249301 6.249301 9.114895

MCMCglmm

I ran into the limits of MCMCglmm pretty quickly. However, the first model was possible after defining a prior. (As a note, a slightly more verbose output while setting the prior would have made life a bit more easy). The R structure is quite far from PROC MIXED. 
library(MCMCglmm)
rm$fage=factor(rm$Age)
prior1 <- list(R=list(V=diag(4),nu=.01),
    G=list(G1=list(V=diag(1),nu=.01) 

m1 <- MCMCglmm(y ~ Age* Gender , 
    random= ~ Person ,
    rcov=~ us(fage) :Person,
    data=rm,family='gaussian',
    nitt=500000,thin=20,burnin=50000,
    verbose=FALSE
 ,   prior=prior1
)
summary(m1)
 Iterations = 50001:499981
 Thinning interval  = 20
 Sample size  = 22500 

 DIC: 123.8128 

 G-structure:  ~Person

       post.mean l-95% CI u-95% CI eff.samp
Person     4.209    2.136    6.963     5615

 R-structure:  ~us(fage):Person

             post.mean l-95% CI u-95% CI eff.samp
8:8.Person      2.6261  0.45693   5.5948    562.4
10:8.Person    -0.5219 -1.89671   1.0416    659.6
12:8.Person     0.3808 -1.55471   2.6806    517.4
14:8.Person    -0.7821 -2.32815   0.8618    862.4
8:10.Person    -0.5219 -1.89671   1.0416    659.6
10:10.Person    1.4867  0.06493   3.4580    493.6
12:10.Person   -0.4788 -1.89693   1.1443    756.1
14:10.Person    0.1536 -1.06686   1.7433    515.8
8:12.Person     0.3808 -1.55471   2.6806    517.4
10:12.Person   -0.4788 -1.89693   1.1443    756.1
12:12.Person    3.0773  0.55387   6.2826    546.7
14:12.Person    0.6970 -1.03654   2.9108    505.7
8:14.Person    -0.7821 -2.32815   0.8618    862.4
10:14.Person    0.1536 -1.06686   1.7433    515.8
12:14.Person    0.6970 -1.03654   2.9108    505.7
14:14.Person    1.9824  0.21537   4.3931    529.9

 Location effects: y ~ Age * Gender 

            post.mean l-95% CI u-95% CI eff.samp  pMCMC    
(Intercept)  15.85825 13.71176 18.11169    18083 <4e-05 ***
Age           0.82697  0.64584  1.00791    22500 <4e-05 ***
GenderF       1.55014 -1.84468  4.93357    22500 0.3507    
Age:GenderF  -0.35005 -0.62738 -0.06956    22500 0.0163 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

lme comparison

It is interesting to notice the R correlation is close to slightly different formulation of lme, where not the random structure, but rather the correlation is set. The fixed effects seem the same between model formulations.
lSymm <- lme(y ~  Age * Gender,corr=corSymm(form= ~ I(Age/2-3)),
    data=rm, random= ~ 1 | Person,method='ML')
summary(lSymm)

Linear mixed-effects model fit by maximum likelihood
 Data: rm 
       AIC      BIC    logLik
  445.2348 477.4204 -210.6174

Random effects:
 Formula: ~1 | Person
        (Intercept) Residual
StdDev:    1.753195 1.356352

Correlation Structure: General
 Formula: ~I(Age/2 - 3) | Person 
 Parameter estimate(s):
 Correlation: 
  1      2      3     
2 -0.203              
3  0.006 -0.192       
4 -0.313  0.322  0.214
Fixed effects: y ~ Age * Gender 
                Value Std.Error DF   t-value p-value
(Intercept) 15.920019 0.9766449 79 16.300724  0.0000
Age          0.824758 0.0810365 79 10.177603  0.0000
GenderF      1.488148 1.5301085 25  0.972577  0.3401
Age:GenderF -0.348628 0.1269599 79 -2.745971  0.0075
 Correlation: 
            (Intr) Age    GendrF
Age         -0.874              
GenderF     -0.638  0.558       
Age:GenderF  0.558 -0.638 -0.874

Standardized Within-Group Residuals:
          Min            Q1           Med            Q3           Max 
-3.2236086696 -0.5564845586  0.0009453569  0.5124328827  3.7675873271 

Number of Observations: 108
Number of Groups: 27 
as.matrix(lSymm$modelStruct$corStruct)[[2]].
             [,1]       [,2]         [,3]       [,4]
[1,]  1.000000000 -0.2027774  0.005515599 -0.3128891
[2,] -0.202777439  1.0000000 -0.192400735  0.3222584
[3,]  0.005515599 -0.1924007  1.000000000  0.2142248
[4,] -0.312889054  0.3222584  0.214224798  1.0000000
cov2cor(matrix(colMeans(m1$VCV[,-1]),nrow=4))
           [,1]        [,2]       [,3]        [,4]
[1,]  1.0000000 -0.26413228  0.1339446 -0.34279042
[2,] -0.2641323  1.00000000 -0.2238354  0.08945827
[3,]  0.1339446 -0.22383536  1.0000000  0.28219796
[4,] -0.3427904  0.08945827  0.2821980  1.00000000

To leave a comment for the author, please follow the link and comment on his blog: Wiekvoet.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.