**Minding the Brain**, and kindly contributed to R-bloggers)

*p*-values is one of the most commonly asked questions about multilevel regression. The key issue is that the degrees of freedom are not trivial to compute for multilevel regression. Various detailed discussions can be found on the R-wiki and R-help mailing list post by Doug Bates. I have experimented with three methods that I think are reasonable.

1. Use the normal approximation. Since the

*t*distribution converges to the

*z*distribution as degrees of freedom increase, this is like assuming infinite degrees of freedom. This is unambiguously anti-conservative, but for reasonable sample sizes, it appears not to be very anti-conservative (Barr et al., 2013). That is, if we take the

*p*-value to measure the probability of a false positive, this approximation produces a somewhat (but perhaps not alarmingly) higher false positive rate than the nominal 5% at

*p*= 0.05.

Here is an example using proportions of semantic errors in picture naming by different aphasia subtypes (the data file can be found here):

load("Examples.RData")

require(lme4)

# fit the model

m.sem <- lmer(Semantic.error ~ TestTime * Diagnosis + (TestTime | SubjectID),

data = NamingRecovery, REML = FALSE)

# extract coefficients

coefs <- data.frame(coef(summary(m.sem)))

# use normal distribution to approximate p-value

coefs$p.z <- 2 * (1 - pnorm(abs(coefs$t.value)))

coefs

## Estimate Std..Error t.value p.z

## (Intercept) 0.045767 0.007776 5.8855 3.970e-09

## TestTime -0.008685 0.003524 -2.4644 1.372e-02

## DiagnosisConduction -0.015149 0.010039 -1.5090 1.313e-01

## DiagnosisWernicke -0.004899 0.010287 -0.4762 6.339e-01

## TestTime:DiagnosisConduction 0.007308 0.004550 1.6063 1.082e-01

## TestTime:DiagnosisWernicke 0.012854 0.004662 2.7571 5.832e-03

2. Use the Satterthwaite approximation, which is implemented in the lmerTest package. According to the documentation, this is based on SAS proc mixed theory. The lmerTest package overloads the lmer function, so you can just re-fit the model using exactly the same code, but the summary() will now include approximate degrees of freedom and *p*-values. This implementation is extremely easy to use, but can be a little maddening if you forget whether your model is a an object of type lmerMod or merModLmerTest.

require(lmerTest)

# re-fit model

m.semTest <- lmer(Semantic.error ~ TestTime * Diagnosis + (TestTime | SubjectID),

data = NamingRecovery, REML = FALSE)

# get Satterthwaite-approximated degrees of freedom

coefs$df.Satt <- coef(summary(m.semTest))[, 3]

# get approximate p-values

coefs$p.Satt <- coef(summary(m.semTest))[, 5]

coefs

## Estimate Std..Error t.value p.z

## (Intercept) 0.045767 0.007776 5.8855 3.970e-09

## TestTime -0.008685 0.003524 -2.4644 1.372e-02

## DiagnosisConduction -0.015149 0.010039 -1.5090 1.313e-01

## DiagnosisWernicke -0.004899 0.010287 -0.4762 6.339e-01

## TestTime:DiagnosisConduction 0.007308 0.004550 1.6063 1.082e-01

## TestTime:DiagnosisWernicke 0.012854 0.004662 2.7571 5.832e-03

## df.Satt p.Satt

## (Intercept) 23.00 5.344e-06

## TestTime 22.99 2.163e-02

## DiagnosisConduction 23.00 1.449e-01

## DiagnosisWernicke 23.00 6.384e-01

## TestTime:DiagnosisConduction 22.99 1.219e-01

## TestTime:DiagnosisWernicke 22.99 1.122e-02

3. Use the Kenward-Roger approximation to get approximate degrees of freedom and the *t*-distribution to get *p*-values, which is implemented in the pbkrtest package.

require(pbkrtest)

# get the KR-approximated degrees of freedom

df.KR <- get_ddf_Lb(m.sem, fixef(m.sem))

# get p-values from the t-distribution using the t-values and approximated

# degrees of freedom

coefs$p.KR <- 2 * (1 - pt(abs(coefs$t.value), df.KR))

coefs

## Estimate Std..Error t.value p.z

## (Intercept) 0.045767 0.007776 5.8855 3.970e-09

## TestTime -0.008685 0.003524 -2.4644 1.372e-02

## DiagnosisConduction -0.015149 0.010039 -1.5090 1.313e-01

## DiagnosisWernicke -0.004899 0.010287 -0.4762 6.339e-01

## TestTime:DiagnosisConduction 0.007308 0.004550 1.6063 1.082e-01

## TestTime:DiagnosisWernicke 0.012854 0.004662 2.7571 5.832e-03

## df.Satt p.Satt p.KR

## (Intercept) 23.00 5.344e-06 9.049e-06

## TestTime 22.99 2.163e-02 2.283e-02

## DiagnosisConduction 23.00 1.449e-01 1.468e-01

## DiagnosisWernicke 23.00 6.384e-01 6.390e-01

## TestTime:DiagnosisConduction 22.99 1.219e-01 1.238e-01

## TestTime:DiagnosisWernicke 22.99 1.122e-02 1.210e-02

Not surprisingly, the Satterthwaite and Kenward-Roger approximations produce slightly more conservative *p*-values than the normal approximation does. However, even in this not-very-large data set (24 participants, 6-9 in each of 3 groups, 5 observations for each participant) the normal approximation is only slightly less conservative than these two options. The approximate degrees of freedom are slightly higher when using the Satterthwaite approximation (23) than the Kenward-Roger approximation (20.1), which makes the latter a bit more conservative. Of course, these are not systematic tests, but it is fairly representative of a few models that I’ve compared.

**leave a comment**for the author, please follow the link and comment on his blog:

**Minding the Brain**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...