[This article was first published on Analysis on StatsNotebook - Simple. Powerful. Reproducible., and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The tutorial is based on R and StatsNotebook, a graphical interface for R.

Data with multilevel (hierarchical) structure are common in many area of research. In our tutorial on moderation analysis, we examine the impact of increasing alcohol tax in Queensland on alcohol consumption. We compared changes in alcohol consumption for individuals from Queensland before and after the implementation of tax increase, compared to changes in alcohol consumption in other states. The data were collected from individuals from different postcodes in different states. Therefore, individual observations are nested within postcodes. Since it is likely that individuals from the same postcodes are more similar compared to individuals from different postcodes (because of similar local drinking culture, similar level of exposure to local advertisement, etc), the assumption of independent observation is likely to be violated. Linear mixed model (also known as multilevel model and random effect model) can be used to account for the dependencies in the data.

Other examples of data with nested structure include

1. Longitudinal data – Multiple measurements taken from the same individual are not independent
2. School data – Students from the same school shared the same learning environment and school culture.
3. Treatment data – Patients of the same doctor might have more similar outcome compared to patients of different doctors.

In this tutorial, we will look at two examples

1. Sleep deprivation and reaction time
• A study based on repeated measures of reaction time over 10 days of sleep deprivation (Belenky et al. 2003). Observations are nested within individuals, thus are not independent.
• We will first run a random intercept model that accounts for the repeated measures design (dependencies in the data), and then run a random slope model that allows the effect of sleep deprivation to vary across participants.
2. Alcohol tax and alcohol consumption
• revisit our alcohol tax example in the moderation analysis tutorial. Individuals are nested within postcodes, thus the observations are not independent.

Belenky et al (2003) studied the relationship between sleep deprivation and reaction time. We have included their data in StatsNotebook as the Sleep dataset. See load example data for instruction on loading the data into StatsNotebook.

There are three variables in this dataset.

1. Reaction – Reaction time (ms)
2. Days – Number of days of sleep deprivation. On day 0, the subjects had their normal amount of sleep. Starting that night they were restricted to 3 hours of sleep per night.
3. Subject – Participant ID.

Does the number of days of sleep deprivation affect reaction time?

It is tempting to address this research question by a simple linear regression (or a correlation analysis). However, linear regression is not appropriate because the data is not independent – Reaction time measured from the same participant are likely to be more similar to reaction time measures across participants.

In this analysis, we will run two separate models – a random intercept model and a random slope model. Prior to running these models, it is always good practice to visualise the data and conduct descriptive analysis.

We can account for the dependencies in the data by including a random intercept for each individual. To run this model,

1. Click Analysis at the top
2. Click Regression and select Linear Regression (Numeric outcome) from the menu
3. In the left panel, select Reaction into Outcome, Days into Covariates and Subject into Random effect.
4. Click Code and Run, the following R code will be generated in a block on the right panel, with output from R below the codes.
library(lme4)
library(lmerTest)

res <- lmer(Reaction ~ Days + (1 | Subject),
data = currentDataset)
summary(res)
confint(res, level = 0.95, method = "Wald")

res.std <- resid(res)/sd(resid(res))
plot(res.std, ylab="Standardized Residuals")

"Outlier Test. Observations with a Bonferroni p < .05 might be considered as outliers and might need further investigation."
car::outlierTest(res)
car::infIndexPlot(res)
ggplot(as.data.frame(res.std), aes(sample = res.std)) +
geom_qq() +
geom_qq_line()

"Chan, G. and StatsNotebook Team (2020). StatsNotebook. (Version 0.1.0) [Computer Software]. Retrieved from https://www.statsnotebook.io"
"R Core Team (2020). The R Project for Statistical Computing. [Computer software]. Retrieved from https://r-project.org"



The following are from the top section of the generated codes.

library(lme4)
library(lmerTest)

res <- lmer(Reaction ~ Days + (1 | Subject),
data = currentDataset)
summary(res)
confint(res, level = 0.95, method = "Wald")


These codes tell R to run a linear mixed model using the lmer from the lme4 library. The left side of the “~” symbol specifies the dependent variable; the right side specifies days as the independent variable. The code (1 | Subject) specifies a random intercept for each participant. The summary function is used to print out the results from the random intercept model and the confint function is used to print the model coefficients with the corresponding 95% confidence intervals.

The following is the model summary from the above codes.

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Reaction ~ Days + (1 | Subject)
Data: currentDataset

REML criterion at convergence: 1786.5

Scaled residuals:
Min      1Q  Median      3Q     Max
-3.2257 -0.5529  0.0109  0.5188  4.2506

Random effects:
Groups   Name        Variance Std.Dev.
Subject  (Intercept) 1378.2   37.12
Residual              960.5   30.99
Number of obs: 180, groups:  Subject, 18

Fixed effects:
Estimate Std. Error       df t value Pr(>|t|)
(Intercept) 251.4051     9.7467  22.8102   25.79   <2e-16 ***
Days         10.4673     0.8042 161.0000   13.02   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
(Intr)
Days -0.371

######################################################
2.5 %    97.5 %
.sig01              NA        NA
.sigma              NA        NA
(Intercept) 232.301892 270.50832
Days          8.891041  12.04353


From the fixed effects section of the model summary, we can conclude that there is strong evidence that number of days of sleep deprivation significantly increased reaction time under a significance level of 0.05. On average, for each additional day of sleep deprivation, the reaction time increased by 10.47ms (b = 10.47, SE = 0.80, p < .001).

Fixed effects:
Estimate Std. Error       df t value Pr(>|t|)
(Intercept) 251.4051     9.7467  22.8102   25.79   <2e-16 ***
Days         10.4673     0.8042 161.0000   13.02   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1



We are 95% confident that the average increase was between 8.89 and 12.04.

                 2.5 %    97.5 %
.sig01              NA        NA
.sigma              NA        NA
(Intercept) 232.301892 270.50832
Days          8.891041  12.04353


In the above, we estimate that the average intercept across all participants is 251.4. Results from the random effects section below show that the variance of the intercept for Subject is 1378.2. Taking the square root, the standard deviation of the intercept is thus 37.1.

Random effects:
Groups   Name        Variance Std.Dev.
Subject  (Intercept) 1378.2   37.12
Residual              960.5   30.99
Number of obs: 180, groups:  Subject, 18


We can calculate the 95% coverage interval as 251.4 ± 1.96*37.1. The lower bound of the 95% coverage interval is thus 178.7 and the upper bound is 324.1. We therefore estimated that 95% of the participants have an intercept between 178.7 and 324.1. This means that 95% of the participants have a reaction time between 178.7 and 324.1 at Day 0. This is not to be confused with the 95% confidence interval of the intercept. The 95% confidence interval is (232.3, 270.5), and this indicates that we are 95% confident that the average intercept is somewhere between 232.3 and 270.5.

In this model, we have accounted for the repeated measures design (observations nested within individuals) by including a random intercept for each participants. Each individual has his/her own intercept. The effect of sleep deprivation on reaction time is assumed to be the same across individuals. This assumption can be relaxed by fitting a random slope model.

Before running the random slope model, we can store the estimates from the random intercept model into a variable for future model comparison. By default, StatsNotebook stores the model estimates into the variable res. We now store these estimates into the variable m1. Enter the following code into a new block on the right panel and run it. This line of code must be run right after running the random intercept model.

m1 <- res


In the random intercept model, we assume that the effect of sleep deprivation on reaction time is the same across individuals. The following shows the scatterplot between reaction time and days of sleep deprivation for each participants.

While for most of the participants, the association is positive (the line of best fit going up towards the top right corner of each plot), there is substantial variations - For some participants, the effect of sleep deprivation seems to be stronger (e.g. participant 308) others relatively weaker (e.g. participant 309). To account for these variations, we can run a random slope model allowing the effect of sleep deprivation to vary across participants.

To run a random slope model,

1. Click Analysis at the top
2. Click Regression and select Linear Regression (Numeric outcome) from the menu
3. In the left panel, select Reaction into Outcome, Days into Covariates and Subject into Random effect.

The above three steps are the same as running a random intercept model. To specify a random slope for Days for each Subject,

1. Expand the panel Random Effect
2. Check Days and select it into Random slope
3. Click Code and Run, the following R code will be generated in a block on the right panel, with output from R below the codes.
library(lme4)
library(lmerTest)

res <- lmer(Reaction ~ Days + (1 + Days | Subject),
data = currentDataset)
summary(res)
confint(res, level = 0.95, method = "Wald")

res.std <- resid(res)/sd(resid(res))
plot(res.std, ylab="Standardized Residuals")

"Outlier Test. Observations with a Bonferroni p < .05 might be considered as outliers and might need further investigation."
car::outlierTest(res)
car::infIndexPlot(res)
ggplot(as.data.frame(res.std), aes(sample = res.std)) +
geom_qq() +
geom_qq_line()

"Chan, G. and StatsNotebook Team (2020). StatsNotebook. (Version 0.1.0) [Computer Software]. Retrieved from https://www.statsnotebook.io"
"R Core Team (2020). The R Project for Statistical Computing. [Computer software]. Retrieved from https://r-project.org"



The followings are from the top section of the generated codes.

library(lme4)
library(lmerTest)

res <- lmer(Reaction ~ Days + (1 + Days | Subject),
data = currentDataset)
summary(res)
confint(res, level = 0.95, method = "Wald")


These codes are very similar to those for the random intercept model, except that now we use the code (1 + Days | Subject) to specify a random intercept for each Subject, and a random slope of Days for each Subject. This allows both the intercept and the slope of days to vary across participants.

The following is the model summary from the above codes.

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: Reaction ~ Days + (1 + Days | Subject)
Data: currentDataset

REML criterion at convergence: 1743.6

Scaled residuals:
Min      1Q  Median      3Q     Max
-3.9536 -0.4634  0.0231  0.4634  5.1793

Random effects:
Groups   Name        Variance Std.Dev. Corr
Subject  (Intercept) 612.10   24.741
Days         35.07    5.922   0.07
Residual             654.94   25.592
Number of obs: 180, groups:  Subject, 18

Fixed effects:
Estimate Std. Error      df t value Pr(>|t|)
(Intercept)  251.405      6.825  17.000  36.838  < 2e-16 ***
Days          10.467      1.546  17.000   6.771 3.26e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Correlation of Fixed Effects:
(Intr)
Days -0.138

######################################################
2.5 %    97.5 %
.sig01              NA        NA
.sig02              NA        NA
.sig03              NA        NA
.sigma              NA        NA
(Intercept) 238.029141 264.78107
Days          7.437594  13.49698


From the fixed effects section of the model summary, we can conclude that there is strong evidence that number of days of sleep deprivation significantly increased reaction time under a significance level of 0.05. On average, for each additional day of sleep deprivation, reaction time increased by 10.47ms (b = 10.47, SE = 1.55, p < .001).

Fixed effects:
Estimate Std. Error      df t value Pr(>|t|)
(Intercept)  251.405      6.825  17.000  36.838  < 2e-16 ***
Days          10.467      1.546  17.000   6.771 3.26e-06 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


We are 95% confident that the average increase was between 7.44 and 13.50.

                 2.5 %    97.5 %
.sig01              NA        NA
.sig02              NA        NA
.sig03              NA        NA
.sigma              NA        NA
(Intercept) 238.029141 264.78107
Days          7.437594  13.49698


From the fixed effects section, we estimated the average intercept and average effect of days of sleep deprivation (slopes of Days). We also estimated the corresponding 95% confidence intervals. Using the results from the random effects section, we can calculate the coverage intervals for both the intercept and the slope.

Random effects:
Groups   Name        Variance Std.Dev. Corr
Subject  (Intercept) 612.10   24.741
Days         35.07    5.922   0.07
Residual             654.94   25.592
Number of obs: 180, groups:  Subject, 18


The variance of the intercept and for the slope of Days is 612.1 and 35.1 respectively. Taking their square root, the corresponding standard deviations are 24.7 and 5.9. The 95% coverage interval of the intercept is 251.4 ± 1.96*24.7. The lower bound is 203.0 and the upper bound is 299.8. It is therefore estimated that 95% of the participants had a reaction time between 203.0 and 299.8 at Day 0.

The 95% coverage interval of the slope of Days is 10.5 ± 1.96*5.9. The lower bound is -1.1 and the upper bound is 22.1. It is therefore estimated that for 95% of the participants, the effect of each additional day of sleep deprivation was between -1.1 and 22.1. This is not to be confused with the 95% confidence interval (7.4 and 13.5), which means that we are 95% confident that average effect (across all participants) of each additional day of sleep deprivation on reaction time was between 7.4 and 13.5.

A random intercept model is more parsimonious. Random slope model is more flexible as it allows the effect of days of sleep deprivation to vary across participants. So which one should we use? We can test the model fit of these two models and select the one that fits the data better. After running the random intercept model, we stored the model estimates into m1. We now store the model estimates from the random slope model into m2.

The following line of code must be run right after the random slope model.

m2 <- res


We can now compare the model fit of the random intercept model (stored in m1) and random slope model (stored in m2) with a likelihood ratio test using the following line of code.

anova(m1, m2)


The above code produces the following results.

Data: currentDataset
Models:
m1: Reaction ~ Days + (1 | Subject)
m2: Reaction ~ Days + (1 + Days | Subject)
npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)
m1    4 1802.1 1814.8 -897.04   1794.1
m2    6 1763.9 1783.1 -875.97   1751.9 42.139  2  7.072e-10 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Results from the likelihood ratio test shows that the random slope model fits the data significantly better than the random intercept model, χ2(2) = 42.1, p < .001. This indicates that there is significant variation (between participants) in the effect of days of sleep deprivation on reaction time.

Table 1. Results from random slope model.

Fixed component b 95% CI p
Intercept 251.41 (238.03, 264.78) < .001
Days 10.47 (7.44, 13.50) < .001
Random component Variance
Intercept 612.10
Days 35.07
Residual 354.94

A random slope model is used to test if sleep deprivation affects reaction time. To account for the repeated measures design, a random intercept was specified for participants. The random slope for days of sleep deprivation was included in the model to allow the effect of sleep deprivation to vary across participants. Results are shown in Table 1. Using a significant level of 0.05, results indicate that sleep deprivation significantly increased reaction time. On average, each additional day of sleep deprivation increased reaction time by 10.47ms (b = 10.47, 95% CI = [7.44, 13.50], p < .001). Model fit comparison between model with and without random slope for sleep deprivation shows that the effect of sleep deprivation varied across participants, χ2(2)= 42.14, p < .001. The 95% coverage interval for the random slope of sleep deprivation is (-1.14, 22.07), indicating that the effect of sleep deprivation was between -1.14 and 22.07 for 95% of the participants.

In our alcohol tax example in the moderation analysis tutorial, we investigated two research questions in the following scenario.

Suppose that in Australia, Queensland implemented an increase in alcohol tax in 2018 and no change in tax in any other states (e.g. Victoria and New South Wales). We have collected data on alcohol consumption in 2017 and 2019, and we want to test if the change in alcohol consumption (measured in number of standard drinks consumed per month) in Queensland was different from other states.

• A simple before/ after comparison using only Queensland data is not sufficient to test if the tax increase changed alcohol consumption because there is no comparision group. A before/ after difference can simply be due to a decreasing trend. We need to test if the change of alcohol consumption in Queensland is larger than other states.
• In this analysis, we will test the interaction between year and state. (Does state change the effect of year on alcohol consumption?)

We have used the built-in Alcohol dataset in that example and the dataset can be loaded into StatsNotebook using the instructions provided here.

There are 5 variables in this dataset.

1. Alcohol - Number of standard drinks consumed per month.
2. Year - 2017 (before tax increase in Queeland) and 2019 (after tax increase)
3. Postcode - Postcode of participants’ residence
4. State - Seven states/territories in Australia, namely, Queensland, New South Wales, Northern Territory, South Australia, Tasmania, Victoria and Western Australia
5. Remoteness - Capital city or regional area.

In our original example, we have performed the moderation analysis using linear regression. However, it is likely that drinking culture differs between different local communities. Therefore, individuals from the same postcode might be more similar than individuals from different postcodes, and this would violate the assumption of independent observation.

In this tutorial, we revisit this example and fit a random intercept model to account for the dependencies due to participants residing in the same postcode.

To fit a random intercept model,

1. Click Analysis at the top
2. Click Regression and select Linear regression (Numeric outcome) from the menu
3. In the left panel, select alcohol into Outcome, and select Year, State and Remoteness into Covariates.
• We have added Remoteness into this analysis to adjust for its effect.
• We will need to code Year, State and Remoteness into factor because they are categorical variables. See Converting variable type for a step-by-step guide.
• For this analysis, because we compared Queensland against all other states, we set QLD as the reference level of State. See Setting reference level for categorical variable for step-by-step guide.
4. Select Postcode into Random Effect.
1. Expand the panel Add interaction terms, select State and Year into Interaction terms.
• The order of selected variables will determine the horizontal axis of the interaction plot (see interpretaion below). The first selected variable will always go to the horizontal axis. In this example, we click State first followed by Year.
1. Expand the panel Estimated Marginal Means, select State*Year on the variable list, and select Pairwise comparison and Interaction plot.
1. Click Code and Run, the following R code will be generated in a block on the right panel, with output from R below the codes.
library(lme4)
library(lmerTest)

res <- lmer(alcohol ~ State + Remoteness + Year + State*Year + (1 | Postcode),
data = currentDataset)
summary(res)
confint(res, level = 0.95, method = "Wald")

res.std <- resid(res)/sd(resid(res))
plot(res.std, ylab="Standardized Residuals")

"Outlier Test. Observations with a Bonferroni p < .05 might be considered as outliers and might need further investigation."
car::outlierTest(res)
car::infIndexPlot(res)
ggplot(as.data.frame(res.std), aes(sample = res.std)) +
geom_qq() +
geom_qq_line()

"Variance inflation factor (VIF >=5 indicates high level of multicollinearity)"
car::vif(res)

"Estimated Marginal Means"
library(emmeans)

emm <- emmeans(res, pairwise ~ State*Year, level = 0.95)
summary(emm)

eff_size(emm, sigma = sigma(res), edf = Inf)

emmip(res, Year ~ State,
cov.keep = 3, at = list(),
CIs = TRUE, level = 0.95, position = "jitter")

"Chan, G. and StatsNotebook Team (2020). StatsNotebook. (Version 0.1.0) [Computer Software]. Retrieved from https://www.statsnotebook.io"
"R Core Team (2020). The R Project for Statistical Computing. [Computer software]. Retrieved from https://r-project.org"



The following are from the top section of the generated codes.

library(lme4)
library(lmerTest)

res <- lmer(alcohol ~ State + Remoteness + Year + State*Year + (1 | Postcode),
data = currentDataset)
summary(res)
confint(res, level = 0.95, method = "Wald")


These codes tell R to run a linear mixed model using the lmer from the lme4 library. The left side of the “~” symbol specifies the alcohol as the dependent variable; the right side specifices State, Remoteness, Year and the interaction of State and Year as the independent variable. The code (1 | Postcode) specifies a random intercept for each Postcode. The summary function is used to print out the results from the random intercept model.

The following is the model summary from the above codes.

Linear mixed model fit by REML. t-tests use Satterthwaite's method [
lmerModLmerTest]
Formula: alcohol ~ State + Remoteness + Year + State * Year + (1 | Postcode)
Data: currentDataset

REML criterion at convergence: 38231.5

Scaled residuals:
Min      1Q  Median      3Q     Max
-1.9021 -0.7711 -0.1916  0.5797  4.2157

Random effects:
Groups   Name        Variance Std.Dev.
Postcode (Intercept)   59.24   7.697
Residual             2001.97  44.743
Number of obs: 3666, groups:  Postcode, 35

Fixed effects:
Estimate Std. Error       df t value Pr(>|t|)
(Intercept)          72.639      6.988   28.022  10.395 4.03e-11 ***
StateNSW              2.605      7.652   30.117   0.340  0.73593
StateNT               1.689      8.302   31.530   0.203  0.84011
StateSA              -6.905      7.950   30.890  -0.869  0.39179
StateTAS              2.193      8.889   33.801   0.247  0.80663
StateVIC             -9.671      7.967   26.721  -1.214  0.23540
StateWA             -10.466      8.785   41.141  -1.191  0.24036
RemotenessRegional  -10.373      4.502   23.524  -2.304  0.03036 *
Year2019            -21.028      5.060 3635.689  -4.156 3.32e-05 ***
StateNSW:Year2019    16.702      6.051 3650.901   2.760  0.00581 **
StateNT:Year2019     14.589      6.894 3201.952   2.116  0.03440 *
StateSA:Year2019     18.909      6.526 3649.459   2.898  0.00378 **
StateTAS:Year2019     6.764      7.438 2549.967   0.909  0.36324
StateVIC:Year2019    15.187      5.953 3634.067   2.551  0.01078 *
StateWA:Year2019     22.251      7.019 3644.024   3.170  0.00154 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Using a significance level of 0.05, most of the interaction term between Year and States are significant. This indicates that changes in alcohol consumption in Queensland is different from the changes in other states. However, interpreting these coefficients is not straightforward.

An easier way to interpret interactions is to calculate the Estimated Margainal Means (EMMs). We can estimate the mean alcohol consumption by State and Year from our regression model.

"Estimated Marginal Means"
library(emmeans)

emm <- emmeans(res, pairwise ~ State*Year, level = 0.95)
summary(emm)

eff_size(emm, sigma = sigma(res), edf = Inf)



These few lines of codes will produce a long list of output that can be broken down into three major section.

Estimated Marginal Means

$emmeans State Year emmean SE df asymp.LCL asymp.UCL QLD 2017 67.5 6.64 Inf 54.4 80.5 NSW 2017 70.1 3.80 Inf 62.6 77.5 NT 2017 69.1 4.94 Inf 59.5 78.8 SA 2017 60.5 4.33 Inf 52.1 69.0 TAS 2017 69.6 5.87 Inf 58.1 81.2 VIC 2017 57.8 4.40 Inf 49.2 66.4 WA 2017 57.0 5.71 Inf 45.8 68.2 QLD 2019 46.4 6.45 Inf 33.8 59.1 NSW 2019 65.7 4.07 Inf 57.8 73.7 NT 2019 62.7 5.42 Inf 52.1 73.3 SA 2019 58.4 4.88 Inf 48.9 68.0 TAS 2019 55.4 5.92 Inf 43.8 67.0 VIC 2019 51.9 4.60 Inf 42.9 61.0 WA 2019 58.2 4.57 Inf 49.3 67.2 Results are averaged over the levels of: Remoteness Degrees-of-freedom method: asymptotic Confidence level used: 0.95  The above table contains estimates for alcohol consumption by State and Year, with 95% confidence intervals. For example, it is estimated that on average, participants in Queensland in 2017 consumed 67.5 standard drinks per month, with a 95% confidence interval of (54.4, 80.5). It should be noted that the estimates from this random intercept model is very similar to those from a moderation analysis using linear regression. However, the confidence intervals of the random intercept model is wider. This table is then followed by a long table of pairwise comparisons. However, we will not be interested in most of these comparisons. Pairwise comparison $contrasts
contrast            estimate   SE  df z.ratio p.value
QLD 2017 - NSW 2017   -2.604 7.65 Inf -0.340  1.0000
QLD 2017 - NT 2017    -1.689 8.30 Inf -0.203  1.0000
QLD 2017 - SA 2017     6.905 7.95 Inf  0.869  0.9999
.
.
.
TAS 2019 - VIC 2019    3.441 7.55 Inf  0.456  1.0000
TAS 2019 - WA 2019    -2.829 6.77 Inf -0.418  1.0000
VIC 2019 - WA 2019    -6.270 6.54 Inf -0.958  0.9996


We will only be interested in the 2017 and 2019 comparison in each state.

contrast            estimate   SE  df z.ratio p.value
QLD 2017 - QLD 2019   21.028 5.06 Inf  4.156  0.0026
NSW 2017 - NSW 2019    4.326 3.32 Inf  1.302  0.9907
NT 2017 - NT 2019      6.439 4.68 Inf  1.375  0.9847
SA 2017 - SA 2019      2.119 4.12 Inf  0.514  1.0000
TAS 2017 - TAS 2019   14.264 5.45 Inf  2.616  0.3296
VIC 2017 - VIC 2019    5.841 3.14 Inf  1.863  0.8467
WA 2017 - WA 2019     -1.223 4.86 Inf -0.251  1.0000


These results indicate that there is significant change in Queensland between 2017 and 2019 only. On average, consumption decreased by 21.03 standard drinks. The p-value is adjusted for a family of 14 estimates using Tukey's method. This adjustment is necessary to protect against increased rate of false positives due to multiple testing.

Lastly, the pairwise comparisons are accompanied by a table of effect sizes.

Effect size

contrast              effect.size     SE  df asymp.LCL asymp.UCL
(QLD 2017 - NSW 2017)    -0.05821 0.1710 Inf  -0.39339   0.27697
(QLD 2017 - NT 2017)     -0.03774 0.1855 Inf  -0.40140   0.32591
(QLD 2017 - SA 2017)      0.15433 0.1777 Inf  -0.19393   0.50258
.
.
.
(TAS 2019 - VIC 2019)     0.07691 0.1688 Inf  -0.25391   0.40773
(TAS 2019 - WA 2019)     -0.06322 0.1514 Inf  -0.35988   0.23344
(VIC 2019 - WA 2019)     -0.14012 0.1463 Inf  -0.42681   0.14657


Similarly, we will be only interested in the 2017 and 2019 comparison in each state.

(QLD 2017 - QLD 2019)     0.46998 0.1131 Inf   0.24832   0.69164
(NSW 2017 - NSW 2019)     0.09669 0.0743 Inf  -0.04891   0.24228
(NT 2017 - NT 2019)       0.14391 0.1046 Inf  -0.06119   0.34900
(SA 2017 - SA 2019)       0.04736 0.0921 Inf  -0.13307   0.22779
(TAS 2017 - TAS 2019)     0.31880 0.1218 Inf   0.07998   0.55762
(VIC 2017 - VIC 2019)     0.13055 0.0701 Inf  -0.00682   0.26793
(WA 2017 - WA 2019)      -0.02733 0.1087 Inf  -0.24039   0.18572


These effect sizes can be interpreted according to Cohen’s rule: 0.2 represents a small effect size, 0.5 represents a medium effect size and 0.8 represents a large effect size.

The emmeans package provides a handy function emmip to visualise the interaction.

emmip(res, Year ~ State,
cov.keep = 3, at = list(),
CIs = TRUE, level = 0.95, position = "jitter")

"Chan, G. and StatsNotebook Team (2020). StatsNotebook. (Version 0.1.0) [Computer Software]. Retrieved from https://www.statsnotebook.io"
"R Core Team (2020). The R Project for Statistical Computing. [Computer software]. Retrieved from https://r-project.org"