Barr et al.’s well-intentioned paper is starting to lead to some seriously weird behavior in psycholinguistics! As a reviewer, I’m seeing submissions where people take the following approach:
1. Try to fit a “maximal” linear mixed model. If you get a convergence failure (this happens a lot since we routinely run low-powered studies!), move to step 2.
[By the way, the word “maximal” is ambiguous here, because a “maximal” model can be fit with or without the correlations among the varying intercepts and slopes. For a 2×2 design, the difference would look like:
correlations estimated: (1+factor1+factor2+interaction|subject) etc.
no correlations estimated: (1+factor1+factor2+interaction || subject) etc.
Both options can be considered maximal.]
2. Fit a repeated measures ANOVA. This means that you average over items to get the F1 statistic in the by-subject ANOVA. But this is cheating and amounts to p-value hacking. Aggregating over items for each subject in each condition effectively sets the between-item variance to 0. Taking both between-item and between-subject variance into account simultaneously is the whole reason why linear mixed models are so important. People mistakenly think that the linear mixed model and the rmANOVA are identical. If your experiment design calls for crossed varying intercepts and varying slopes for subjects and items (and it always does in psycholinguistics), an rmANOVA is not identical to the LMM, for the reason I give above. In the old days we used to compute minF′. In 2014, I mean, 2015, it makes no sense to do that if you have a tool like lmer.
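To make the aggregation problem in step 2 concrete, here is a minimal sketch in R with simulated data. Everything in it is an assumption for illustration: the variable names (rt, factor1, factor2, and inter, which stands in for the interaction column), the sample sizes, and the effect sizes are all made up, and the lmer calls only run if lme4 is installed.

```r
# Simulated 2x2 within-subject design with crossed subjects and items.
# All names and numbers below are hypothetical, chosen only for illustration.
set.seed(1)
n_subj <- 20
n_item <- 16

dat <- expand.grid(subject = 1:n_subj, item = 1:n_item)

# Latin square: each subject sees each item in exactly one of 4 conditions.
cond <- (dat$subject + dat$item) %% 4
dat$factor1 <- ifelse(cond %in% c(0, 1), -0.5, 0.5)
dat$factor2 <- ifelse(cond %in% c(0, 2), -0.5, 0.5)
dat$inter   <- dat$factor1 * dat$factor2

# Varying intercepts for subjects AND items, plus residual noise.
subj_int <- rnorm(n_subj, mean = 0, sd = 30)
item_int <- rnorm(n_item, mean = 0, sd = 40)
dat$rt <- 400 + 20 * dat$factor1 +
  subj_int[dat$subject] + item_int[dat$item] +
  rnorm(nrow(dat), mean = 0, sd = 50)

# Step 2 of the approach above: average over items within subject and condition.
by_subject <- aggregate(rt ~ subject + factor1 + factor2, data = dat, FUN = mean)

# The by-subject (F1) repeated measures ANOVA on the aggregated data.
aov_dat <- transform(by_subject,
                     subject = factor(subject),
                     F1 = factor(factor1),
                     F2 = factor(factor2))
f1_anova <- aov(rt ~ F1 * F2 + Error(subject / (F1 * F2)), data = aov_dat)

# The linear mixed model uses the unaggregated data and keeps both
# grouping factors. Two "maximal" variants, with and without correlations:
if (requireNamespace("lme4", quietly = TRUE)) {
  m_corr <- lme4::lmer(
    rt ~ factor1 + factor2 + inter +
      (1 + factor1 + factor2 + inter | subject) +
      (1 + factor1 + factor2 + inter | item),
    data = dat)
  m_nocorr <- lme4::lmer(
    rt ~ factor1 + factor2 + inter +
      (1 + factor1 + factor2 + inter || subject) +
      (1 + factor1 + factor2 + inter || item),
    data = dat)
}
```

After aggregation, by_subject has one row per subject per condition and no item column at all, so nothing fit to it can estimate between-item variance. With toy data like this, the lmer fits may well report singular fits or convergence warnings.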
As always, I’m happy to get comments on this.