Statcheck: an R package to check statistical results in psychology papers


The results of many scientific papers are wrong. There are many reasons for this, including p-hacking, publication bias, and the general inability to replicate results. But there's another, more mundane cause: incorrect calculation of p-values in statistical tests. This can be caused by simple transcription errors when plugging numbers into a statistical tool, incorrect rounding, or misapplication of the test itself (say, applying a two-sided test when a one-sided p-value is appropriate). Such errors should be picked up in the peer review process, but given that even expert statisticians sometimes struggle to explain p-values, it's not surprising that some errors get through.
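To make the one-sided versus two-sided distinction concrete, here's a minimal R sketch (the numbers are hypothetical, not taken from any paper) showing that the same t statistic yields a one-tailed p-value half the size of the two-tailed one:

```r
# Hypothetical result: t(28) = 2.20
t_stat <- 2.20
df <- 28

# One-tailed p-value: probability of a t value at least this large
p_one_tailed <- pt(t_stat, df = df, lower.tail = FALSE)  # ~0.018

# Two-tailed p-value: double the one-tailed value, by symmetry
p_two_tailed <- 2 * p_one_tailed                         # ~0.036
```

Reporting the one-tailed value when readers expect a two-tailed test (or vice versa) produces exactly the kind of inconsistency described above.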

That's why Michèle B. Nuijten, a PhD student at Tilburg University, created the R package statcheck. Given a paper to be published in a psychology journal, statcheck searches it for the results of \(t\), \(F\), \(r\), \(\chi^2\), and \(Z\) tests and compares each published p-value to a value recalculated by R. This is possible only because the American Psychological Association Style Guide prescribes a very specific format for reporting statistical results, listing the p-value next to the reported test statistic. Statcheck also attempts to detect whether the surrounding language mentions a “one-sided” or “one-tailed” test and calculates the p-value in R accordingly (although this process isn't perfect). Anyone can use statcheck by uploading a PDF or HTML version of their paper to the statcheck web application, or by using the statcheck function within R directly.
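As a quick sketch of the R workflow (the sentence below is a made-up example, and the exact output columns depend on the package version), you can pass APA-formatted text directly to the statcheck function:

```r
# install.packages("statcheck")  # if not already installed
library(statcheck)

# A made-up APA-style result embedded in ordinary prose
txt <- "The effect was significant, t(28) = 2.20, p = .03."

# statcheck() extracts the test statistic and degrees of freedom,
# recomputes the p-value, and flags inconsistencies with the reported one
statcheck(txt)
```

Here the recomputed two-tailed p-value is about .036, so statcheck should flag the reported p = .03 as inconsistent, though not as a decision error, since both values fall below .05. Companion functions such as checkPDF() apply the same check to whole documents.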

statcheck compares reported p-values with the p-values it computes from the reported test statistics, reports discrepancies, and notes whether the difference would have changed the statistical conclusion of the test.
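In simplified form, the comparison amounts to something like the sketch below; note that statcheck's real rules account for rounding of both the reported statistic and the reported p-value, which this strict check ignores:

```r
# Simplified sketch of the check statcheck performs
reported_p <- 0.03
computed_p <- 2 * pt(2.20, df = 28, lower.tail = FALSE)  # ~0.036

# Inconsistency: reported p-value doesn't match the recomputed one
# (a simplification; statcheck allows for legitimate rounding)
inconsistent <- round(reported_p, 2) != round(computed_p, 2)

# Decision error: the discrepancy flips the verdict at alpha = .05
decision_error <- (reported_p < 0.05) != (computed_p < 0.05)
```

In this example the result is inconsistent but not a decision error, since both p-values fall below .05.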

Nuijten recounts the origins and development of statcheck in an interesting article at Retraction Watch. One major surprise came when they applied statcheck to p-values reported in eight major psychology journals from 1985 to 2013:

Half of the papers in psychology contain at least one statistical reporting inconsistency, and one in eight contains an inconsistency that might have affected the statistical conclusion.

Since then, they've scaled the process up, automatically posting the results of statcheck's analyses of 50,000 papers to PubPeer. Not everyone was pleased by the notifications (a former president of the Association for Psychological Science called it 'methodological terrorism'), but the process did reveal more inconsistencies in published papers.

For more on statcheck, check out its website at the link below.

Michèle B. Nuijten: R package “statcheck”: Extract statistics from articles and recompute p values (Epskamp & Nuijten, 2016)
