Adding polished significance summaries to papers using R

October 4, 2016
By

(This article was first published on R – Win-Vector Blog, and kindly contributed to R-bloggers)

When we teach “R for statistics” to groups of scientists (who tend to be quite well informed in statistics, and just need a bit of help with R) we take the time to re-work some tests of model quality with the appropriate significance tests. We organize the lesson in terms of a larger and more detailed version of the following list:

  • To test the quality of a numeric model to numeric outcome: F-test (as in linear regression).
  • To test the quality of a numeric model to a categorical outcome: χ2 or “Chi-squared” test (as in logistic regression).
  • To test the association of a categorical predictor to a categorical outcome: many tests including Fisher’s exact test and Barnard’s test.
  • To test the quality of a categorical predictor to a numeric outcome: t-Test, ANOVA, and Tukey’s “honest significant difference” test.

The above tests are all in terms of checking model results, so we don’t allow re-scaling of the predictor as part of the test (as we would have in a Pearson correlation test, or an area under the curve test). There are, of course, many alternatives such as Wald’s test- but we try to start with a set of tests that are standard, well known, and well reported by R. An odd exception has always been the χ2 test, which we will write a bit about in this note.

The χ2 test is a very useful statistical test. In particular, under fairly mild assumptions, it is a usable probability model for the quality of fit of a logistic regression. It is based on a summary statistics called “deviance” (an odd name, but the quantity is strongly related to likelihood and to entropy). And, after a simple transform, it yields a quantity called “pseudo-R2” (see The Simpler Derivation of Logistic Regression) that reads like “fraction of variation explained.” It is great final test for well-tuned models designed to estimate probabilities (just as area under the curve is a good early test as it abstracts out scale and choice of decision threshold).

Yet the χ2 test is under-emphasized and under-implemented in R. Consider the following trivial logistic regression model.

d <- data.frame(x=c(1,2,3,4,5,6,7,7),
       y=c(TRUE,FALSE,FALSE,FALSE,TRUE,TRUE,TRUE,FALSE))
model <- glm(y~x,data=d,family=binomial)
summary(model)

## Call:
## glm(formula = y ~ x, family = binomial, data = d)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -1.37180  -1.09714  -0.00811   1.08024   1.42939  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)  -0.7455     1.6672  -0.447    0.655
## x             0.1702     0.3429   0.496    0.620
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 11.090  on 7  degrees of freedom
## Residual deviance: 10.837  on 6  degrees of freedom
## AIC: 14.837
## 
## Number of Fisher Scoring iterations: 4

Notice that R reported the change in deviance (the measure of model quality, also note the AIC or Akaike information criteria is present), but no significance on the overall model fit is reported (n.b. this is different from coefficient significances). The significance is probably not there because glm() is a fully general generalized linear model fitter (it can fit much more than just logistic regressions) and the error model likely changes as you change the model type (controlled by the family and link controls).

But logistic regression really deserves to be front and center. This is why in Section 7.2 of Practical Data Science with R, Nina Zumel, John Mount, Manning 2014 we take the time to define and show how to calculate the pseudo-R2 and the significance for the reported deviance.

As we observed in “Proofing statistics in papers” having standard tests and standard reporting of tests is a great advantage. In this spirit we add an “APA-like” report of the χ2 significance in our sigr package. The use is quick and decisive:

library('sigr') 
cat(formatChiSqTestFromModel(model,pLargeCutoff=1,format='html')$formatStr)


Chi-Square Test summary: pseudo-R2=0.023 (χ2(1,N=8)=0.25, p=0.615).

The sigr package isn’t up on CRAN, but can be installed using devtools::install_github('WinVector/sigr'). It includes documentation, and an example vignette. sigr can format directly to HTML, Latex, Markdown, and Word. It designed to be used with a knitr workflow, and when used with knit will automatically select the correct target formatting. Right now it generates reports for only linear and logistic regressions; we will likely fill out with a few more “most ofter used, so should have a nice neat format” tests going forward.

To leave a comment for the author, please follow the link and comment on their blog: R – Win-Vector Blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)