# The ‘swst’ package to print statistical results in Sweave

November 23, 2011

(This article was first published on » R, and kindly contributed to R-bloggers)

When I was making the slides for a lecture on using Sweave to combine R and LaTeX, I was unpleasantly surprised at how tedious it can be to extract statistical values and print them as proper LaTeX code.

For example, consider a small toy dataset of body lengths: 100 females whose length is normally distributed with a mean of 170cm and a standard deviation of 10cm, and 100 males with a mean length of 180cm and a standard deviation of 10cm:

R> foo <- data.frame(
     length = c(rnorm(100, 170, 10), rnorm(100, 180, 10)),
     sex = rep(c("female", "male"), each = 100))


A t-test shows that the means are different:

R> t.test(length~sex,data=foo,var.equal=TRUE)

Two Sample t-test

data:  length by sex
t = -6.8396, df = 198, p-value = 9.653e-11
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -12.715980  -7.024375
sample estimates:
mean in group female   mean in group male
            170.2455             180.1157

The t.test() function returns an object of class "htest". This class is used by many tests in R, and it lets us easily extract the statistic, the degrees of freedom and the p-value:

R> res <- t.test(length~sex,data=foo,var.equal=TRUE)
R> res[['statistic']]
t
-6.839605
R> res[['parameter']]
df
198
R> res[['p.value']]
[1] 9.653065e-11
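
An "htest" object is an ordinary list, so the same approach works for its other components. The following is a minimal sketch that rebuilds the toy data and pulls out a few more pieces. The seed is an arbitrary assumption (the seed behind the output above is not known), so the numbers it produces will differ from those shown in this post.

```r
# Rebuild the toy data; the seed is arbitrary, so values differ from the text.
set.seed(1)
foo <- data.frame(
  length = c(rnorm(100, 170, 10), rnorm(100, 180, 10)),
  sex    = rep(c("female", "male"), each = 100))

res <- t.test(length ~ sex, data = foo, var.equal = TRUE)

names(res)                        # all components stored in the "htest" list
t_value <- unname(res$statistic)  # drop the "t" name to get a bare number
df      <- unname(res$parameter)  # degrees of freedom: 100 + 100 - 2
ci      <- res$conf.int           # confidence interval of the mean difference
```

Using unname() is optional, but it avoids carrying the "t" and "df" labels along when the values are passed to other functions.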

Great, now we can reference the statistic in our Sweave document:

Men were significantly taller than women
($t(\Sexpr{res[['parameter']]})=\Sexpr{res[['statistic']]}$,
$p=\Sexpr{res[['p.value']]}$)

This returns: "Men were significantly taller than women (t(198) = −6.83960491494726,
p = 9.65306549553569e − 11)". Obviously we need to round the values:

Men were significantly taller than women
($t(\Sexpr{res[['parameter']]})=\Sexpr{round(res[['statistic']],3)}$,
$p=\Sexpr{round(res[['p.value']],3)}$)

Which returns "Men were significantly taller than women (t(198) = −6.84, p = 0)". Better, but a p-value should never be reported as zero. When it is very small, it should instead be reported as being smaller than 0.001 or some similar threshold. To do this while keeping the document dynamic, an ifelse() call is needed:

Men were significantly taller than women
($t(\Sexpr{res[['parameter']]})=\Sexpr{round(res[['statistic']],3)}$,
$p \Sexpr{ifelse(res[['p.value']]<0.001,'< 0.001', paste('=',round(res[['p.value']],3)))}$)

Which returns "Men were significantly taller than women (t(198) = −6.84, p < 0.001)".
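
If the same threshold logic is needed in several places, it could be moved into a small helper defined in an earlier code chunk. The sketch below is only an illustration: format_p() is a hypothetical name, not a function from any package, and the threshold of 10^(-digits) is an assumption that matches the 0.001 cut-off used above.

```r
# Hypothetical helper (not part of any package): format a p-value so that it
# can follow "p" inside LaTeX math mode, e.g. "$p \Sexpr{format_p(...)}$".
format_p <- function(p, digits = 3) {
  threshold <- 10^(-digits)            # 0.001 when digits = 3
  if (p < threshold) {
    # scientific = FALSE avoids output such as "1e-03"
    paste("<", format(threshold, scientific = FALSE))
  } else {
    paste("=", round(p, digits))
  }
}

format_p(9.653065e-11)  # "< 0.001"
format_p(0.0123)        # "= 0.012"
```

In the Sweave source the last line of the sentence would then read `$p \Sexpr{format_p(res[['p.value']])}$`.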

Good, but this sure was a lot of code for such a simple reference, and for some other classes it is a lot harder to extract the statistics from the output object than it is for the "htest" class. For this reason I wrote the 'swst' package, which stands for SWeave STatistics.

'swst' has two main functions. The 'swp()' function takes the name of a statistic, its value, optional degrees of freedom and the p-value, and generates proper LaTeX code with rounded numbers and, where needed, inequality signs. The 'swst()' function is an S3 generic with methods for a few commonly used object classes; each method extracts the statistic, df and p-value and sends them to 'swp()'.
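
To illustrate the general pattern of a formatting function plus an S3 generic, here is a minimal sketch. It is not the implementation used in 'swst': the names report_stat() and report() are hypothetical stand-ins, and the exact output format is an assumption modelled on the sentences shown earlier.

```r
# Hypothetical stand-in for a formatter: build a LaTeX string from the pieces.
report_stat <- function(name, value, df = NULL, p, digits = 3) {
  df_part <- if (is.null(df)) "" else paste0("(", df, ")")
  p_part  <- if (p < 10^(-digits)) {
    paste("<", format(10^(-digits), scientific = FALSE))
  } else {
    paste("=", round(p, digits))
  }
  paste0("($", name, df_part, "=", round(value, digits), "$, $p ", p_part, "$)")
}

# Hypothetical S3 generic that dispatches on the class of its argument.
report <- function(x, ...) UseMethod("report")

# Method for "htest" objects such as those returned by t.test().
report.htest <- function(x, ...) {
  report_stat(name  = names(x$statistic),
              value = unname(x$statistic),
              df    = unname(x$parameter),
              p     = x$p.value, ...)
}

# Fallback for classes that have no method yet.
report.default <- function(x, ...) {
  stop("no report() method for class ", class(x)[1])
}
```

Supporting a new class then only requires writing one more method that knows where that class stores its statistic, df and p-value; the formatting stays in a single place.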

This reduces the code to:

Men were significantly taller than women \Sexpr{swst(res)}

Which returns "Men were significantly taller than women (t(198) = −6.84, p < 0.001)". That's much less code!

I wrote this package in a fairly short time, so it is currently very small and supports only a few object classes. Help is greatly appreciated! If you know of an object class that should be supported, let me know or write your own method on GitHub:

http://github.com/SachaEpskamp/swst

http://cran.r-project.org/web/packages/swst/index.html