Bayesian First Aid: Binomial Test

[This article was first published on Publishable Stuff, and kindly contributed to R-bloggers].


The binomial test is arguably the conceptually simplest of all statistical tests: it has only one parameter and an easy-to-understand distribution for the data. When introducing null hypothesis significance testing it is puzzling that the binomial test is not the first example of a test; sometimes it is introduced long after the t-test and the ANOVA (as here) and sometimes it is not introduced at all (as here and here). When introducing a new concept, why not start with the simplest example? It is not as if there is a problem with students understanding the concept of null hypothesis significance testing too well. I'm not making the same mistake, so here follows the Bayesian First Aid alternative to the binomial test!

Bayesian First Aid is an attempt at implementing reasonable Bayesian alternatives to the classical hypothesis tests in R. For the rationale behind Bayesian First Aid see the original announcement. The development of Bayesian First Aid can be followed on GitHub. Bayesian First Aid is a work in progress and I'm grateful for any suggestions on how to improve it!

[Bayesian First Aid logo]

The Model

Coming up with a Bayesian alternative for the binomial test is almost a no-brainer. The data is a count of the successes and failures in some task (where what is labeled success is an arbitrary choice most of the time). Given this impoverished data (no predictor variables and no dependency between trials), the distribution for the data has to be a binomial distribution with the number of Bernoulli trials ($n$) fixed and the relative frequency of success ($\theta$) as the only free parameter. Notice that $\theta$ is often called the probability of success, but from a Bayesian perspective I believe this is almost a misnomer and, at the very least, a very confusing name. The result of a Bayesian analysis is a probability distribution over the parameters, and calling $\theta$ a probability can result in hard-to-decipher statements like: "The probability that the probability is larger than 50% is more than 50%". Calling $\theta$ the relative frequency of success also emphasizes that $\theta$ is a property of a process and reserves "probability" for statements about knowledge (keeping probabilities nicely "inside the agent").
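To make the model concrete, here is a minimal R sketch (with made-up counts, purely for illustration) of the binomial likelihood as a function of $\theta$:

```r
# Hypothetical data: 7 successes out of 10 trials (made-up numbers).
x <- 7
n <- 10

# The binomial likelihood of the data as a function of theta,
# the relative frequency of success.
likelihood <- function(theta) dbinom(x, size = n, prob = theta)

likelihood(c(0.3, 0.5, 0.7))
# The likelihood is maximized at the observed proportion x / n = 0.7.
```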

The only part of the model that requires some thought is the prior distribution for $\theta$. There seem to be mainly three priors proposed as non-informative distributions for $\theta$: the flat prior, the Jeffreys prior and the Haldane prior. These priors are beautifully described by Zhu and Lu (2004), and the flat and Jeffreys priors are pictured below.

[Plot: the flat and Jeffreys priors]

The Haldane prior is trickier to visualize as it puts an infinite amount of mass at $\theta=0$ and $\theta=1$.

As explained by Zhu and Lu, the Haldane prior could be considered the least informative prior, so isn't that what we want? Nope, I'll go with the flat distribution, which can also be considered "least informative" when we know that both successes and failures are possible (Lee, 2004). I believe that in most cases when binomial data is analyzed it is known (or at least suspected) a priori that both successes and failures are possible, so it makes sense to use the flat prior as the default prior.

The flat prior is conveniently constructed as a $\mathrm{Beta}(1,1)$ distribution. A $\mathrm{Uniform}(0,1)$ could equally well be used, but the $\mathrm{Beta}$ has the advantage that it is easy to make more or less informative by changing its two parameters. For example, if you want to use the Jeffreys prior instead of the flat prior you would use a $\mathrm{Beta}(0.5,0.5)$, and the Haldane prior can be approximated by a $\mathrm{Beta}(\epsilon,\epsilon)$ where $\epsilon$ is close to zero, say 0.0001.
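As a quick sketch, the three candidate priors can be compared in R with dbeta (the $\epsilon$ used for the Haldane approximation is my own arbitrary choice):

```r
# Evaluate the candidate priors for theta on a grid.
theta <- seq(0.001, 0.999, length.out = 999)

flat     <- dbeta(theta, 1, 1)       # flat prior: constant density 1
jeffreys <- dbeta(theta, 0.5, 0.5)   # Jeffreys prior: mass piles up near 0 and 1
haldane  <- dbeta(theta, 1e-4, 1e-4) # rough approximation of the Haldane prior

# Plot the flat and Jeffreys priors (the Haldane approximation is
# essentially two spikes at 0 and 1 and is hard to see on this scale).
plot(theta, jeffreys, type = "l", ylab = "density", ylim = c(0, 3))
lines(theta, flat, lty = 2)
```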

The final Bayesian First Aid alternative to the binomial test is then:

[Diagram: the Bayesian First Aid binomial test model]

The diagram to the left is a Kruschke style diagram. For an easy way of making such a diagram check out this post.
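A side note: because the Beta prior is conjugate to the binomial likelihood, this particular model also has a closed-form posterior, $\mathrm{Beta}(1 + x, 1 + n - x)$. A small sketch (using the 9-out-of-11 data analyzed below) that is handy as a sanity check on the MCMC output; note that qbeta gives an equal-tail interval, which differs slightly from a highest density interval:

```r
# Conjugacy: flat Beta(1, 1) prior + binomial likelihood
# => Beta(1 + x, 1 + n - x) posterior.
x <- 9   # successes
n <- 11  # trials
a <- 1 + x
b <- 1 + n - x

a / (a + b)          # posterior mean, 10/13 ~ 0.769
qbeta(0.025, a, b)   # lower 95% equal-tail limit, ~ 0.52
qbeta(0.975, a, b)   # upper 95% equal-tail limit, ~ 0.95
```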

The bayes.binom.test function

Having a model is fine but Bayesian First Aid is also about being able to run the model in a nice way and getting useful output. First let's find some binomial data and use it to run a standard binomial test in R. The highly cited Nature paper Poleward shifts in geographical ranges of butterfly species associated with regional warming describes how the geographical ranges of a sample of butterfly species have shifted northwards, possibly as an effect of rising global temperatures. In that paper we find the following binomial data:

[Table of binomial data from Parmesan (1999), Nature]

Here is the corresponding binomial test in R with “retracting northwards” as success and “extending southwards” as failure:

binom.test(c(9, 2))

## 
## 	Exact binomial test
## 
## data:  c(9, 2)
## number of successes = 9, number of trials = 11, p-value = 0.06543
## alternative hypothesis: true probability of success is not equal to 0.5
## 95 percent confidence interval:
##  0.4822 0.9772
## sample estimates:
## probability of success 
##                 0.8182

Not a significant result at the 5% $\alpha$ level. C'est la vie.

Fun fact: The resulting p-value is not the same as that reported in the Nature paper. On closer inspection the reported p-value seems to be the correct one, as Table 2 shows that there were actually ten species retracting northwards and not nine; "9 retracting north" was a typo and should have read "10 retracting north". I leave it as an exercise for the interested reader to find the other wrongly reported binomial test p-value in the paper. Despite the typo, I'll continue here with nine successes and two failures.
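As a quick check of the typo explanation, rerunning the test with ten northward-retracting species gives:

```r
# The exact binomial test with the corrected count: ten species
# retracting northwards, one extending southwards.
binom.test(c(10, 1))$p.value
# ~ 0.0117, clearly significant at the 5% level
```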

Now to run the Bayesian First Aid alternative you simply prepend bayes. to the function call.

library(BayesianFirstAid)
bayes.binom.test(c(9, 2))

## 
## 	Bayesian first aid binomial test
## 
## data: c(9, 2)
## number of successes = 9, number of trials = 11
## Estimated relative frequency of success:
##   0.769 
## 95% credible interval:
##   0.554 0.964 
## The relative frequency of success is more than 0.5 by a probability of 0.981 
## and less than 0.5 by a probability of 0.019

I've chosen to make the output similar to binom.test. You get to know what data you put in, you get a point estimate (here the mean of the posterior of $\theta$) and you get a credible interval. You don't get a p-value; instead you get the probability that $\theta$ is lower or higher than a comparison value, which defaults to 0.5. Looking at these probabilities and the credible interval, it seems like there is pretty good evidence that butterflies are generally moving northwards.
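Since the flat prior together with these data gives a Beta(10, 3) posterior in closed form, the reported probability that $\theta$ exceeds 0.5 can be double-checked with pbeta:

```r
# Exact posterior probability that theta is larger than the comparison
# value 0.5, under the Beta(10, 3) posterior implied by the flat prior.
pbeta(0.5, 10, 3, lower.tail = FALSE)
# ~ 0.981, in close agreement with the MCMC estimate
```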

A goal with Bayesian First Aid is complete compliance with the calling structure of the original test functions. You should be able to use the same arguments in the same way as you are used to with binom.test. For example, the call

binom.test(9, 11, conf.level = 0.8, p = 0.75, alternative = "greater")

is handled by bayes.binom.test by using an 80% credible interval and 0.75 as the comparison value. I haven't figured out anything useful to do with alternative = "greater", so that simply issues a warning:

bayes.binom.test(9, 11, conf.level = 0.8, p = 0.75, alternative = "greater")

## Warning: The argument 'alternative' is ignored by bayes.binom.test

## 
## 	Bayesian first aid binomial test
## 
## data: 9 and 11
## number of successes = 9, number of trials = 11
## Estimated relative frequency of success:
##   0.769 
## 80% credible interval:
##   0.647 0.925 
## The relative frequency of success is more than 0.75 by a probability of 0.609 
## and less than 0.75 by a probability of 0.391

If you run the same call to bayes.binom.test repeatedly you'll notice that you get slightly different values each time. This is because Markov chain Monte Carlo (MCMC) is used to approximate the posterior distribution of $\theta$. The accuracy of the MCMC approximation is governed by the n.iter argument, which defaults to 15000. This should be enough for most purposes, but if you feel it is not, just max it up!

bayes.binom.test(c(9, 2), n.iter = 999999)
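To see the Monte Carlo error directly, note that the posterior here happens to be a Beta(10, 3), so we can sample it exactly with rbeta and watch the estimate of the posterior mean stabilize as the number of draws grows (a sketch, with an arbitrary seed):

```r
set.seed(42)  # arbitrary seed, for reproducibility only
for (n_draws in c(150, 15000, 1500000)) {
  # Draw directly from the Beta(10, 3) posterior and estimate its mean.
  est <- mean(rbeta(n_draws, 10, 3))
  cat(n_draws, "draws: estimated mean =", round(est, 4), "\n")
}
# The estimates settle on the exact posterior mean 10/13 ~ 0.7692.
```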

Generic Functions

Every Bayesian First Aid test has corresponding plot, summary, diagnostics and model.code functions.

plot plots the parameters of interest and, if appropriate, a posterior predictive distribution. In the case of bayes.binom.test there is just one parameter, $\theta$. The plot of the posterior is a design by John D. Kruschke, copied from his Bayesian estimation supersedes the t test program. To get the plot, first save the output of bayes.binom.test to a variable and then plot it!

fit <- bayes.binom.test(c(9, 2))
plot(fit)

[Plot: posterior distribution of $\theta$]

summary prints out a comprehensive summary of all the parameters in the model:

summary(fit)

##   Data
## number of successes = 9, number of trials = 11
## 
##   Model parameters and generated quantities
## theta: The relative frequency of success
## x_pred: Predicted number of successes in a replication
## 
##   Measures
##         mean    sd HDIlo  HDIup %<comp %>comp
## theta  0.769 0.114 0.538  0.958  0.022  0.978
## x_pred 8.462 1.830 5.000 11.000  0.000  1.000
## 
## 'HDIlo' and 'HDIup' are the limits of a 95% HDI credible interval.
## '%<comp' and '%>comp' are the probabilities of the respective parameter being
## smaller or larger than 0.5.
## 
##   Quantiles
##        q2.5%  q25% median   q75% q97.5%
## theta  0.508 0.698  0.783  0.855  0.945
## x_pred 4.000 7.000  9.000 10.000 11.000

diagnostics prints and plots MCMC diagnostics calculated using the coda package. In the case of bayes.binom.test MCMC diagnostics are overkill, but it is still nice to get that warm fuzzy feeling you get when looking at converging MCMC chains:

diagnostics(fit)

## 
## Iterations = 1:5000
## Thinning interval = 1 
## Number of chains = 3 
## Sample size per chain = 5000 
## 
##   Diagnostic measures
##         mean    sd mcmc_se n_eff  Rhat
## theta  0.769 0.114   0.001 14795 1.000
## x_pred 8.462 1.830   0.015 14787 1.001
## 
## mcmc_se: the estimated standard error of the MCMC approximation of the mean.
## n_eff: a crude measure of effective MCMC sample size.
## Rhat: the potential scale reduction factor (at convergence, Rhat=1).
## 
##   Model parameters and generated quantities
## theta: The relative frequency of success
## x_pred: Predicted number of successes in a replication

[Plot: MCMC diagnostics for theta and x_pred]

Last but not least model.code prints out R and JAGS code that runs the model and that you can copy-n-paste directly into an R script:

model.code(fit)

## ### Model code for the Bayesian First Aid alternative to the binomial test ###
## require(rjags)
## 
## # Setting up the data
## x <- 9 
## n <- 11 
## 
## # The model string written in the JAGS language
## model_string <-"model {
##   x ~ dbinom(theta, n)
##   theta ~ dbeta(1, 1)
##   x_pred ~ dbinom(theta, n)
## }"
## 
## # Running the model
## model <- jags.model(textConnection(model_string), data = list(x = x, n = n), 
##                     n.chains = 3, n.adapt=1000)
## samples <- coda.samples(model, c("theta", "x_pred"), n.iter=5000)
## 
## # Inspecting the posterior
## plot(samples)
## summary(samples)

For example, in order to change the flat prior on $\theta$ into Jeffrey’s prior we simply change dbeta(1, 1) into dbeta(0.5, 0.5) and rerun the code:

x <- 9
n <- 11

model_string <- "model {
  x ~ dbinom(theta, n)
  theta ~ dbeta(0.5, 0.5)
  x_pred ~ dbinom(theta, n)
}"

model <- jags.model(textConnection(model_string), data = list(x = x, n = n),
                    n.chains = 3, n.adapt = 1000)

## Compiling model graph
##    Resolving undeclared variables
##    Allocating nodes
##    Graph Size: 5
## 
## Initializing model

samples <- coda.samples(model, c("theta", "x_pred"), n.iter = 5000)

summary(samples)

## 
## Iterations = 1:5000
## Thinning interval = 1 
## Number of chains = 3 
## Sample size per chain = 5000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##         Mean    SD Naive SE Time-series SE
## theta  0.792 0.113 0.000919       0.000926
## x_pred 8.717 1.791 0.014626       0.014594
## 
## 2. Quantiles for each variable:
## 
##         2.5%   25%   50%    75%  97.5%
## theta  0.536 0.723 0.809  0.877  0.961
## x_pred 5.000 8.000 9.000 10.000 11.000
