(This article was first published on **Wiekvoet**, and kindly contributed to R-bloggers)

It is well known that the binomial test never has an error of exactly 5%. You aim for at most 5%, calculate the number correct needed to get there, and end up with an error of, e.g., 2%. This is a shame, but there is no solution. However, it is also an opportunity: the 'unused' error may be employed for additional testing. For instance, in a triangle test, why not aim for, say, 30 persons and do a pre-test at 17 persons, where H0 is rejected at an error level below 1%? When H0 is not rejected, continue to 30, reject at the original 5%, and still have an overall error level of less than 5%.

### Example

As described before (blog entry), a triangle test is a sensory test where there is a chance of 1/3 to select the correct product by guessing. When the proportion correct is significantly larger than 1/3, the products are deemed different. This latter part is performed with a binomial test.

With 30 trials, the number correct needs to exceed 14 (that is, at least 15 correct), with a resulting alpha of 0.043:

    n_2 <- 30
    (critVal_2 <- qbinom(.95,n_2,1/3))
    [1] 14
    pbinom(critVal_2,n_2,1/3,lower.tail=FALSE)
    [1] 0.04348228
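
The discreteness mentioned in the introduction is easy to see by repeating this calculation over a range of panel sizes. The following is a minimal sketch; the helper name `attainedAlpha` and the range of 20 to 40 trials are choices made here for illustration and are not part of the original analysis.

```r
# Attained one-sided error level of the triangle test (guessing chance 1/3)
# when aiming for at most `alpha`. `attainedAlpha` is a hypothetical helper.
attainedAlpha <- function(n, alpha = 0.05, p0 = 1/3) {
  critVal <- qbinom(1 - alpha, n, p0)          # H0 is rejected when correct > critVal
  pbinom(critVal, n, p0, lower.tail = FALSE)   # P(correct > critVal | H0)
}

n <- 20:40
res <- data.frame(n = n,
                  critVal = qbinom(0.95, n, 1/3),
                  alpha = attainedAlpha(n))
print(res)
```

The `alpha` column never exceeds 0.05 and jumps up and down as the critical value steps up with the number of trials; the gap to 0.05 is the 'unused' error.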

With 17 trials:

    n_1 <- 17
    (critVal_1 <- qbinom(.98,n_1,1/3))
    [1] 10
    pbinom(critVal_1,n_1,1/3,lower.tail=FALSE)
    [1] 0.00800819

As can be seen, the two error levels add up to more than 0.05 (0.0435 + 0.0080 = 0.0515). However, they don't need to be added. The only way to get to the 30-trial situation is to have at most 10 correct in the first phase, since 11 or more correct means H0 is rejected there. The chance of rejecting in the second phase has to take this into account, which can be calculated easily with a few functions. For each number correct in the first phase that leads to continuation, the chance of getting enough additional correct answers in the second phase to exceed the critical value is calculated. These chances are multiplied by the chance of getting the corresponding number correct in the first phase and summed. This is done in the functions condPval_S1 and condPval.
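
The calculation just described can also be written as a formula. The notation is introduced here for illustration and does not appear in the original post: $X_1$ is the number correct in the first $n_1$ trials, $X_2$ the number correct in the $n_2 - n_1$ additional trials, both binomial with success chance 1/3 under H0, and $c_1$, $c_2$ are the two critical values, where H0 is rejected when a count exceeds its critical value.

```latex
\alpha_{\text{overall}}
  = P(X_1 > c_1)
  + \sum_{k=0}^{c_1} P(X_1 = k)\, P(X_2 > c_2 - k),
\qquad
X_1 \sim \mathrm{Bin}(n_1, \tfrac{1}{3}),\quad
X_2 \sim \mathrm{Bin}(n_2 - n_1, \tfrac{1}{3})
```

In the example, $n_1 = 17$, $c_1 = 10$, $n_2 = 30$ and $c_2 = 14$; the first term is the 0.0080 found above and the sum is the part that the two functions compute.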

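    # Chance to exceed critVal_2 in total, given nFound_1 correct in phase 1
    condPval_S1 <- function(nFound_1,n_1,n_2,critVal_2) {
      nAdditional <- n_2-n_1
      # too few additional trials left to exceed the critical value
      if (nAdditional < critVal_2-nFound_1+1) 0
      else {
        pbinom(critVal_2-nFound_1,
               nAdditional,1/3,lower.tail=FALSE)
      }
    }

    # Chance of not rejecting in phase 1 and rejecting after n_2 trials
    condPval <- function(n_1,critVal_1,n_2,alpha2=0.05) {
      critVal_2 <- qbinom(1-alpha2,n_2,1/3)
      # numbers correct in phase 1 for which testing continues
      nFound <- 0:critVal_1
      sa <- sapply(nFound,function(nFound_1)
        dbinom(nFound_1,n_1,1/3)*
          condPval_S1(nFound_1,n_1,n_2,critVal_2))
      sum(sa)
    }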

    # chance of rejecting H0 in the first phase
    p_H0_1 <- pbinom(critVal_1,n_1,1/3,lower.tail=FALSE)
    condPval(n_1,critVal_1,n_2)+p_H0_1
    [1] 0.04568412

The total error level is slightly less than 5%. Hence the extra test at 17 trials can be added while still keeping to the promised 5% level.
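
As an independent check of this calculation, the two-phase procedure can be simulated under H0. This is a sketch under assumptions made here (the seed, the number of simulations and the variable names are arbitrary choices); the result is a Monte Carlo estimate, so it will agree with the exact value only approximately.

```r
set.seed(1)
nSim <- 200000
n_1 <- 17
n_2 <- 30
critVal_1 <- qbinom(.98, n_1, 1/3)   # critical value at the intermediate test
critVal_2 <- qbinom(.95, n_2, 1/3)   # critical value after all trials

# number correct under H0 (pure guessing) in phase 1 and in the additional trials
x1 <- rbinom(nSim, n_1, 1/3)
x2 <- rbinom(nSim, n_2 - n_1, 1/3)

# H0 is rejected at the intermediate test or, when testing continues, at the end
rejectInterim <- x1 > critVal_1
rejectFinal   <- !rejectInterim & (x1 + x2 > critVal_2)
simAlpha <- mean(rejectInterim | rejectFinal)
simAlpha
```

With this many simulations the estimate should land close to the 0.0457 computed above, and clearly below the 0.0515 obtained by simply adding the two error levels.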

#### A bit more extensive

In practice, not everybody asked to come and do the triangle test will be there to taste. What if there are a few trials short or extra? Obviously this can be calculated as well, and the sapply function helps greatly. For all these cases the overall level is under 5%.

    range(sapply(25:35,function(n_2)
      condPval(n_1,critVal_1,n_2)+p_H0_1 ))
    [1] 0.02484166 0.04711329

This can also be put into a function that gives a bit more detail:

    McondPval <- function(n_1,
                          n_2_min = round(n_1*1.5),
                          n_2_max = 3*n_1) {
      critVal_1 <- qbinom(.98,n_1,1/3)
      p_H0_1 <- pbinom(critVal_1,n_1,1/3,lower.tail=FALSE)
      n_2 <- n_2_min:n_2_max
      alpha <- sapply(n_2,function(n_2)
        condPval(n_1,critVal_1,n_2)+p_H0_1 )
      critVal_2 <- qbinom(.95,n_2,1/3)
      alpha_orig <- pbinom(critVal_2,n_2,1/3,lower.tail=FALSE)
      return(data.frame(n_1,n_2,alpha,alpha_orig))
    }

    McondPval(17,25,35)
       n_1 n_2      alpha alpha_orig
    1   17  25 0.04277008 0.04151368
    2   17  26 0.02729441 0.02475400
    3   17  27 0.03792652 0.03592712
    4   17  28 0.02484166 0.02156168
    5   17  29 0.03384347 0.03113864
    6   17  30 0.04568412 0.04348228
    7   17  31 0.03037324 0.02702409
    8   17  32 0.04047668 0.03765334
    9   17  33 0.02740706 0.02348101
    10  17  34 0.03605287 0.03265134
    11  17  35 0.04711329 0.04419916

Unfortunately, it is not always this nice. With these settings and 16 trials in the first phase, it may go wrong. Look at 30 and 35 trials in total: at 30 trials the overall level is just over 5% (0.0501), while at 35 trials it is clearly over it (0.0519). Either the test in phase 1 should be more stringent, or it should be ensured that testing does not end at 30 or 35 trials. It does not matter which of these is chosen, but we have to choose. Ideally, the level of testing at phase 1 should be determined prior to knowing how many correct there are.

    McondPval(16,25,35)
       n_1 n_2      alpha alpha_orig
    1   16  25 0.04648649 0.04151368
    2   16  26 0.03245610 0.02475400
    3   16  27 0.04236557 0.03592712
    4   16  28 0.03048510 0.02156168
    5   16  29 0.03885603 0.03113864
    6   16  30 0.05006582 0.04348228
    7   16  31 0.03584845 0.02702409
    8   16  32 0.04538488 0.03765334
    9   16  33 0.03326027 0.02348101
    10  16  34 0.04140208 0.03265134
    11  16  35 0.05194322 0.04419916
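
The first option, a more stringent test in phase 1, can be explored with a small variation on the calculation. The sketch below is an illustration and not the author's code: the function `twoStageAlpha` and its argument `alpha1` (the level aimed for in phase 1) are hypothetical names, and the choice of 0.01 as the stricter level is an assumption. The background is that with 16 trials the .98 quantile gives a critical value of 9 and a phase 1 error of about 0.016, roughly twice the 0.008 found with 17 trials.

```r
# Overall error level of the two-phase triangle test, with the level aimed
# for in phase 1 as an explicit argument (hypothetical helper).
twoStageAlpha <- function(n_1, n_2, alpha1 = 0.02, alpha2 = 0.05, p0 = 1/3) {
  critVal_1 <- qbinom(1 - alpha1, n_1, p0)   # reject in phase 1 if correct > critVal_1
  critVal_2 <- qbinom(1 - alpha2, n_2, p0)   # reject at the end if total > critVal_2
  k <- 0:critVal_1                           # phase 1 outcomes that continue
  pReject1 <- pbinom(critVal_1, n_1, p0, lower.tail = FALSE)
  # pbinom() returns 0 when more correct are needed than trials remain
  pReject2 <- sum(dbinom(k, n_1, p0) *
                    pbinom(critVal_2 - k, n_2 - n_1, p0, lower.tail = FALSE))
  pReject1 + pReject2
}

# level used in the text (.98 quantile) versus a stricter one (.99 quantile)
alpha_98 <- sapply(25:35, function(n_2) twoStageAlpha(16, n_2, alpha1 = 0.02))
alpha_99 <- sapply(25:35, function(n_2) twoStageAlpha(16, n_2, alpha1 = 0.01))
print(data.frame(n_2 = 25:35, alpha_98, alpha_99))
```

With the stricter phase 1 level the critical value at 16 trials becomes 10, and the overall level stays below 5% for every total between 25 and 35 trials. The price is that a difference is less likely to be declared at the intermediate step.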

### Conclusion

With a few simple functions and a bit of care, an extra hypothesis test can be added during a triangle test. This gives the opportunity to declare differences at an intermediate step while retaining the original error level.
