Abstract
This article provide a brief background about power and sample size analysis. Then, power and sample size analysis is computed for the Z test.
Next articles will describe power and sample size analysis for:
- one sample and two samples t test;,
- p test, chi-square test, correlation;
- one-way ANOVA;
- DOE
.
Finally, a PDF article showing both the underlying methodology and the R code here provided, will be published.
Background
Power and sample size analysis are important tools for assessing the ability of a statistical test to detect when a null hypothesis is false, and for deciding what sample size is required for having a reasonable chance to reject a false null hypothesis.
The following four quantities have an intimate relationship:
- sample size
- effect size
- significance level = P(Type I error) = probability of finding an effect that is not there
- power = 1 - P(Type II error) = probability of finding an effect that is there
Given any three, we can determine the fourth.
Z test
The formula for the power computation can be implemented in R, using a function like the following:
powerZtest = function(alpha = 0.05, sigma, n, delta){
zcr = qnorm(p = 1-alpha, mean = 0, sd = 1)
s = sigma/sqrt(n)
power = 1 - pnorm(q = zcr, mean = (delta/s), sd = 1)
return(power)
} |
In the same way, the function to compute the sample size can be built.
sampleSizeZtest = function(alpha = 0.05, sigma, power, delta){
zcra=qnorm(p = 1-alpha, mean = 0, sd=1)
zcrb=qnorm(p = power, mean = 0, sd = 1)
n = round((((zcra+zcrb)*sigma)/delta)^2)
return(n)
} |
The above code is provided for didactic purpose. In fact, the pwr package provide a function to perform power and sample size analysis.
install.packages("pwr")
library(pwr) |
The function pwr.norm.test() computes parameters for the Z test. It accepts the four parameters see above, one of them passed as NULL. The parameter passed as NULL is determined from the others.
Some examples
Power at
for
against
.
,
, 
sigma = 15 h0 = 100 ha = 105 |
This is the result with the self-made function:
> powerZtest(n = 20, sigma = sigma, delta = (ha-h0)) [1] 0.438749 |
And here the same with the pwr.norm.test() function:
> d = (ha - h0)/sigma
> pwr.norm.test(d = d, n = 20, sig.level = 0.05, alternative = "greater")
Mean power calculation for normal distribution with known variance
d = 0.3333333
n = 20
sig.level = 0.05
power = 0.438749
alternative = greater |
The sample size of the test for power equal to 0.80 can be computed using the self-made function
> sampleSizeZtest(sigma = sigma, power = 0.8, delta = (ha-h0)) [1] 56 |
or with the pwr.norm.test() function:
> pwr.norm.test(d = d, power = 0.8, sig.level = 0.05, alternative = "greater")
Mean power calculation for normal distribution with known variance
d = 0.3333333
n = 55.64302
sig.level = 0.05
power = 0.8
alternative = greater |
The power function can be drawn:
ha = seq(95, 125, l = 100) pwrTest = pwr.norm.test(d = d, n = 20, sig.level = 0.05, alternative = "greater")$power plot(d, pwrTest, type = "l", ylim = c(0, 1)) |
View (and download) the full code:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 | ### Self-made functions to perform power and sample size analysis
powerZtest = function(alpha = 0.05, sigma, n, delta){
zcr = qnorm(p = 1-alpha, mean = 0, sd = 1)
s = sigma/sqrt(n)
power = 1 - pnorm(q = zcr, mean = (delta/s), sd = 1)
return(power)
}
sampleSizeZtest = function(alpha = 0.05, sigma, power, delta){
zcra=qnorm(p = 1-alpha, mean = 0, sd=1)
zcrb=qnorm(p = power, mean = 0, sd = 1)
n = round((((zcra+zcrb)*sigma)/delta)^2)
return(n)
}
### Load pwr package to perform power and sample size analysis
library(pwr)
### Data
sigma = 15
h0 = 100
ha = 105
### Power analysis
# Using the self-made function
powerZtest(n = 20, sigma = sigma, delta = (ha-h0))
# Using the pwr package
pwr.norm.test(d = (ha - h0)/sigma, n = 20, sig.level = 0.05, alternative = "greater")
### Sample size analysis
# Using the self-made function
sampleSizeZtest(sigma = sigma, power = 0.8, delta = (ha-h0))
# Using the pwr package
pwr.norm.test(d = (ha - h0)/sigma, power = 0.8, sig.level = 0.05, alternative = "greater")
### Power function for the two-sided alternative
ha = seq(95, 125, l = 100)
d = (ha - h0)/sigma
pwrTest = pwr.norm.test(d = d, n = 20, sig.level = 0.05, alternative = "greater")$power
plot(d, pwrTest, type = "l", ylim = c(0, 1)) |
R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series,ecdf, trading) and more...


Zero Inflated Models and Generalized Linear Mixed Models with R.
Zuur, Saveliev, Ieno (2012).