Power and Sample Size Analysis: Z test

October 17, 2012
By

(This article was first published on Milano R net, and kindly contributed to R-bloggers)

Abstract

This article provide a brief background about power and sample size analysis. Then, power and sample size analysis is computed for the Z test.
Next articles will describe power and sample size analysis for:

  • one sample and two samples t test;,
  • p test, chi-square test, correlation;
  • one-way ANOVA;
  • DOE 2^k.

Finally, a PDF article showing both the underlying methodology and the R code here provided, will be published.

Background

Power and sample size analysis are important tools for assessing the ability of a statistical test to detect when a null hypothesis is false, and for deciding what sample size is required for having a reasonable chance to reject a false null hypothesis.

The following four quantities have an intimate relationship:

  1. sample size
  2. effect size
  3. significance level = P(Type I error) = probability of finding an effect that is not there
  4. power = 1 - P(Type II error) = probability of finding an effect that is there

Given any three, we can determine the fourth.

Z test

The formula for the power computation can be implemented in R, using a function like the following:

powerZtest = function(alpha = 0.05, sigma, n, delta){
  zcr = qnorm(p = 1-alpha, mean = 0, sd = 1)
  s = sigma/sqrt(n)
  power = 1 - pnorm(q = zcr, mean = (delta/s), sd = 1)
  return(power)
}

In the same way, the function to compute the sample size can be built.

sampleSizeZtest = function(alpha = 0.05, sigma, power, delta){
  zcra=qnorm(p = 1-alpha, mean = 0, sd=1)
  zcrb=qnorm(p = power, mean = 0, sd = 1)
  n = round((((zcra+zcrb)*sigma)/delta)^2)
  return(n)
}

The above code is provided for didactic purpose. In fact, the pwr package provide a function to perform power and sample size analysis.

install.packages("pwr")
library(pwr)

The function pwr.norm.test() computes parameters for the Z test. It accepts the four parameters see above, one of them passed as NULL. The parameter passed as NULL is determined from the others.

Some examples

Power at \mu = 105 for H0: \mu = 100 against H1: \mu>100.
\sigma = 15, n = 20, \alpha = 0.05

sigma = 15
h0 = 100
ha = 105

This is the result with the self-made function:

> powerZtest(n = 20, sigma = sigma, delta = (ha-h0))
[1] 0.438749

And here the same with the pwr.norm.test() function:

> d = (ha - h0)/sigma
> pwr.norm.test(d = d, n = 20, sig.level = 0.05, alternative = "greater")
 
     Mean power calculation for normal distribution with known variance 
 
              d = 0.3333333
              n = 20
      sig.level = 0.05
          power = 0.438749
    alternative = greater

The sample size of the test for power equal to 0.80 can be computed using the self-made function

> sampleSizeZtest(sigma = sigma, power = 0.8, delta = (ha-h0))
[1] 56

or with the pwr.norm.test() function:

> pwr.norm.test(d = d, power = 0.8, sig.level = 0.05, alternative = "greater")
 
     Mean power calculation for normal distribution with known variance 
 
              d = 0.3333333
              n = 55.64302
      sig.level = 0.05
          power = 0.8
    alternative = greater

The power function can be drawn:

ha = seq(95, 125, l = 100)
pwrTest = pwr.norm.test(d = d, n = 20, sig.level = 0.05, alternative = "greater")$power
plot(d, pwrTest, type = "l", ylim = c(0, 1))

View (and download) the full code:

?Download powerZtest.R
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
### Self-made functions to perform power and sample size analysis
powerZtest = function(alpha = 0.05, sigma, n, delta){
  zcr = qnorm(p = 1-alpha, mean = 0, sd = 1)
  s = sigma/sqrt(n)
  power = 1 - pnorm(q = zcr, mean = (delta/s), sd = 1)
  return(power)
}
 
sampleSizeZtest = function(alpha = 0.05, sigma, power, delta){
  zcra=qnorm(p = 1-alpha, mean = 0, sd=1)
  zcrb=qnorm(p = power, mean = 0, sd = 1)
  n = round((((zcra+zcrb)*sigma)/delta)^2)
  return(n)
}
 
### Load pwr package to perform power and sample size analysis
library(pwr)
 
### Data
sigma = 15
h0 = 100
ha = 105
 
### Power analysis
# Using the self-made function
powerZtest(n = 20, sigma = sigma, delta = (ha-h0))
# Using the pwr package
pwr.norm.test(d = (ha - h0)/sigma, n = 20, sig.level = 0.05, alternative = "greater")
 
### Sample size analysis
# Using the self-made function
sampleSizeZtest(sigma = sigma, power = 0.8, delta = (ha-h0))
# Using the pwr package
pwr.norm.test(d = (ha - h0)/sigma, power = 0.8, sig.level = 0.05, alternative = "greater")
 
### Power function for the two-sided alternative
ha = seq(95, 125, l = 100)
d = (ha - h0)/sigma
pwrTest = pwr.norm.test(d = d, n = 20, sig.level = 0.05, alternative = "greater")$power
plot(d, pwrTest, type = "l", ylim = c(0, 1))

To leave a comment for the author, please follow the link and comment on his blog: Milano R net.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: ,

Comments are closed.