# Creating a typical textbook illustration of statistical power using either ggplot or base graphics

May 26, 2013
By

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

A common way of illustrating the idea behind statistical power in null hypothesis significance testing, is by plotting the sampling distributions of the null hypothesis ( $H_0$) and the alternative hypothesis ( $H_A$). Typically, these illustrations highlight the regions that correspond to making a type II error ( $\beta$), type I error ( $\alpha$) and correctly rejecting the null hypothesis (i.e. the test’s power; $1 - \beta$).

In this post I will show how to create such “power plots” using R. Typically, I prefer to use ggplot for plotting, but tasks such as this is one of the few times were I think R’s base graphics have some merit—especially for creating black and white plots, since ggplot does not support using patterns. Thus, I will present code both for ggplot and base graphics.

Creating these plots is pretty straight forwards. You only need to be vaguely familiar with the mechanics of plotting polygons. For instance, say we want to plot a triangle with the following coordinates.

       (2,3)
/\
/  \
/    \
/      \
(1,2) ------ (3,2)


Then we just specify x and y as vectors, like this:

# ggplot polygon example
ggplot(data.frame(x=c(1,2,3),y=c(2,3,2)), aes(x,y)) + geom_polygon() So, let us begin by creating the data for the two distributions and three polygons that we will need.

library(ggplot2)
library(grid) # need for arrow()
m1 <- 0  # mu H0
sd1 <- 1.5 # sigma H0
m2 <- 3.5 # mu HA
sd2 <- 1.5 # sigma HA

z_crit <- qnorm(1-(0.05/2), m1, sd1)

# set length of tails
min1 <- m1-sd1*4
max1 <- m1+sd1*4
min2 <- m2-sd2*4
max2 <- m2+sd2*4
# create x sequence
x <- seq(min(min1,min2), max(max1, max2), .01)
# generate normal dist #1
y1 <- dnorm(x, m1, sd1)
# put in data frame
df1 <- data.frame("x" = x, "y" = y1)
# generate normal dist #2
y2 <- dnorm(x, m2, sd2)
# put in data frame
df2 <- data.frame("x" = x, "y" = y2)

# Alpha polygon
y.poly <- pmin(y1,y2)
poly1 <- data.frame(x=x, y=y.poly)
poly1 <- poly1[poly1$x >= z_crit, ] poly1<-rbind(poly1, c(z_crit, 0)) # add lower-left corner # Beta polygon poly2 <- df2 poly2 <- poly2[poly2$x <= z_crit,]
poly2<-rbind(poly2, c(z_crit, 0))  # add lower-left corner

# power polygon; 1-beta
poly3 <- df2
poly3 <- poly3[poly3$x >= z_crit,] poly3 <-rbind(poly3, c(z_crit, 0)) # add lower-left corner # combine polygons. poly1$id <- 3 # alpha, give it the highest number to make it the top layer
poly2$id <- 2 # beta poly3$id <- 1 # power; 1 - beta
poly <- rbind(poly1, poly2, poly3)
poly$id <- factor(poly$id,  labels=c("power","beta","alpha"))


Now that we have all the data that we need, let us create the first plot using ggplot. The annotation is set manually, so it will be a bit tedious to change these plots.


# plot with ggplot2
ggplot(poly, aes(x,y, fill=id, group=id)) +
geom_polygon(show_guide=F, alpha=I(8/10)) +
# add line for treatment group
geom_line(data=df1, aes(x,y, color="H0", group=NULL, fill=NULL), size=1.5, show_guide=F) +
# add line for treatment group. These lines could be combined into one dataframe.
geom_line(data=df2, aes(color="HA", group=NULL, fill=NULL),size=1.5, show_guide=F) +
geom_vline(xintercept = z_crit, size=1, linetype="dashed") +
# change colors
scale_color_manual("Group",
values= c("HA" = "#981e0b","H0" = "black")) +
scale_fill_manual("test", values= c("alpha" = "#0d6374","beta" = "#be805e","power"="#7cecee")) +
# beta arrow
annotate("segment", x=0.1, y=0.045, xend=1.3, yend=0.01, arrow = arrow(length = unit(0.3, "cm")), size=1) +
annotate("text", label="beta", x=0, y=0.05, parse=T, size=8) +
# alpha arrow
annotate("segment", x=4, y=0.043, xend=3.4, yend=0.01, arrow = arrow(length = unit(0.3, "cm")), size=1) +
annotate("text", label="frac(alpha,2)", x=4.2, y=0.05, parse=T, size=8) +
# power arrow
annotate("segment", x=6, y=0.2, xend=4.5, yend=0.15, arrow = arrow(length = unit(0.3, "cm")), size=1) +
annotate("text", label="1-beta", x=6.1, y=0.21, parse=T, size=8) +
# H_0 title
annotate("text", label="H", x=m1, y=0.28, parse=T, size=8) +
# H_a title
annotate("text", label="H[a]", x=m2, y=0.28, parse=T, size=8) +
ggtitle("Statistical Power Plots, Textbook-style") +
# remove some elements
theme(panel.grid.minor = element_blank(),
panel.grid.major = element_blank(),
panel.background = element_blank(),
plot.background = element_rect(fill="#f9f0ea"),
panel.border = element_blank(),
axis.line = element_blank(),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks = element_blank(),
axis.title.x = element_blank(),
axis.title.y = element_blank(),
plot.title = element_text(size=22))

ggsave("stat_power_ggplot.png", height=8, width=13, dpi=72) Now, if we want a more “classical looking” black and white-plot, we need to use base graphics.

# example with base graphics
png("stat_power_base.png", width=900, height=600, units="px") # save as png
# reset
plot.new()
# set window size
plot.window(xlim=range(x), ylim=c(-0.01,0.3))
polygon(poly3,  density=10) # 1-beta
polygon(poly2, density=3, angle=-45, lty="dashed") # beta
polygon(poly1, density=10, angle=0) # alpha
lines(df2,lwd=3)
lines(df1,lwd=3)
### annotations
# h_0 title
text(m1, 0.3, expression(H), cex=1.5)
# h_a title
text(m2, 0.3, expression(H[a]), cex=1.5)
# beta annotation
arrows(x0=-1, y0=0.045, x1=1, y1=0.01,lwd=2,length=0.15)
text(-1.2, 0.045, expression(beta), cex=1.5)
# beta annotation
arrows(x0=4, y0=-0.01, x1=3.5, y1=0.01, lwd=2, length=0.15)
text(x=4.1, y=-0.015, expression(alpha/2), cex=1.5)
# 1-beta
arrows(x0=6, y0=0.15, x1=5, y1=0.1, lwd=2,length=0.15)
text(x=7, y=0.155, expression(paste(1-beta, "  (\"power\")")), cex=1.5)
# show z_crit; start of rejection region
abline(v=z_crit)
abline(h=0)
title("Statistical Power")
dev.off() R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.