**Computational Ecology**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

The power of a statistical test is the probability that a null

hypothesis will be rejected when the alternative hypothesis is true.

In lay terms, power is your ability to refine or “prove” your

expectations from the data you collect. The most frequent motivation

for estimating the power of a study is to figure out what sample size

will be needed to observe a treatment effect. Given a set of pilot

data or some other estimate of the variation in a sample, we can use

power analysis to inform how much additional data we should collect.

I recently did a power analysis on a set of pilot data for a long-term

monitoring study of the US National Park Service. I thought I would

share some of the things I learned and a bit of R code for others that

might need to do something like this. If you aren’t into power

analysis, the code below may still be useful as examples of how to use

the error handling functions in R (`withCallingHandlers`

,

`withRestarts`

), parallel programming using the `snow`

package, and linear mixed effect regression using `nlme`

. If you

have any suggestions for improvement or if I got something wrong on

the analysis, I’d love to hear from you.

## 1 The Study

The study system was cobblebars along the Cumberland river in Big

South Fork National Park (Kentucky and Tennessee, United States).

Cobblebars are typically dominated by grassy vegetation that include

disjunct tall-grass prairie species. It is hypothesized that woody

species will encroach onto cobblebars if they are not seasonally

scoured by floods. The purpose of the NPS sampling was to observe

changes in woody cover through time. The study design consisted of

two-stages of clustering: the first being cobblebars, and the second

being transects within cobblebars. The response variable was the

percentage of the transect that was woody vegetation. Because of

the clustered design, the inferential model for this study design

has mixed-effects. I used a simple varying intercept

model:

where *y* is the percent of each transect in woody vegetation sampled

*n* times within *J* cobblebars, each with *K* transects. The

parameter of inference for the purpose of monitoring change in woody

vegetation through time is *β*, the rate at which cover changes as

a function of time. *α*, *γ*, *σ ^{2}_{γ}*, and

*σ*are hyper-parameters that describe the hierarchical

^{2}_{y}variance structure inherent in the clustered sampling design.

Below is the function code used I used to regress the pilot data. It

should be noted that with this function you can log or logit transform

the response variable (percentage of transect that is woody). I put

this in because the responses are proportions (0,1) and errors should

technically follow a beta-distribution. Log and logit transforms with

Gaussian errors could approximate this. I ran all the models with

transformed and untransformed response, and the results did not vary

at all. So, I stuck with untransformed responses:

Model <- function(x = cobblebars, type = c("normal","log","logit")){ ## Transforms if (type[1] == "log") x$prop.woody <- log(x$prop.woody) else if (type[1] == "logit") x$prop.woody <- log(x$prop.woody / (1 - x$prop.woody)) mod <- lme(prop.woody ~ year, data = x, random = ~ 1 | cobblebar/transect, na.action = na.omit, control = lmeControl(opt = "optim", maxIter = 800, msMaxIter = 800) ) mod$type <- type[1] return(mod) }

Here are the results from this regression of the pilot data:

Linear mixed-effects model fit by REML Data: x AIC BIC logLik -134.4319 -124.1297 72.21595 Random effects: Formula: ~1 | cobblebar (Intercept) StdDev: 0.03668416 Formula: ~1 | transect %in% cobblebar (Intercept) Residual StdDev: 0.02625062 0.05663784 Fixed effects: prop.woody ~ year Value Std.Error DF t-value p-value (Intercept) 0.12966667 0.01881983 29 6.889896 0.0000 year -0.00704598 0.01462383 29 -0.481815 0.6336 Correlation: (Intr) year -0.389 Number of Observations: 60 Number of Groups: cobblebar transect %in% cobblebar 6 30

## 2 We don’t learn about power analysis and complex models

When I decided upon the inferential model the first thing that

occurred to me was that I never learned in any statistics course I

had taken how to do such a power analysis on a multi-level model.

I’ve taken more statistics courses than I’d like to count and taught

my own statistics courses for undergrads and graduate students, and

the only exposure to power analysis that I had was in the context of

simple t-tests or ANOVA. You learn about it in your first 2

statistics courses, then it rarely if ever comes up again until you

actually need it.

I was, however, able to find a great resource on power analysis from

a Bayesian perspective in the excellent book “Data Analysis Using

Regression and Multilevel/Hierarchical Models” by Andrew Gelman and

Jennifer Hill. Andrew Gelman has thought and debated about power

analysis and you can get more from his blog. The approach in the

book is a simulation-based one and I have adopted it for this

analysis.

## 3 Analysis Procedure

For the current analysis we needed to know three things: effect

size, sample size, and estimates of population variance. We set

effect size beforehand. In this context, the parameter of interest

is the rate of change in woody cover through time *β*, and

effect size is simply how large or small a value of *β* you want

to distinguish with a regression. Sample size is also set *a priori*. In the analysis we want to vary sample size by varying the

number of cobblebars, the number of transects per cobblebar or the

number of years the study is conducted.

The population variance cannot be known precisely, and this is where

the pilot data come in. By regressing the pilot data using the

model we can obtain estimates of all the different components of the

variance (cobblebars, transects within cobblebars, and the residual

variance). Below is the R function that will return all the

hyperparameters (and *β*) from the regression:

GetHyperparam<-function(x,b=NULL){ ## Get the hyperparameters from the mixed effect model fe <- fixef(x) if(is.null(b)) b<-fe[2] # use the data effect size if not supplied mu.a <- fe[1] vc <- VarCorr(x) sigma.y <- as.numeric(vc[5, 2]) # Residual StdDev sigma.a <- as.numeric(vc[2, 2]) # Cobblebar StdDev sigma.g <- as.numeric(vc[4, 2]) # Cobblebar:transect StdDev hp<-c(b, mu.a, sigma.y, sigma.a, sigma.g) names(hp)<-c("b", "mu.a", "sigma.y", "sigma.a", "sigma.g") return(hp) }

To calculate power we to regress the simulated data in the same way we

did the pilot data, and check for a significant *β*. Since

optimization is done using numeric methods there is always the chance

that the optimization will not work. So, we make sure the regression

on the fake data catches and recovers from all errors. The solution

for error recovery is to simply try the regression on a new set of

fake data. This function is a pretty good example of using the R

error handling function `withCallingHandlers`

and

`withRestarts`

.

fakeModWithRestarts <- function(m.o, n = 100, ...){ ## A Fake Model withCallingHandlers({ i <- 0 mod <- NULL while (i < n & is.null(mod)){ mod <- withRestarts({ f <- fake(m.orig = m.o, transform = F, ...) return(update(m.o, data = f)) }, rs = function(){ i <<- i + 1 return(NULL) }) } if(is.null(mod)) warning("ExceededIterations") return(mod) }, error = function(e){ invokeRestart("rs") }, warning = function(w){ if(w$message == "ExceededIterations") cat("\n", w$message, "\n") else invokeRestart("rs") }) }

To calculate the power of a particular design we run

`fakeModWithRestarts`

1000 times and look at the proportion of

significant *β* values:

dt.power <- function (m, n.sims = 1000, alpha=0.05, ...){ ## Calculate power for a particular sampling design signif<-rep(NA, n.sims) for(i in 1:n.sims){ lme.power <- fakeModWithRestarts(m.o = m, ...) if(!is.null(lme.power)) signif[i] <- summary(lme.power)$tTable[2, 5] < alpha } power <- mean(signif, na.rm = T) return(power) }

Finally, we want to perform this analysis on many different sampling

designs. In my case I did all combinations of set of effect sizes,

cobblebars, transects, and years. So, I generated the appropriate designs:

factoredDesign <- function(Elevs = 0.2/c(1,5,10,20), Nlevs = seq(2, 10, by = 2), Jlevs = seq(4, 10, by = 2), Klevs = c(3, 5, 7), ...){ ## Generates factored series of sampling designs for simulation ## of data that follow a particular model. ## Inputs: ## Elevs - vector of effect sizes for the slope parameter. ## Nlevs - vector of number of years to sample. ## Jlevs - vector of number of cobblebars to sample. ## Klevs - vector of number of transects to sample. ## Results: ## Data frame with where columns are the factors and ## rows are the designs. # Level lengths lE <- length(Elevs) lN <- length(Nlevs) lJ <- length(Jlevs) lK <- length(Klevs) # Generate repeated vectors for each factor E <- rep(Elevs, each = lN*lJ*lK) N <- rep(rep(Nlevs, each = lJ*lK), times = lE) J <- rep(rep(Jlevs, each = lK), times = lE*lN) K <- rep(Klevs, times = lE*lN*lJ) return(data.frame(E, N, J, K)) }

Once we know our effect sizes, the different sample sizes we want,

and the estimates of population variance we can generate simulated

dataset that are similar to the pilot data. To calculate power we

simply simulate a large number of dataset and calculate the

proportion of slopes, *β* that are significantly different from

zero (p-value < 0.05). This procedure is repeated for all the

effect sizes and sample sizes of interest. Here is the code for

generating a simulated dataset. It also does the work of doing the

inverse transform of the response variables if necessary.

fake <- function(N = 2, J = 6, K = 5, b = NULL, m.orig = mod, transform = TRUE, ...){ ## Simulated Data for power analysis ## N = Number of years ## J = Number of cobblebars ## K = Number of transects within cobblebars year <- rep(0:(N-1), each = J*K) cobblebar <- factor(rep(rep(1:J, each = K), times = N)) transect <- factor(rep(1:K, times = N*J)) ## Simulated parameters hp<-GetHyperparam(x=m.orig) if(is.null(b)) b <- hp['b'] g <- rnorm(J*K, 0, hp['sigma.g']) a <- rnorm(J*K, hp['mu.a'] + g, hp['sigma.a']) ## Simulated responses eta <- rnorm(J*K*N, a + b * year, hp['sigma.y']) if (transform){ if (m.orig$type == "normal"){ y <- eta y[y > 1] <- 1 # Fix any boundary problems. y[y < 0] <- 0 } else if (m.orig$type == "log"){ y <- exp(eta) y[y > 1] <- 1 } else if (m.orig$type == "logit") y <- exp(eta) / (1 + exp(eta)) } else{ y <- eta } return(data.frame(prop.woody = y, year, transect, cobblebar)) }

Then I performed the power calculations on each of these designs. This

could take a long time, so I set this procedure to use parallel processing

if needed. Note that I had to re-~source~ the file with all the

necessary functions for each processor.

powerAnalysis <- function(parallel = T, ...){ ## Full Power Analysis ## Parallel if(parallel){ closeAllConnections() cl <- makeCluster(7, type = "SOCK") on.exit(closeAllConnections()) clusterEvalQ(cl, source("cobblebars2.r")) } ## The simulations dat <- factoredDesign(...) if (parallel){ dat$power <- parRapply(cl, dat, function(x,...){ dt.power(N = x[2], J = x[3], K = x[4], b = x[1], ...) }, ...) } else { dat$power <- apply(dat, 1, function(x, ...){ dt.power(N = x[2], J = x[3], K = x[4], b = x[1], ...) }, ...) } return(dat) }

The output of the `powerAnalysis`

function is a data frame with

columns for the power and all the sample design settings. So, I wrote

a custom plotting function for this data frame:

plotPower <- function(dt){ xyplot(power~N|J*K, data = dt, groups = E, panel = function(...){panel.xyplot(...) panel.abline(h = 0.8, lty = 2)}, type = c("p", "l"), xlab = "sampling years", ylab = "power", strip = strip.custom(var.name = c("C", "T"), strip.levels = c(T, T)), auto.key = T ) }

Below is the figure for the cobblebar power analysis. I won’t go into

detail on what the results mean since I am concerned here with

illustrating the technique and the R code. Obviously, as the number of

cobblebars and transects per year increase, so does power. And, as

the effect size increases, observing it with a test is easier.

Date: 2009-09-18 Fri

HTML generated by org-mode 6.30trans in emacs 22

**leave a comment**for the author, please follow the link and comment on their blog:

**Computational Ecology**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.