# Effect of sample size on the accuracy of Cohen’s d estimates (95 % CI)

First published on **R Psychologist » R** and kindly contributed to R-bloggers.


## Introduction

I’ve been incredibly busy the last month; amongst other things, I’ve moved about 400 miles from Umeå, in the north of Sweden, to Stockholm. However, I’ve been working a lot with R, especially with power analysis, both via Monte Carlo simulations and via analytical approaches. I will not write about power analysis here, but about a closely related concept: sample size planning for *accuracy in parameter estimation (AIPE)*. Whereas traditional power analysis is used to plan for a sample size large enough to reject the null hypothesis with the desired power at the chosen alpha level, sample size planning for AIPE is used to plan for a desired width of the CI. AIPE functions are implemented in the MBESS package by Kelley and Lai; if you’re interested, you can read more about AIPE in Maxwell, Kelley, and Rausch (2008).
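To make the contrast concrete, here is a minimal sketch comparing the two planning goals for a two-group design. It is not the MBESS method: `aipe_n_approx()` is a hypothetical helper that assumes the common large-sample normal approximation Var(d) ≈ 2/n + d²/(4n), with n participants per group, whereas MBESS works with the noncentral *t* distribution, so its answers will differ somewhat. `power.t.test()` comes from base R's stats package.

```r
# Power analysis: n per group needed to detect d = 0.5 with 80% power
# (two-sided test, alpha = .05).
n_power <- ceiling(
  power.t.test(delta = 0.5, sd = 1, power = 0.80, sig.level = 0.05)$n
)

# AIPE, rough version (ASSUMPTION: large-sample normal approximation,
# not the noncentral-t approach used by MBESS).
# Full CI width: w = 2 * z * sqrt((2 + d^2 / 4) / n), solved for n.
aipe_n_approx <- function(d, width, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  ceiling((2 + d^2 / 4) * (2 * z / width)^2)
}

# n per group so that the 95% CI around d = 0.5 is about 0.5 wide in total
n_aipe <- aipe_n_approx(d = 0.5, width = 0.5)

c(power = n_power, aipe = n_aipe)
```

Under these assumptions the power analysis asks for 64 participants per group, while the approximate AIPE calculation asks for 127: a sample with adequate power can still leave a fairly wide confidence interval.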

## Graphs

Here are two graphs I made in R to illustrate the connection between sample size and confidence interval width. In Figure 2 you can see that the required sample size depends on both the desired CI width *and* the magnitude of the effect, i.e. the larger the effect size, the larger the sample you need to achieve a given CI width. You can also see that this increase in sample size is less pronounced the wider the CI is.

Figure 1. 95% confidence interval for a Cohen’s *d* of 0.8 in relation to sample size (per group). The values above the error bars give the CI’s width (upper limit minus lower limit).

Figure 2. Sample size (per group) required for 95% confidence intervals of different widths, for different magnitudes of Cohen’s *d*.
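The patterns in both figures can be checked roughly without MBESS. The sketch below uses a hypothetical helper, `approx_ci_width()`, which assumes the large-sample normal approximation to the standard error of *d* for two equal groups of size n. MBESS itself uses the noncentral *t* distribution, so the exact numbers in the figures will not match these.

```r
# Approximate full width of the CI for Cohen's d with n participants in
# each of two groups. ASSUMPTION: normal approximation with
# Var(d) ~ 2/n + d^2/(4n); this is not what MBESS computes.
approx_ci_width <- function(d, n, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)
  2 * z * sqrt(2 / n + d^2 / (4 * n))
}

# Figure 1 analogue: for d = 0.8 the CI narrows as n grows
n <- c(10, 50, 100, 400)
widths_d08 <- approx_ci_width(d = 0.8, n = n)

# Figure 2 analogue: at a fixed n, larger effects have wider CIs,
# so a larger sample is needed to reach the same width
widths_n100 <- approx_ci_width(d = c(0.2, 0.8, 1.4, 2.0), n = 100)

data.frame(n = n, width = round(widths_d08, 2))
```

Because the approximate width is proportional to 1/sqrt(n), halving the width takes about four times the sample size, which is why narrow intervals get expensive quickly.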

## R code

```r
library(MBESS)
library(ggplot2)

# CI for d = 0.8 ----------------------------------------------------------
smd_plot <- data.frame()
for (i in seq(10, 400, by = 10)) {
  # 95% CI for d = 0.8 with i participants per group
  x.ci <- ci.smd(smd = 0.8, n.1 = i, n.2 = i)
  smd_plot <- rbind(smd_plot,
                    data.frame("lwr" = x.ci$Lower.Conf.Limit.smd,
                               "upr" = x.ci$Upper.Conf.Limit.smd,
                               "smd" = 0.8,
                               "n" = i))
}
# width of each CI, used as the label above the error bars
smd_plot$range <- round(smd_plot$upr - smd_plot$lwr, 2)

# ggplot --------------------------------------------------------------------
ggplot(smd_plot, aes(n, smd)) +
  geom_point() +
  geom_errorbar(aes(ymin = lwr, ymax = upr)) +
  geom_text(aes(label = range, y = upr), hjust = -0.4, angle = 45, size = 4) +
  # limits are set here because a separate ylim() call would replace
  # this scale and discard the custom breaks
  scale_y_continuous(breaks = seq(0, 2, by = 0.25), limits = c(-0.2, 1.8)) +
  scale_x_continuous(breaks = seq(0, 400, by = 20))

# different ds, widths and sample sizes ------------------------------------
ss2 <- NULL
# nested loops to run ss.aipe.smd with different deltas and widths
for (j in seq(0.2, 1, by = 0.2)) {
  ss <- NULL
  for (i in seq(0.1, 2, by = 0.2)) {
    ss <- c(ss, ss.aipe.smd(delta = i, width = j))
  }
  if (j == 0.2) {
    ss2 <- data.frame("n" = ss)
    ss2$width <- j
  } else {
    ss_tmp <- data.frame("n" = ss)
    ss_tmp$width <- j
    ss2 <- rbind(ss2, ss_tmp)
  }
}
ss2$delta <- rep(seq(0.1, 2, by = 0.2), times = 5) # add deltas used in loop

# ggplot ------------------------------------------------------------------
ggplot(ss2, aes(delta, n, group = factor(width), linetype = factor(width))) +
  geom_line()
```

## References

Cohen, J. (1994). The earth is round (p < .05). *Am. Psychol.*, 49(12), 997–1003.

Maxwell, S. E., Kelley, K., & Rausch, J. R. (2008). Sample size planning for statistical power and accuracy in parameter estimation. *Annu. Rev. Psychol.*, 59, 537–563.
