Exploring P-values with Simulations in R

May 21, 2016

(This article was first published on R – Stable Markets, and kindly contributed to R-bloggers)

The recent flare-up in discussions on p-values inspired me to conduct a brief simulation study.

In particularly, I wanted to illustrate just how p-values vary with different effect and sample sizes.
Here are the details of the simulation. I simulated n draws of my independent variable X :

X_n \sim N(100, 400)


n \in \{5,6,...,25\}

For each X_n , I define a Y_n as

Y_n := 10+\beta X_n +\epsilon


\epsilon \sim N(0,1)
\beta \in \{.05, .06,..., .25 \}

In other words, for each effect size, \beta , the simulation draws X and Y with some error \epsilon . The following regression model is estimated and the p-value of \beta is observed.

Y_n = \beta_0 + \beta X_n

The drawing and the regression is done 1,000 times so that for each effect size – sample size combination, the simulation yields 1,000 p-values. The average of these 1,000 p-values for each effect size and sample size combination is plotted below.

Note, these results are for a fixed var(\epsilon)=1. Higher sampling error would typically shift these curves upward, meaning that for each effect size, the same sample would yield a lower signal.


There are many take-aways from this plot.

First, for a given sample size, larger effect sizes are “detected” more easily. By detected, I mean found to be statistically significant using the .05 threshold. It’s possible to detect larger effect sizes (e.g. .25) with relatively low sample sizes (in this case <10). By contrast, if the effect size is small (e.g. .05), then a larger sample is needed to detect the effect (>10).

Second, this figure illustrates an oft-heard warning about p-values: always interpret them within the context of sample size. Lack of statistical significance does not imply lack of an effect. An effect may exist, but the sample size may be insufficient to detect it (or the variability in the data set is too high). On the other hand, just because a p-value signals statistical significance does not mean that the effect is actually meaningful. Consider an effect size of .00000001 (effectively 0). According to the chart, even the p-value of this effect size tends to 0 as the sample size increases, eventually crossing the statistical significance threshold.

To leave a comment for the author, please follow the link and comment on their blog: R – Stable Markets.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.


Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)