Very often people report p-values for linear regression estimates after performing variable selection step. Here is a simple simulation that shows that such a procedure might lead to wrong calibration of such tests.
We know that true values of a1 and a2 are 0.
I will use two procedures that are used to estimate the model:
- Full: we estimate the full model
- Selection: we perform model selection using AIC criterion
Below is the procedure I have used. I assume here that if variable is removed from the model by AIC criterion then it is treated as insignificant in calculations of observed p (i.e. H0 is not rejected).
xlab = “p-value”, ylab = “prop.test p-value”)
and the plot it produces:
On the plots solid black line represents model with variable selection and dashed red line – model without selection. On both plots x-axis represents different values of p (threshold p-value ranging from 0.01 to 0.1; I use the same sample for different values of p to conserve simulation time – it would probably be more prudent to use separate samples to make observed values independent).
The left plot shows us what is the relation of observed number of H0 rejections to theoretical number of rejections. It can be clearly seen that in model without variable selection the test of H0 is approximately properly calibrated (ratio is approximately equal to 1). However in model with variable selection there are over 10% more rejections than expected (too many times a variable is judged as significant).
We can simply test if this bias is statistically significant. The results are shown on the right graph. The test shows that for model without variable selection the deviations are insignificant. On the other hand – for the model with variable selection we reject H0 that the observed number of rejections is unbiased.