# Tests, Power and Significance

October 14, 2015

(This article was first published on Freakonometrics » R-english, and kindly contributed to R-bloggers)

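In the mathematical statistics course today, we started talking about tests and decision rules. To illustrate all the concepts introduced today, we considered the case where we have a sample $\{X_1,\dots,X_n\}$ with $X_i\sim\mathcal{U}([0,\theta])$. We want to test

$$H_0:\theta\leq\theta_0$$

against

$$H_1:\theta>\theta_0$$

In the course, we’ve seen that we could use a test based on the order statistic $X_{n:n}=\max\{X_1,\dots,X_n\}$. The test would be

$$\delta(x_1,\dots,x_n)=\mathbf{1}(x_{n:n}>k)$$

i.e. if $x_{n:n}\leq k$ we choose $H_0$, and if $x_{n:n}>k$, we choose $H_1$.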

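From the definition of the type I error (the risk of the first kind),

$$\alpha=\sup_{\theta\leq\theta_0}\mathbb{P}_\theta(X_{n:n}>k)=\mathbb{P}_{\theta_0}(X_{n:n}>k)=1-\left(\frac{k}{\theta_0}\right)^n$$

we can easily get that

$$k=\theta_0(1-\alpha)^{1/n}$$

Thus, the power is

$$\gamma(\theta)=\mathbb{P}_\theta(X_{n:n}>k)=1-\left(\frac{\theta_0}{\theta}\right)^n(1-\alpha)$$

for $\theta>k$ (and $0$ otherwise).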

To visualize it, use the following parameters

```
n=5        # sample size
alpha=.1   # significance level
theta0=1   # value of theta under the null
```

Then

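```
# Critical value: reject H0 when the sample maximum exceeds C1
C1=theta0*(1-alpha)^(1/n)
theta=seq(0,2,by=.01)
# The power is 0 when theta<=C1; ifelse() also avoids a NaN at theta=0
P1=ifelse(theta>C1,1-(theta0/theta)^n*(1-alpha),0)
plot(theta,P1,type="l",lwd=2,col="blue",xlab="theta",ylab="Power")
```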
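As a sanity check on the power curve above, the closed-form power of the maximum-based test can be compared with a Monte Carlo estimate. The sketch below assumes the sample is uniform on $[0,\theta]$ and that the null is rejected when the sample maximum exceeds `C1`. The helper names `power_max_exact` and `power_max_sim` are introduced here for illustration and are not part of the original code.

```r
# Monte Carlo check of the power of the test based on the maximum.
# power_max_exact and power_max_sim are hypothetical helper names.
set.seed(1)
n <- 5; alpha <- 0.1; theta0 <- 1
C1 <- theta0 * (1 - alpha)^(1/n)

# Closed-form power: P_theta(max > C1), equal to 0 when theta <= C1
power_max_exact <- function(theta) ifelse(theta > C1, 1 - (C1 / theta)^n, 0)

# Simulated power: share of U(0, theta) samples of size n whose maximum exceeds C1
power_max_sim <- function(theta, nsim = 1e5) {
  x <- matrix(runif(n * nsim, 0, theta), nrow = nsim)
  mean(apply(x, 1, max) > C1)
}

th_check <- c(1, 1.2, 1.5)
exact <- power_max_exact(th_check)
sim <- sapply(th_check, power_max_sim)
round(cbind(theta = th_check, exact, sim), 3)
```

At `theta = theta0` the exact value is `alpha` by construction, and the simulated column should agree with the exact one up to Monte Carlo noise.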

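Note that, so far, we have not used the observed maximum of our sample. Assume that the observed maximum is $x_{n:n}$; then we can compute the $p$-value,

$$p=\mathbb{P}_{\theta_0}(X_{n:n}>x_{n:n})=1-\left(\frac{x_{n:n}}{\theta_0}\right)^n$$

for $x_{n:n}\leq\theta_0$ (and $0$ otherwise).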

Here it is

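```
# p-value as a function of the observed maximum m (the theta grid is reused for m)
m=theta
PV=(1-(m/theta0)^n)*(m<=theta0)
plot(m,PV,type="l",lwd=2,col="blue",xlab="observed maximum",ylab="p-value")
```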
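To make the p-value concrete, here is a small sketch that evaluates it on a single sample. The function name `pvalue_max` and the data vector are made up for illustration. The formula assumes a uniform sample on $[0,\theta]$, tested at $\theta_0$, with rejection for large values of the maximum.

```r
# Hypothetical helper: p-value of the maximum-based test for an observed sample x
pvalue_max <- function(x, theta0 = 1) {
  m <- max(x)        # observed maximum
  n <- length(x)     # sample size
  if (m > theta0) 0 else 1 - (m / theta0)^n
}

x <- c(0.31, 0.87, 0.52, 0.95, 0.10)  # made-up data, for illustration only
pvalue_max(x)
```

With these numbers the maximum is 0.95, so the p-value is $1-0.95^5\approx 0.226$, and the test at level 0.1 does not reject. This agrees with 0.95 being below the threshold `theta0*(1-alpha)^(1/n)`, which is about 0.979.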

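Now, why not consider another test, based on the minimum $X_{1:n}=\min\{X_1,\dots,X_n\}$ (since we know the distribution of the minimum of a sample from a uniform distribution)? The test has the same form as before,

$$\delta(x_1,\dots,x_n)=\mathbf{1}(x_{1:n}>k)$$

but here, since $\mathbb{P}_{\theta_0}(X_{1:n}>k)=(1-k/\theta_0)^n=\alpha$, the threshold is

$$k=\theta_0\left(1-\alpha^{1/n}\right)$$

The power of the test is here

$$\gamma(\theta)=\mathbb{P}_\theta(X_{1:n}>k)=\left(1-\frac{\theta_0}{\theta}\left(1-\alpha^{1/n}\right)\right)^n$$

for $\theta>k$ (and $0$ otherwise). This test has the same significance level (by construction), but its power is clearly lower than the one we got using the maximum of our sample when $\theta>\theta_0$.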

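```
# Critical value: reject H0 when the sample minimum exceeds C2
C2=theta0*(1-alpha^(1/n))
# The power is 0 when theta<=C2; ifelse() also avoids a NaN at theta=0
P2=ifelse(theta>C2,(1-(theta0/theta)*(1-alpha^(1/n)))^n,0)
lines(theta,P2,lwd=2,col="red")
```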
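The comparison between the two curves can also be read off a small table. The sketch below evaluates both closed-form powers at a few values of $\theta$, under the same assumptions as the code above (uniform sample, rejection when the statistic exceeds its threshold). The function names `power_max` and `power_min` are introduced here for illustration only.

```r
n <- 5; alpha <- 0.1; theta0 <- 1
C1 <- theta0 * (1 - alpha)^(1/n)   # threshold for the maximum
C2 <- theta0 * (1 - alpha^(1/n))   # threshold for the minimum

# Closed-form powers (hypothetical helper names)
power_max <- function(theta) ifelse(theta > C1, 1 - (C1 / theta)^n, 0)
power_min <- function(theta) ifelse(theta > C2, (1 - C2 / theta)^n, 0)

th <- c(1, 1.1, 1.5, 2)
tab <- cbind(theta = th, max = power_max(th), min = power_min(th))
round(tab, 3)
```

Both columns equal `alpha` in the first row ($\theta=\theta_0$), and the `max` column is larger in every other row.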

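Why not consider a test based on $\overline{X}$? The problem is that we need the distribution (more specifically the survival function) of $\overline{X}$. We can compute it numerically, but that might be painful. An alternative is to consider an approximation based on the central limit theorem, i.e.

$$\overline{X}\approx\mathcal{N}\left(\frac{\theta}{2},\frac{\theta^2}{12n}\right)$$

Our test is based on $T=2\overline{X}$, which has mean $\theta$, and to get the same significance level as before, use

$$k=\theta_0+u_{1-\alpha}\sqrt{\frac{\theta_0^2}{3n}}$$

where $u_{1-\alpha}$ is the quantile of level $1-\alpha$ of the standard Gaussian distribution.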

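The power of the test is then approximately

$$\gamma(\theta)\approx 1-\Phi\left(\frac{k-\theta}{\theta_0/\sqrt{3n}}\right)$$

where $k$ is the threshold, $\Phi$ is the standard Gaussian distribution function, and the standard deviation is kept at its value under $\theta_0$, as in the code.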

Here it is

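```
# Mean and variance of 2*mean(x) under theta0
mu=2*(theta0/2)
s2=2^2*(theta0^2/12)/n
C3=qnorm(1-alpha,mu,sqrt(s2))
# Approximate power; note that the null variance s2 is used for every theta
P3=1-pnorm(C3,theta,sqrt(s2))
lines(theta,P3,lwd=2)
```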

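Observe here that the test based on the maximum is not uniformly more powerful than the one based on the average: toward the right end of the plotted range, the approximate power curve of the average-based test lies above the one for the maximum (I just wonder if it could be due to the Gaussian approximation…).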
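One way to probe that question is to compare three versions of the power of the average-based test: the Gaussian approximation with the variance fixed at its null value (what the code above computes), the same approximation with the variance $\theta^2/(3n)$ that holds for $2\overline{X}$ under $\theta$, and a Monte Carlo estimate. This is only a sketch under the uniform-sample assumption. The function names and the choice of grid are hypothetical and not part of the original post.

```r
set.seed(1)
n <- 5; alpha <- 0.1; theta0 <- 1
C1 <- theta0 * (1 - alpha)^(1/n)                      # threshold for the maximum
C3 <- qnorm(1 - alpha, theta0, theta0 / sqrt(3 * n))  # threshold for 2*mean(x)

# Gaussian approximation with the null variance for every theta
power_null_var <- function(theta) 1 - pnorm(C3, theta, theta0 / sqrt(3 * n))
# Gaussian approximation with the variance under theta (an alternative, not from the post)
power_theta_var <- function(theta) 1 - pnorm(C3, theta, theta / sqrt(3 * n))
# Monte Carlo estimate of P_theta(2 * mean(X) > C3)
power_mean_sim <- function(theta, nsim = 1e5) {
  x <- matrix(runif(n * nsim, 0, theta), nrow = nsim)
  mean(2 * rowMeans(x) > C3)
}
# Closed-form power of the maximum-based test, for reference
power_max <- function(theta) ifelse(theta > C1, 1 - (C1 / theta)^n, 0)

th <- c(1, 1.5, 2)
sim3 <- sapply(th, power_mean_sim)
round(cbind(theta = th,
            null_var = power_null_var(th),
            theta_var = power_theta_var(th),
            simulated = sim3,
            maximum = power_max(th)), 3)
```

If the `simulated` column sits close to `theta_var` and below `maximum` for $\theta>\theta_0$, that would point to the fixed null variance, rather than the Gaussian approximation itself, as the main reason the average-based curve overtakes the maximum-based one on the plot.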
