(This article was first published on **Econometrics Beat: Dave Giles' Blog**, and kindly contributed to R-bloggers)

**p-values**. In either case, you have to emphasize to the class that in order to apply the test itself, you have to know the sampling distribution of your test statistic for the situation where *the null hypothesis is true*.

[…] H_{0}, *when in fact it is true*.

[…] H_{0} is true. Remember that the p-value is the probability of observing a value for the test statistic that’s as “extreme” (or more extreme) than the value you’ve computed using your sample data, *if H_{0} is true*.

[…] *proving* formally) that the *null distribution* of the test statistic is Chi-square, Student-t, F, *etc*.

[…] H_{0} *is false*?

[…] H_{0} is false?”

[…] *if the non-centrality parameter is set to zero*.

(There’s also a non-central Beta distribution, and a doubly-non-central F distribution, but I’m not going to worry about those here.)

To illustrate what’s going on here, consider the following well-known theorem:

“If x is a random n-vector that is N[0 , V], and A is a non-random (n x n) matrix, then the quadratic form, x’Ax, is Chi-square distributed with v = rank(A) degrees of freedom, if and only if AV is idempotent.”

“If x is a random n-vector that is N[μ , V], and A is a non-random (n x n) matrix, then the quadratic form, x’Ax, is non-central Chi-square distributed with v = rank(A) degrees of freedom, and non-centrality parameter λ = ½ (μ’Aμ), if and only if AV is idempotent.”

Straightforward proofs of both of these results can be found in Searle (1971, p.57), for example.

(**Warning!** The literature is split on what convention should be followed in defining λ. Many authors define it *without the “2” in the denominator.* This can be very confusing, so be aware of this.)
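As a quick numerical check of the non-central result, here is a minimal simulation sketch in Python (standard library only). Everything specific in it is an assumption made for illustration: V = I, A is taken to be the centering matrix M = I − (1/n)11’ (which is idempotent with rank n − 1), and the mean vector `mu` is made up. Because E[x’Ax] = trace(AV) + μ’Aμ, the quadratic form should have mean v + 2λ under the λ = ½ (μ’Aμ) convention.

```python
import random

def centered_ss(x):
    """Return x'Mx, where M = I - (1/n)11' is the centering matrix."""
    mean = sum(x) / len(x)
    return sum((xi - mean) ** 2 for xi in x)

def mean_quadratic_form(mu, n_draws=100_000, seed=2024):
    """Average x'Mx over independent draws of x ~ N[mu, I]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = [rng.gauss(m, 1.0) for m in mu]
        total += centered_ss(x)
    return total / n_draws

mu = [1.0, 2.0, 4.0, 5.0]      # hypothetical mean vector (illustration only)
v = len(mu) - 1                # rank(M) = n - 1
lam = 0.5 * centered_ss(mu)    # lambda = (1/2) mu'M mu
sample_mean = mean_quadratic_form(mu)

print(f"v = {v}, lambda = {lam}")
print(f"simulated mean of x'Mx:         {sample_mean:.3f}")
print(f"theoretical mean, v + 2*lambda: {v + 2 * lam:.3f}")
```

Under the alternative convention (no “2” in the denominator) the same mean is written v + λ, which is one way to work out which definition a particular reference is using.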

The density function for a non-central Chi-square random variable is quite interesting. It’s an infinite weighted sum of central Chi-square densities, with Poisson weights. That is, it’s of the form:

f(x; v, λ) = e^{-λ} Σ [λ^{k} x^{(v/2 + k – 1)} e^{-x/2}] / [2^{(v/2 + k)} Γ(v/2 + k) k!] ; x ≥ 0 ,

where the range of summation is from k = 0 to k = ∞.
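The Poisson-mixture form of the density can be evaluated directly. The following is a minimal sketch, not the code used for the plot: the function names, the truncation tolerance `tol`, and the example values of λ are all assumptions. It sums Poisson(λ) weights times central Chi-square densities with v + 2k degrees of freedom, and stops once the remaining Poisson mass is negligible.

```python
import math

def central_chisq_pdf(x, df):
    """Density of a central Chi-square with df degrees of freedom."""
    if x <= 0.0:
        return 0.0  # the boundary point x = 0 is ignored in this sketch
    log_pdf = ((df / 2.0 - 1.0) * math.log(x) - x / 2.0
               - (df / 2.0) * math.log(2.0) - math.lgamma(df / 2.0))
    return math.exp(log_pdf)

def noncentral_chisq_pdf(x, df, lam, tol=1e-12, max_terms=1000):
    """Poisson(lam)-weighted sum of central Chi-square(df + 2k) densities.

    lam follows the convention lambda = (1/2) mu'A mu used in the text.
    """
    weight = math.exp(-lam)          # Poisson weight for k = 0
    cum_weight = 0.0
    total = 0.0
    for k in range(max_terms):
        total += weight * central_chisq_pdf(x, df + 2 * k)
        cum_weight += weight
        if 1.0 - cum_weight < tol:   # remaining Poisson mass is negligible
            break
        weight *= lam / (k + 1)      # next weight: e^{-lam} lam^{k+1} / (k+1)!
    return total

# Illustrative values only; these are not necessarily the ones in the plot.
for lam in (0.0, 1.0, 3.0):
    values = [round(noncentral_chisq_pdf(x, 3, lam), 4) for x in (1, 3, 6, 10)]
    print(f"lambda = {lam}: {values}")
```

Setting `lam = 0` leaves only the k = 0 term, so the function collapses to the central Chi-square density. For very large λ the starting weight e^{-λ} underflows, so a production implementation would need a more careful summation order.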

To illustrate things, the following plot shows what the density function for a non-central χ^{2} distribution looks like for various values of λ, when the degrees of freedom are v = 3. (The R code that I used to create this plot is available on the **code page** for this blog.)

A non-central F distribution arises when we have two independent random variables. The first is non-central Chi-square, with v_{1} degrees of freedom, and a non-centrality parameter, λ. The second is *central* Chi-square, with v_{2} degrees of freedom. The random variable,

F = [χ^{2}_{(v1,λ)} / v_{1}] / [χ^{2}_{(v2)} / v_{2}] ,

is non-central F, with v_{1} and v_{2} degrees of freedom, and non-centrality parameter, λ.
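This construction is easy to mimic by simulation. The sketch below is only an illustration under stated assumptions: v_{1} is an integer, the numerator is built from v_{1} independent unit-variance normals with all of the non-centrality placed on the first one (so that ½ μ’μ = λ), and the parameter values are hypothetical. Since the numerator and denominator are independent, the mean of F is v_{2}(v_{1} + 2λ) / [v_{1}(v_{2} − 2)] when v_{2} > 2, which gives something to compare against.

```python
import random

def draw_noncentral_f(rng, v1, v2, lam):
    """One draw of [chi2(v1, lam) / v1] / [chi2(v2) / v2], for integer v1."""
    # Numerator: sum of v1 squared N[mu_i, 1] variables with (1/2) sum(mu_i^2) = lam.
    shift = (2.0 * lam) ** 0.5
    num = rng.gauss(shift, 1.0) ** 2
    num += sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(v1 - 1))
    # Denominator: central Chi-square(v2), i.e. Gamma(shape = v2/2, scale = 2).
    den = rng.gammavariate(v2 / 2.0, 2.0)
    return (num / v1) / (den / v2)

rng = random.Random(7)
v1, v2, lam = 3, 20, 2.0       # hypothetical settings
n_draws = 100_000
sample_mean = sum(draw_noncentral_f(rng, v1, v2, lam) for _ in range(n_draws)) / n_draws
theory_mean = v2 * (v1 + 2.0 * lam) / (v1 * (v2 - 2))   # requires v2 > 2

print(f"simulated mean of F:   {sample_mean:.3f}")
print(f"theoretical mean of F: {theory_mean:.3f}")
```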

Finally, a non-central Student-t arises when we have a first random variable, X_{1}, that is N[μ , 1], and a second (independent) random variable, X_{2}, that is Chi-square with v degrees of freedom. Then the random variable,

t = X_{1} / [X_{2} / v]^{½} ,

follows a non-central Student-t distribution with v degrees of freedom, and a non-centrality parameter of λ = (μ’μ)/2.
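A short simulation sketch of this construction follows, with hypothetical values for μ and v. It checks two things: the mean of t, which is μ (v/2)^{½} Γ((v − 1)/2) / Γ(v/2) for v > 1, and the mean of t^{2}. The second check is there because t^{2} = X_{1}^{2} / [X_{2} / v] is a non-central F with 1 and v degrees of freedom, and its non-centrality parameter is exactly λ = μ^{2}/2 in the convention used here.

```python
import math
import random

def draw_noncentral_t(rng, mu, v):
    """One draw of X1 / sqrt(X2 / v), with X1 ~ N[mu, 1] and X2 ~ Chi-square(v)."""
    x1 = rng.gauss(mu, 1.0)
    x2 = rng.gammavariate(v / 2.0, 2.0)   # Chi-square(v) = Gamma(shape v/2, scale 2)
    return x1 / math.sqrt(x2 / v)

rng = random.Random(11)
mu, v = 1.5, 10                # hypothetical settings
n_draws = 100_000
draws = [draw_noncentral_t(rng, mu, v) for _ in range(n_draws)]

mean_t = sum(draws) / n_draws
mean_t2 = sum(t * t for t in draws) / n_draws

lam = 0.5 * mu ** 2            # non-centrality of t^2 as a non-central F(1, v)
theory_mean_t = (mu * math.sqrt(v / 2.0)
                 * math.exp(math.lgamma((v - 1) / 2.0) - math.lgamma(v / 2.0)))
theory_mean_t2 = v * (1.0 + 2.0 * lam) / (v - 2)   # requires v > 2

print(f"mean of t:   simulated {mean_t:.3f}, theoretical {theory_mean_t:.3f}")
print(f"mean of t^2: simulated {mean_t2:.3f}, theoretical {theory_mean_t2:.3f}")
```

Note that the sign of μ affects the distribution of t but not the value of λ, which is why some references and software parameterise the non-central t by μ itself rather than by λ.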

Let’s see why such non-central distributions are important in the context of hypothesis testing. Suppose, for example, that we’re conducting a test for which the test statistic follows a (central) χ^{2} distribution with v = 3 when the null hypothesis (H_{0}) is true, and a non-central χ^{2} distribution when H_{0} is false. For a 5% significance level, the critical value is 7.815, and this is shown with the green marker in the above plot. In that plot we see that as λ increases, the tail area to the right of the critical value increases (monotonically). This area is the probability of rejecting H_{0}. When λ = 0 this area is 5%. It’s just the chosen significance level – the probability of rejecting H_{0} *when it is true*.

However, for positive values of λ this area is the probability of rejecting H_{0} *when it is false*, to some degree or other. So, the areas under the red and blue curves, to the right of “crit”, give us points on the power curve associated with the test.
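These tail areas can be computed with the same Poisson mixture, applied to tail probabilities instead of densities. The following is a rough sketch under stated assumptions: the helper names are invented, the regularized incomplete gamma function is evaluated with a plain series (adequate for the moderate arguments used here), and the λ values are illustrative rather than those behind the red and blue curves.

```python
import math

def reg_lower_gamma(a, x, tol=1e-14, max_terms=10_000):
    """Regularized lower incomplete gamma P(a, x), via its power series."""
    if x <= 0.0:
        return 0.0
    term = math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))  # n = 0 term
    total = term
    n = 0
    while term > tol * total and n < max_terms:
        n += 1
        term *= x / (a + n)      # x^{a+n} e^{-x} / Gamma(a + n + 1)
        total += term
    return total

def noncentral_chisq_tail(x, df, lam, tol=1e-12, max_terms=1000):
    """P(X > x) for a non-central Chi-square, with lambda = (1/2) mu'A mu."""
    weight = math.exp(-lam)      # Poisson weight for k = 0
    cum_weight = 0.0
    tail = 0.0
    for k in range(max_terms):
        # A central Chi-square(df + 2k) has CDF P((df + 2k)/2, x/2).
        tail += weight * (1.0 - reg_lower_gamma(df / 2.0 + k, x / 2.0))
        cum_weight += weight
        if 1.0 - cum_weight < tol:
            break
        weight *= lam / (k + 1)
    return tail

crit = 7.815                     # 5% critical value for v = 3, from the text
lams = (0.0, 1.0, 2.0, 4.0, 8.0) # illustrative values only
powers = [noncentral_chisq_tail(crit, 3, lam) for lam in lams]
for lam, p in zip(lams, powers):
    print(f"lambda = {lam}: rejection probability = {p:.4f}")
```

With λ = 0 the rejection probability is the 5% significance level, and it rises towards one as λ grows.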

In a follow-up post I’ll be discussing the role of the non-centrality parameter in determining the power of a test in more detail. The usual F-test for linear restrictions on a regression coefficient vector will be used as an example. In addition, that post will provide some computational details.
