[The sample average and s² are also statistically independent if the population is Normal. For some reason, students at this level generally aren’t told that this result requires Normality.]
At this point, as a by-product of the material we’d covered, the students knew that:
- Linear combinations of Normal random variables are also Normally distributed.
- Sums of (independent) Chi-Square random variables are also Chi-Square distributed.
To do this, I’ve generated a million independent values of U1 and U2, each U[0, 1], added them together to get Z = U1 + U2, and then plotted the result. You can do this using EViews, with the commands:
SMPL 1 1000000
If you graph Z, using the options: Distribution; Histogram; Density; Bin-width user-specified as 0.02; you get:
Using the following commands in R:
O.K., it seems that the density function is triangular in shape. [Cross-check: the area of the triangle is “1”, as it should be for a density. That’s a good start!]
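For readers without EViews or R to hand, here is a minimal Python sketch of the same experiment, using only the standard library. The function names (`simulate_sum_of_two_uniforms`, `histogram_density`, `triangular_density`) are my own, chosen for illustration, and the seed is arbitrary. The sketch compares a histogram with bin-width 0.02 against the triangular density at each bin's midpoint.

```python
import random


def simulate_sum_of_two_uniforms(n, seed=12345):
    """Draw n values of Z = U1 + U2, with U1 and U2 independent U[0, 1]."""
    rng = random.Random(seed)
    return [rng.random() + rng.random() for _ in range(n)]


def histogram_density(values, bin_width, lower=0.0, upper=2.0):
    """Return (bin midpoints, density estimates) for a fixed-width histogram."""
    n_bins = round((upper - lower) / bin_width)
    counts = [0] * n_bins
    for v in values:
        # Clamp so that a value exactly equal to `upper` lands in the last bin.
        idx = min(int((v - lower) / bin_width), n_bins - 1)
        counts[idx] += 1
    n = len(values)
    mids = [lower + (i + 0.5) * bin_width for i in range(n_bins)]
    dens = [c / (n * bin_width) for c in counts]
    return mids, dens


def triangular_density(z):
    """Density of the sum of two independent U[0, 1] variables."""
    if 0.0 <= z <= 1.0:
        return z
    if 1.0 < z <= 2.0:
        return 2.0 - z
    return 0.0


if __name__ == "__main__":
    z = simulate_sum_of_two_uniforms(1_000_000)
    mids, dens = histogram_density(z, bin_width=0.02)
    worst = max(abs(d - triangular_density(m)) for m, d in zip(mids, dens))
    print(f"largest gap between histogram and triangle: {worst:.4f}")
```

With a million draws the largest gap should be small, which is the numerical counterpart of eyeballing the triangle in the histogram.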
Now, if you want to establish this result mathematically, rather than by simulation, there are several ways to do it. One is by taking the so-called “convolution” of the densities of U1 and U2. For full details, see p.292 of the material supplied by the “Chance” team at Dartmouth College.
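As a rough sketch of the convolution argument (my own summary, not a reproduction of the Dartmouth material): each density equals 1 on [0, 1] and 0 elsewhere, so the integrand is 1 only where both 0 ≤ u ≤ 1 and 0 ≤ z − u ≤ 1. For 0 ≤ z ≤ 2 this gives:

```latex
f_Z(z) = \int_{-\infty}^{\infty} f_{U_1}(u)\, f_{U_2}(z - u)\, du
       = \int_{\max(0,\, z-1)}^{\min(1,\, z)} 1 \, du
       = \begin{cases}
           z,     & 0 \le z \le 1, \\
           2 - z, & 1 < z \le 2,
         \end{cases}
```

and f_Z(z) = 0 outside [0, 2]. This is the triangle seen in the simulation.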
An alternative way of getting the density function for Z is to take the mapping from the joint density of U1 and U2 to the joint density of Z and W = (U1 – U2). The Jacobian for this transformation is 1/2. Once you have the joint density of Z and W, you can then integrate out with respect to W, to get the triangular density for Z.
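The steps can be sketched as follows, assuming the notation f_{Z,W} for the joint density of Z and W (a symbol I'm introducing here). Inverting the transformation gives U1 = (Z + W)/2 and U2 = (Z − W)/2, and the requirement that both lie in [0, 1] restricts W to |w| ≤ min(z, 2 − z):

```latex
f_{Z,W}(z, w) = f_{U_1,U_2}\!\left(\tfrac{z+w}{2}, \tfrac{z-w}{2}\right) \cdot \tfrac{1}{2}
             = \tfrac{1}{2}, \qquad 0 \le z \le 2,\; |w| \le \min(z,\, 2 - z),

f_Z(z) = \int_{-\min(z,\,2-z)}^{\min(z,\,2-z)} \tfrac{1}{2}\, dw
       = \min(z,\, 2 - z), \qquad 0 \le z \le 2 .
```

So f_Z(z) = z on [0, 1] and 2 − z on (1, 2], the same triangular density as before.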
This triangular distribution that emerges when you add two independent U[0, 1] variates together is actually just a special case of the so-called Irwin-Hall distribution. The latter arises when you take the sum of, say, k independent U[0, 1] random variables.
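The Irwin-Hall density has a standard closed form, f(x; k) = (1/(k−1)!) Σ_{j=0}^{⌊x⌋} (−1)^j C(k, j) (x − j)^(k−1) for 0 ≤ x ≤ k. A minimal Python sketch of it is given below. The function name `irwin_hall_pdf` is my own, and the code is meant for small k only, because the alternating sum loses floating-point accuracy as k grows.

```python
from math import comb, factorial, floor


def irwin_hall_pdf(x, k):
    """Density at x of the sum of k independent U[0, 1] variables.

    Uses the alternating-sum closed form; intended for small k only.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if x < 0.0 or x > k:
        return 0.0
    total = 0.0
    for j in range(floor(x) + 1):
        total += (-1) ** j * comb(k, j) * (x - j) ** (k - 1)
    return total / factorial(k - 1)


if __name__ == "__main__":
    # k = 2 reproduces the triangular density; k = 3 peaks at 3/4 when x = 1.5.
    print(irwin_hall_pdf(0.5, 2), irwin_hall_pdf(1.5, 2), irwin_hall_pdf(1.5, 3))
```

Setting k = 2 recovers the triangle, which is a quick check that the formula and the simulation agree.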
Here’s what the density for this sum looks like, for various choices of k:
You can see that you don’t have to have a very large value for k before the density looks rather like that of a Normal random variable, with a mean of (k/2). In fact, this gives a “quick-and-dirty” way of generating a Normally distributed random value. We can see this if we take k = 12, and subtract 6 from the sum:
(We don’t need to do any scaling to get the variance equal to one – remember that the variance of a U[0, 1] variable is 1/12, and we’re summing 12 independent such variables.)
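A minimal Python sketch of this generator follows. The function name `approx_standard_normal`, the seed, and the sample size are hypothetical choices of mine. The sample mean and variance should come out close to 0 and 1.

```python
import random
import statistics


def approx_standard_normal(rng):
    """Sum of 12 independent U[0, 1] draws, minus 6.

    Mean is 12 * (1/2) - 6 = 0 and variance is 12 * (1/12) = 1.
    """
    return sum(rng.random() for _ in range(12)) - 6.0


if __name__ == "__main__":
    rng = random.Random(2024)
    draws = [approx_standard_normal(rng) for _ in range(100_000)]
    print("mean:    ", statistics.fmean(draws))
    print("variance:", statistics.pvariance(draws))
    print("range:   ", min(draws), max(draws))
```

One limitation is visible straight away: every draw lies in [−6, 6], so the tails of the Normal distribution are cut off.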
Of course, there are much better ways than this to generate Normal variates, but I won’t go into that here.
There’s an interesting, more general, question that we could also ask. What happens if we take the sum of independent random variables which are Uniformly distributed, but over different ranges?
In this case, things get much more complicated. There have been some interesting contributions to this problem by Mitra (1971), Sadooghi-Alvandi et al. (2009), and others.
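Even without the exact density, simulation is still easy. The sketch below does not implement the results in the papers just cited. It only draws from a sum of independent uniforms over different ranges and checks two facts that hold regardless of the ranges: the mean is the sum of the midpoints, and the variance is the sum of the (b − a)²/12 terms. The function name `simulate_sum_of_uniforms` and the example ranges are hypothetical.

```python
import random
import statistics


def simulate_sum_of_uniforms(ranges, n, seed=7):
    """Draw n values of the sum of independent U[a, b] variables.

    `ranges` is a list of (a, b) pairs, one pair per variable.
    """
    rng = random.Random(seed)
    return [sum(rng.uniform(a, b) for a, b in ranges) for _ in range(n)]


if __name__ == "__main__":
    ranges = [(0.0, 1.0), (0.0, 3.0)]  # illustrative ranges, not from the post
    z = simulate_sum_of_uniforms(ranges, 100_000)
    expected_mean = sum((a + b) / 2 for a, b in ranges)
    expected_var = sum((b - a) ** 2 / 12 for a, b in ranges)
    print("mean:    ", statistics.fmean(z), "vs", expected_mean)
    print("variance:", statistics.pvariance(z), "vs", expected_var)
```

A histogram of `z` can then be compared with whatever analytical density you derive.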
Hall, P., 1927. The Distribution of Means for Samples of Size N Drawn from a Population in which the Variate Takes Values Between 0 and 1, All Such Values Being Equally Probable. Biometrika, 19, 240–245.
Irwin, J.O., 1927. On the Frequency Distribution of the Means of Samples from a Population Having any Law of Frequency with Finite Moments, with Special Reference to Pearson’s Type II. Biometrika, 19, 225–239.
Mitra, S. K., 1971. On the Probability Distribution of the Sum of Uniformly Distributed Random Variables. SIAM Journal on Applied Mathematics, 20, 195–198.
Sadooghi-Alvandi, S., A. Nematollahi, & R. Habibi, 2009. On the Distribution of the Sum of Independent Uniform Random Variables. Statistical Papers, 50, 171–175.