# Unknown Variance Two-Tailed Test of Population Mean

February 11, 2013

(This article was first published on Kevin Davenport » R, and kindly contributed to R-bloggers)

Question

The mean safety audit score of ACME Co. stores in New York (200 stores) was 74.3 pts in February last year. Suppose we sample 22 of the 200 stores one year later and find a sample mean of 78.6 pts with a sample standard deviation of 3.2 pts. Can we reject the null hypothesis that the true mean score does not differ from last year's mean? That is to say, is the gap between this sample mean and last year's mean too large to be explained by sampling variation alone?

The null hypothesis of a two-tailed test of the population mean is:

$\mu = \mu_{0}$

where $\mu_{0}$ is the hypothesized value of the true population mean $\mu$.

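The test statistic (t) is defined by the sample mean ( $\bar{x}$ ), the sample size (n), and the sample standard deviation (s). Because the population variance is unknown, s stands in for it, and under the null hypothesis t follows a Student t distribution with n-1 degrees of freedom (assuming the scores are roughly normally distributed).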

 

$t=\frac{\bar{x}-\mu_{0}}{s /\sqrt{n}}$

 

xbar <- 78.6                   # sample mean
mu0 <- 74.3                    # hypothesized value
s <- 3.2                       # sample standard deviation
n <- 22                        # sample size
t <- (xbar-mu0)/(s/sqrt(n))    # test statistic
t
# [1] 6.302746


Next, compute the critical values that bound the rejection region at the .05 significance level:

a <- .05                         # alpha (significance level)
t.half.a <- qt(1-a/2, df=n-1)    # upper critical value, n-1 degrees of freedom
c(-t.half.a, t.half.a)           # lower and upper critical values
# [1] -2.079614  2.079614
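The decision rule implied by these critical values can be sketched in a few lines: reject the null hypothesis when the absolute value of the test statistic exceeds the upper critical value. This is only a minimal illustration that reuses the numbers from the question; `t.stat` and `reject.null` are names introduced here and do not appear in the original code.

```r
# Figures taken from the question
xbar <- 78.6; mu0 <- 74.3; s <- 3.2; n <- 22
a <- .05                                   # significance level

t.stat <- (xbar - mu0) / (s / sqrt(n))     # test statistic
t.half.a <- qt(1 - a/2, df = n - 1)        # upper critical value

# Two-tailed decision rule: reject when |t| exceeds the critical value
reject.null <- abs(t.stat) > t.half.a      # illustrative name, not from the post
reject.null
```

With the inputs above, `reject.null` evaluates to `TRUE`.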


Solution

Our test statistic, 6.302746, does not lie between the critical values -2.079614 and 2.079614. At the .05 significance level we can therefore reject the null hypothesis that the mean score does not differ from last year's.
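The same conclusion can be cross-checked with a p-value or a confidence interval, which are often easier to report than a comparison against critical values. The sketch below is an illustration rather than part of the original analysis; `t.stat`, `se`, `p.value`, and `ci` are names introduced here, and it assumes the scores are roughly normally distributed so that the t distribution applies.

```r
# Figures taken from the question
xbar <- 78.6; mu0 <- 74.3; s <- 3.2; n <- 22
a <- .05                                   # significance level

se <- s / sqrt(n)                          # standard error of the mean
t.stat <- (xbar - mu0) / se                # test statistic

# Two-tailed p-value: probability of a statistic at least this extreme under the null
p.value <- 2 * pt(abs(t.stat), df = n - 1, lower.tail = FALSE)

# (1 - a) confidence interval for the population mean
ci <- xbar + c(-1, 1) * qt(1 - a/2, df = n - 1) * se

p.value < a                                # TRUE means reject at level a
ci
```

With these inputs the p-value falls well below .05, and the interval, roughly 77.18 to 80.02, excludes 74.3. Both agree with the critical-value comparison.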

