Interval Estimation of the Population Mean
[This article was first published on Analysis with R, and kindly contributed to R-bloggers].
Interval estimates of the population mean can be computed with functions from the following R packages:
- stats – provides t.test
- TeachingDemos – provides z.test
- BSDA – provides zsum.test and tsum.test
Example 1. The 2012-2013 SASE scores of 33 randomly selected students from the College of Science and Mathematics (CSM) of MSU-IIT were recorded: 84, 93, 101, 86, 82, 86, 88, 94, 89, 94, 93, 83, 95, 86, 94, 87, 91, 96, 89, 79, 99, 98, 81, 80, 88, 100, 90, 100, 81, 98, 87, 95, and 94. The population of these scores is believed to be normally distributed with a standard deviation of 6.8. Determine and interpret the 95% and 99% confidence intervals for the population mean.
From the data, we obtain the following information: (i) the sample size is more than 30, and (ii) the population standard deviation is known. Therefore, the appropriate procedure is the z-test, and the function to use is z.test.
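A minimal sketch of the computation for Example 1 follows. It evaluates the textbook formula, mean ± z × σ/√n, in base R, and then, only if the TeachingDemos package is installed, prints the corresponding z.test results. The helper name z_interval is hypothetical and introduced here for illustration; it is not part of any package.

```r
# SASE scores from Example 1 (n = 33)
scores <- c(84, 93, 101, 86, 82, 86, 88, 94, 89, 94, 93, 83, 95, 86, 94, 87, 91,
            96, 89, 79, 99, 98, 81, 80, 88, 100, 90, 100, 81, 98, 87, 95, 94)
sigma <- 6.8  # known population standard deviation

# Hypothetical helper: z interval for the mean with known sigma
z_interval <- function(x, sigma, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)            # two-sided critical value
  mean(x) + c(-1, 1) * z * sigma / sqrt(length(x))
}

ci95 <- z_interval(scores, sigma, conf.level = 0.95)
ci99 <- z_interval(scores, sigma, conf.level = 0.99)
print(ci95)
print(ci99)

# The same intervals via TeachingDemos::z.test, if the package is available
if (requireNamespace("TeachingDemos", quietly = TRUE)) {
  print(TeachingDemos::z.test(scores, stdev = sigma, conf.level = 0.95))
  print(TeachingDemos::z.test(scores, stdev = sigma, conf.level = 0.99))
}
```

The base R lines make the formula explicit, while the guarded z.test calls show the packaged route the text refers to.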
Interpretation: We are 95% confident that the true mean of all SASE scores from CSM in the school year 2012-2013 lies between 88.01327 and 92.65340. We are 99% confident that the true mean of all SASE scores for the said college and school year lies between 87.28425 and 93.38241.
Aside from the confidence interval, the function also returns the computed z statistic with its p-value, as well as the point estimate of the mean. To return only the confidence interval, append $conf.int to the function call.
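As a small illustration of the $conf.int accessor, the sketch below applies it to a z.test result when TeachingDemos is installed. Because test functions in R return objects of class "htest", the same accessor is also shown on stats::t.test, used here purely as an always-available stand-in (its interval is a t interval, not the z interval of Example 1).

```r
scores <- c(84, 93, 101, 86, 82, 86, 88, 94, 89, 94, 93, 83, 95, 86, 94, 87, 91,
            96, 89, 79, 99, 98, 81, 80, 88, 100, 90, 100, 81, 98, 87, 95, 94)

# Confidence interval only, from z.test (runs only if TeachingDemos is installed)
if (requireNamespace("TeachingDemos", quietly = TRUE)) {
  ci_z <- TeachingDemos::z.test(scores, stdev = 6.8)$conf.int
  print(ci_z)              # limits plus a "conf.level" attribute
  print(as.numeric(ci_z))  # bare lower and upper limits
}

# Stand-in: the same accessor on an "htest" object from stats::t.test
ht <- t.test(scores)
print(names(ht))     # components available for extraction
print(ht$conf.int)   # only the interval, without the rest of the report
```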
Example 2. The following data (341, 345, 338, 339, 340, 343, 341, 343, 341, 328, 343, 347, 337, 348, and 339) are a random sample from a normally distributed population. Compute and interpret the 90% confidence interval for the population mean.
The appropriate procedure here is the t-test, since the sample size is small (n < 30) and the population variance is unknown. The function to use is therefore t.test.
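A minimal sketch for Example 2, using t.test from the stats package with conf.level = 0.90. The interval is also recomputed by hand from mean ± t × s/√n as a cross-check; the variable names are illustrative.

```r
# Data from Example 2 (n = 15)
x <- c(341, 345, 338, 339, 340, 343, 341, 343, 341, 328, 343, 347, 337, 348, 339)

# 90% t interval for the population mean
ci <- t.test(x, conf.level = 0.90)$conf.int
print(ci)

# Cross-check against the formula with n - 1 degrees of freedom
n <- length(x)
manual <- mean(x) + c(-1, 1) * qt(0.95, df = n - 1) * sd(x) / sqrt(n)
print(manual)
```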
Interpretation: We are 90% confident that the true mean of the population from which the data above were drawn is between approximately 338.71 and 343.03.
Textbooks, however, often present only summary statistics of the data, as in the next example from Simplified Biostatistics by Abubakar S. Asaad.
Example 3. A biostatistician took a random sample of 49 patients from a list of all patients ever admitted to the hospital within a three-month period, and the number of drugs prescribed per admission was determined for each. The average number of drugs per case was found to be 7.5, with a standard deviation of 2.5. Calculate and interpret the 95% confidence interval for the true mean number of drugs per case among all patients ever admitted to the hospital.
In this example, no dataset is given, but we have its computed mean = 7.5, standard deviation = 2.5, and sample size = 49. Thus, to compute the interval estimate of the population mean in R, we use zsum.test.
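A minimal sketch for Example 3, working from the summary statistics alone. The base R lines apply mean ± z × s/√n directly; the zsum.test call from BSDA is run only if that package is installed. Variable names are illustrative.

```r
# Summary statistics from Example 3
xbar <- 7.5   # sample mean
s    <- 2.5   # standard deviation, used in place of sigma because n is large
n    <- 49    # sample size

# 95% z interval from the formula
z  <- qnorm(0.975)
ci <- xbar + c(-1, 1) * z * s / sqrt(n)
print(ci)

# The same interval via BSDA::zsum.test, if the package is available
if (requireNamespace("BSDA", quietly = TRUE)) {
  print(BSDA::zsum.test(mean.x = xbar, sigma.x = s, n.x = n,
                        conf.level = 0.95)$conf.int)
}
```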
Interpretation: We are 95% confident that the true mean number of drugs per case among all patients ever admitted to the hospital is between 6.800013 and 8.199987.
The tsum.test function is used in situations like Example 3, except that the population variance is unknown and the sample size is less than 30.
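The source gives no worked example for tsum.test, so the sketch below makes one up from existing material: it treats the mean, standard deviation, and size of the Example 2 sample as if only those summaries were known (an assumption made purely for illustration). The formula is evaluated in base R, and tsum.test from BSDA is called only if the package is installed.

```r
# Summaries derived from the Example 2 data (illustrative assumption:
# pretend only these three numbers are available)
x    <- c(341, 345, 338, 339, 340, 343, 341, 343, 341, 328, 343, 347, 337, 348, 339)
xbar <- mean(x)
s    <- sd(x)
n    <- length(x)

# 90% t interval from summary statistics, n - 1 degrees of freedom
ci_sum <- xbar + c(-1, 1) * qt(0.95, df = n - 1) * s / sqrt(n)
print(ci_sum)

# The same interval via BSDA::tsum.test, if the package is available
if (requireNamespace("BSDA", quietly = TRUE)) {
  print(BSDA::tsum.test(mean.x = xbar, s.x = s, n.x = n,
                        conf.level = 0.90)$conf.int)
}
```

Since the summaries come from the raw data, the result should agree with t.test(x, conf.level = 0.90)$conf.int.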