# R: Interval Estimation of the Population Mean

[This article was first published on **Analysis with Programming**, and kindly contributed to R-bloggers].

Interval estimation of the population mean can be computed from functions of the following R packages:

- stats – contains the `t.test` function;
- TeachingDemos – contains the `z.test` function; and,
- BSDA – contains the `zsum.test` and `tsum.test` functions.

`t.test` of the stats package performs Student's t-test, and is used when the raw dataset is given. The same holds for `z.test`, but this function is specifically for the z-test with a known population standard deviation. When the dataset is not given and only the summary statistics (mean and standard deviation) are presented, the appropriate functions are `zsum.test` or `tsum.test`. Note that `t.test` and `tsum.test` implement the same statistical test, as do `z.test` and `zsum.test`. Consider the example below.

**Example 1**. The 2012-2013 SASE scores of 33 randomly selected students from the College of Science and Mathematics (CSM) of MSU-IIT were recorded: 84, 93, 101, 86, 82, 86, 88, 94, 89, 94, 93, 83, 95, 86, 94, 87, 91, 96, 89, 79, 99, 98, 81, 80, 88, 100, 90, 100, 81, 98, 87, 95, and 94. The population of these scores is believed to be normally distributed with a standard deviation of 6.8. Determine and interpret the 95% and 99% confidence intervals of the population mean.

From the data, we obtain the following information: (i) the sample size is more than 30, and (ii) the population standard deviation is known. Therefore, the appropriate test is the z-test, and the function to use is `z.test`.
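The intervals for Example 1 can be sketched as follows. This is a minimal illustration, not the author's original listing: `z_ci` is a hypothetical helper written here in base R so the block runs without extra packages, and the guarded call to `TeachingDemos::z.test` assumes the `stdev` and `conf.level` argument names of the installed version of that package.

```r
# SASE scores from Example 1 (n = 33).
scores <- c(84, 93, 101, 86, 82, 86, 88, 94, 89, 94, 93, 83, 95, 86, 94,
            87, 91, 96, 89, 79, 99, 98, 81, 80, 88, 100, 90, 100, 81, 98,
            87, 95, 94)
sigma <- 6.8  # population standard deviation, stated as known in the example

# Hypothetical helper (not part of any package): z-based confidence
# interval for the mean when the population standard deviation is known.
z_ci <- function(x, sigma, conf.level = 0.95) {
  z <- qnorm(1 - (1 - conf.level) / 2)          # two-sided critical value
  mean(x) + c(-1, 1) * z * sigma / sqrt(length(x))
}

ci95 <- z_ci(scores, sigma, conf.level = 0.95)
ci99 <- z_ci(scores, sigma, conf.level = 0.99)
print(ci95)
print(ci99)

# The same 95% interval via TeachingDemos::z.test, if the package is installed.
if (requireNamespace("TeachingDemos", quietly = TRUE)) {
  print(TeachingDemos::z.test(scores, stdev = sigma, conf.level = 0.95)$conf.int)
}
```

The base R computation makes the formula explicit: the sample mean plus or minus the normal critical value times the standard error, sigma / sqrt(n).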

**Interpretation**: The true mean of all SASE scores in the school year 2012-2013 from CSM is likely between 88.01327 and 92.65340 (95% CI). And the true mean of all SASE scores for the said college and school year is likely between 87.28425 and 93.38241 (99% CI).

Aside from the confidence interval, the function also returns the computed z-statistic with its p-value, as well as the point estimate of the mean. To extract the confidence interval only, append the suffix `$conf.int` to the function call.

**Example 2**. The following data (341, 345, 338, 339, 340, 343, 341, 343, 341, 328, 343, 347, 337, 348, and 339) are a random sample from a normally distributed population. Compute and interpret the 90% confidence interval of the population mean.

The appropriate test for this is the t-test, since the sample size is small (n < 30) and the population variance is unknown.
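A minimal sketch of the t-based interval for Example 2, using `t.test` from the stats package that ships with R. The variable names (`x`, `ci90`, `manual`) are chosen here for illustration, and the manual computation is included only as a cross-check of what `t.test` reports.

```r
# Data from Example 2 (n = 15).
x <- c(341, 345, 338, 339, 340, 343, 341, 343, 341, 328, 343, 347, 337,
       348, 339)

# One-sample t-test; only the 90% confidence interval is extracted.
ci90 <- t.test(x, conf.level = 0.90)$conf.int
print(ci90)

# Cross-check: mean +/- t critical value (n - 1 degrees of freedom)
# times the estimated standard error s / sqrt(n).
n <- length(x)
manual <- mean(x) + c(-1, 1) * qt(0.95, df = n - 1) * sd(x) / sqrt(n)
print(manual)
```

Because the population variance is unknown, the sample standard deviation `sd(x)` replaces sigma and the critical value comes from the t distribution rather than the normal.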

**Interpretation**: The true mean of the population from which the data above were sampled is likely between 338.71 and 343.03 (90% CI).

Often in textbooks, however, we are presented only with summary statistics of the data, as in the next example from *Simplified Biostatistics* by Abubakar S. Asaad.

**Example 3**. The biostatistician took a random sample of 49 patients from a list of all patients ever admitted to the hospital within a three-month period, and the number of drugs prescribed per admission was determined for each. The average number of drugs per case was found to be 7.5 with a standard deviation of 2.5. Calculate and interpret the 95% confidence interval for the true mean of all the patients ever admitted to the hospital.

In this example, no dataset is given, but we have the computed mean = 7.5, standard deviation = 2.5, and sample size = 49. Thus, to compute the interval estimate of the population mean in R, we use the `zsum.test` function.
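The interval for Example 3 can be sketched from the summary statistics alone. The base R lines below run without extra packages. The guarded call assumes the `mean.x`, `sigma.x`, `n.x`, and `conf.level` argument names of `BSDA::zsum.test`, which should be checked against the installed version. The variable names are illustrative.

```r
# Summary statistics from Example 3.
xbar <- 7.5   # sample mean of drugs prescribed per admission
s    <- 2.5   # standard deviation
n    <- 49    # sample size

# z-based 95% interval computed directly from the summary statistics.
ci95 <- xbar + c(-1, 1) * qnorm(0.975) * s / sqrt(n)
print(ci95)

# The same interval via BSDA::zsum.test, if the package is installed.
if (requireNamespace("BSDA", quietly = TRUE)) {
  print(BSDA::zsum.test(mean.x = xbar, sigma.x = s, n.x = n,
                        conf.level = 0.95)$conf.int)
}
```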

**Interpretation**: The true mean number of drugs prescribed per admission for all the patients ever admitted to the hospital is likely between 6.800013 and 8.199987 (95% CI).

The `tsum.test` function is used in situations like Example 3, but when the population variance is unknown and the sample size is less than 30.

### Reference

Asaad, Abubakar S. (2011). *Simplified Biostatistics*. Manila: Rex Book Store, Inc.
