One sample proportion test in R-Complete Guide

[This article was first published on Data Science Tutorials, and kindly contributed to R-bloggers].

The post One sample proportion test in R-Complete Guide appeared first on
Data Science Tutorials

The one-sample proportion test (one-proportion z-test) is used to compare an observed proportion with a theoretical one when there are just two categories.

This article explains the fundamentals of the one-proportion z-test and gives examples using R software.

For example, suppose we have a population that is half male and half female (p = 0.5 = 50%). A total of n = 160 individuals from this population, 100 males and 60 females, developed a spontaneous malignancy.

We’d like to know if cancer affects more men than women.

The number of successes (males with cancer) is 100.

The observed male proportion (po) is 100/160 = 0.625.

The observed female proportion (q) is 1 - po = 0.375.

The expected male proportion (pe) is 0.5 (50 percent).

A total of 160 observations (n) were made.
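
As a quick check, the quantities above can be computed directly in R. This is a minimal sketch; the variable names (x, n, po, q, pe) are chosen here to mirror the notation in the text and do not come from any package.

```r
# Data from the example: 100 males among 160 cancer cases
x  <- 100        # number of successes (males with cancer)
n  <- 160        # total number of observations
po <- x / n      # observed male proportion
q  <- 1 - po     # observed female proportion
pe <- 0.5        # expected male proportion

c(po = po, q = q, pe = pe)
```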

We want to answer the following questions:

  1. Are the observed (po) and expected (pe) proportions of males equal?
  2. Is the observed male proportion (po) less than the expected male proportion (pe)?
  3. Is the observed male proportion (po) greater than the expected male proportion (pe)?

In statistical terms, the corresponding null hypotheses (H0) for the three questions are:

  1. H0: po = pe
  2. H0: po ≥ pe
  3. H0: po ≤ pe

The corresponding alternative hypotheses (H1) are:

  1. H1: po ≠ pe (different)
  2. H1: po < pe (less)
  3. H1: po > pe (greater)

Note that:

A two-tailed test is used for hypothesis 1.

One-tailed tests are used for hypotheses 2 and 3.

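The test statistic of the one-proportion z-test is usually written as z = (po - pe) / sqrt(pe × (1 - pe) / n), where n is the sample size.

For a two-tailed test:

If |z| is less than 1.96, the difference is not significant at the 5% level.
If |z| is greater than or equal to 1.96, the difference is significant at the 5% level.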

The p-value corresponding to the z-statistic can be read from a z-table. We’ll look at how to obtain it in R.
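
The z-statistic and its p-value can also be computed by hand rather than read from a table. The sketch below assumes the usual form of the one-proportion z-test, z = (po - pe) / sqrt(pe(1 - pe) / n), with the expected proportion in the standard error, and uses pnorm() from base R for the normal tail probability. The variable names are illustrative.

```r
x  <- 100    # males with cancer
n  <- 160    # total observations
pe <- 0.5    # expected proportion under H0
po <- x / n  # observed proportion

# z-statistic with the standard error computed under H0
z <- (po - pe) / sqrt(pe * (1 - pe) / n)

# Two-tailed p-value from the standard normal distribution
p_two_sided <- 2 * pnorm(-abs(z))

z
p_two_sided
```

For these data |z| is well above 1.96, so the difference is significant at the 5% level.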

Compute One sample proportion test in R

R functions: binom.test() and prop.test()

To do a one-proportion test, use the R functions binom.test() and prop.test():

binom.test() computes the exact binomial test. It is recommended when the sample size is small.

prop.test() can be used when the sample size is large (N > 30). It uses a normal approximation to the binomial distribution.

The two functions have very similar syntax. The following is a simplified format.

binom.test(x, n, p = 0.5, alternative = "two.sided")
prop.test(x, n, p = NULL, alternative = "two.sided", correct = TRUE)

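x: the number of successes

n: the total number of trials

p: the hypothesized probability of success to test against.

alternative: the alternative hypothesis, one of "two.sided", "less" or "greater".

correct: a logical value indicating whether Yates’ continuity correction should be applied where possible (prop.test() only).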

Note that prop.test() applies Yates’ continuity correction by default (correct = TRUE), which matters most when the expected number of successes or failures is less than 5.

If you don’t want the correction, set the additional argument correct = FALSE. This makes the test mathematically equivalent to the uncorrected z-test of a proportion.
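
To see how much the choice of function and the continuity correction matter for the data in this article, the three variants can be run side by side. This is only a sketch using the example counts (100 out of 160); the object names are illustrative, and it prints the p-values rather than asserting particular numbers.

```r
x <- 100; n <- 160; pe <- 0.5

exact       <- binom.test(x, n, p = pe)                  # exact binomial test
corrected   <- prop.test(x, n, p = pe)                   # Yates' correction (default)
uncorrected <- prop.test(x, n, p = pe, correct = FALSE)  # no continuity correction

# Collect the two-sided p-values for comparison
p_values <- c(exact       = exact$p.value,
              corrected   = corrected$p.value,
              uncorrected = uncorrected$p.value)
p_values
```

With a sample this large, all three variants should lead to the same decision. The differences tend to become more noticeable with small counts.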

We’d like to know if cancer affects more men than women.

We’ll use the prop.test() function:

prop <- prop.test(x = 100, n = 160, p = 0.5, correct = FALSE)
prop
1-sample proportions test without continuity correction
data:  100 out of 160, null probability 0.5
X-squared = 10, df = 1, p-value = 0.001565
alternative hypothesis: true p is not equal to 0.5
95 percent confidence interval:
 0.5478817 0.6962568
sample estimates:
    p
0.625

The function returns:

  1. the value of Pearson’s chi-squared test statistic
  2. the p-value
  3. a 95% confidence interval
  4. the estimated probability of success (the proportion of males with cancer)
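
Each of these values can be extracted from the returned object, which is a list of class "htest". A short sketch, re-creating the prop object from the call above:

```r
prop <- prop.test(x = 100, n = 160, p = 0.5, correct = FALSE)

prop$statistic  # Pearson's chi-squared statistic (X-squared)
prop$p.value    # p-value of the test
prop$conf.int   # 95% confidence interval for the proportion
prop$estimate   # estimated probability of success
```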

Keep in mind:

If you wish to test whether the proportion of men with cancer is less than 0.5 (one-tailed test), enter:

prop.test(x = 100, n = 160, p = 0.5, correct = FALSE, alternative = "less")

Alternatively, to test whether the proportion of men with cancer is greater than 0.5 (one-tailed test), enter:

prop.test(x = 100, n = 160, p = 0.5, correct = FALSE, alternative = "greater")
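
The two one-tailed calls can be compared directly. The sketch below stores both results and prints their p-values; the object names are illustrative.

```r
less    <- prop.test(x = 100, n = 160, p = 0.5, correct = FALSE,
                     alternative = "less")
greater <- prop.test(x = 100, n = 160, p = 0.5, correct = FALSE,
                     alternative = "greater")

# Without the continuity correction the two one-tailed p-values sum to 1
c(less = less$p.value, greater = greater$p.value)
```

Because the observed proportion (0.625) lies above 0.5, only the "greater" alternative gives a small p-value.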

Conclusion

The test’s p-value is 0.001565, which is less than the significance level alpha = 0.05. We can therefore conclude that the proportion of males with cancer is significantly different from 0.5.
