The original article is available at Chi-Square Goodness of fit formula in R.
A Chi-Square Goodness of Fit Test is used to determine whether a categorical variable follows a hypothesized distribution.
This lesson will show you how to use R to run a Chi-Square Goodness of Fit Test.
Chi-square goodness of fit formula in R
A vendor claims that an equal number of customers visit the shop each weekday. To test this claim, an executive records the number of customers who visit the shop in a given week and finds the following.
Monday: 250 customers, Tuesday: 230 customers, Wednesday: 265 customers, Thursday: 235 customers, and Friday: 223 customers
To evaluate whether the data are consistent with the vendor's claim, run the Chi-Square Goodness of Fit Test in R using the steps below.
First, we’ll create two vectors to store the observed frequencies and the expected proportion of customers for each day.
observedfreq <- c(250, 230, 265, 235, 223)
expectedprop <- c(0.2, 0.2, 0.2, 0.2, 0.2)
The expected proportions should sum to 1.
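As a quick sanity check before running the test, the sketch below confirms that the proportions sum to 1 and converts them into expected counts. The names props_ok and expectedfreq are illustrative and do not appear in the original article.

```r
observedfreq <- c(250, 230, 265, 235, 223)
expectedprop <- c(0.2, 0.2, 0.2, 0.2, 0.2)

# Confirm the proportions sum to 1 (all.equal tolerates floating-point error)
props_ok <- isTRUE(all.equal(sum(expectedprop), 1))

# Expected counts under the vendor's claim: total customers times each proportion
expectedfreq <- sum(observedfreq) * expectedprop

print(props_ok)
print(expectedfreq)
```

If the values in p do not sum to 1, chisq.test() stops with an error unless rescale.p = TRUE is supplied, so a check like this can catch a typo early.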
Perform the Chi-Square Goodness of Fit Test
The null and alternative hypotheses for a Chi-Square Goodness of Fit Test are as follows.
H0: A variable follows a hypothesized distribution.
H1: A variable does not follow a hypothesized distribution.
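For reference, the statistic that this test compares against a Chi-Square distribution is usually written as below. This is the standard textbook form rather than something stated in the original article, and the symbols are introduced here for illustration: O_i is the observed count in category i, p_i is the hypothesized proportion, N is the total count, and k is the number of categories.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad E_i = N \, p_i
```

Large values of the statistic indicate that the observed counts sit far from what the hypothesized distribution predicts.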
The Chi-Square Goodness of Fit Test can then be performed using the chisq.test() function, which takes the following arguments for this test.
x: a numeric vector of observed frequencies.
p: a numeric vector of expected proportions.
In our example, the following code demonstrates how to utilize this function.
# conduct a Chi-Square Goodness-of-Fit Test
chisq.test(x = observedfreq, p = expectedprop)

Chi-squared test for given probabilities

data: observedfreq
X-squared = 4.7265, df = 4, p-value = 0.3165
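Rather than reading the numbers off the printed output, the returned object can be stored and its components extracted directly. The sketch below assumes the same data as above; the variable name result is illustrative.

```r
observedfreq <- c(250, 230, 265, 235, 223)
expectedprop <- c(0.2, 0.2, 0.2, 0.2, 0.2)

# Store the test result (an object of class "htest")
result <- chisq.test(x = observedfreq, p = expectedprop)

result$statistic  # test statistic (X-squared)
result$parameter  # degrees of freedom
result$p.value    # p-value
result$expected   # expected counts under the null hypothesis
```

Storing the result is handy when the statistic or p-value needs to be reused later, for example in a report or a further calculation.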
The Chi-Square test statistic is 4.7265, and the corresponding p-value is 0.3165.
The p-value is computed from a Chi-Square distribution with n-1 degrees of freedom, where n is the number of categories. Here, degrees of freedom = 5-1 = 4.
A Chi-Square to P-Value Calculator can be used to confirm that the p-value for X-squared = 4.7265 with 4 degrees of freedom is 0.3165.
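The same numbers can also be reproduced by hand in R, which makes the link between the statistic, the degrees of freedom, and the p-value explicit. This is a minimal sketch using base R only; the names expectedfreq, chisq_stat, df, and p_value are illustrative.

```r
observedfreq <- c(250, 230, 265, 235, 223)
expectedprop <- c(0.2, 0.2, 0.2, 0.2, 0.2)

# Expected counts if customers were spread evenly across the five days
expectedfreq <- sum(observedfreq) * expectedprop

# Test statistic: sum of squared deviations scaled by the expected counts
chisq_stat <- sum((observedfreq - expectedfreq)^2 / expectedfreq)

# Degrees of freedom: number of categories minus one
df <- length(observedfreq) - 1

# Upper-tail probability of the Chi-Square distribution
p_value <- pchisq(chisq_stat, df = df, lower.tail = FALSE)

print(round(chisq_stat, 4))
print(round(p_value, 4))
```

The rounded values should agree with the chisq.test() output shown earlier.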
We cannot reject the null hypothesis since the p-value (0.3165) is not less than 0.05.
This means we don’t have enough evidence to conclude that the true distribution of customers differs from the distribution the vendor claims.
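The decision can be expressed in code as well, either by comparing the p-value with the significance level or by comparing the statistic with a critical value. The sketch below assumes a significance level of 0.05, as in the text; the names alpha, critical_value, and reject_h0 are illustrative.

```r
observedfreq <- c(250, 230, 265, 235, 223)
expectedprop <- c(0.2, 0.2, 0.2, 0.2, 0.2)

alpha <- 0.05  # significance level used in the text
result <- chisq.test(x = observedfreq, p = expectedprop)

# Critical value: the statistic must exceed this to reject H0 at level alpha
critical_value <- qchisq(1 - alpha, df = result$parameter)

# Both comparisons lead to the same decision
reject_h0 <- result$p.value < alpha
reject_by_critical <- unname(result$statistic) > unname(critical_value)

print(reject_h0)
print(reject_by_critical)
```

Both flags come out FALSE for this data, matching the conclusion that the null hypothesis is not rejected.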
If you are interested in learning more about data science, you can find more articles at finnstats.