Chi-Square test using R

[This article was first published on R tutorials – Statistical Aid: A School of Statistics, and kindly contributed to R-bloggers.]

A chi-square test is used to analyze nominal (also called categorical) data. "Chi" is pronounced "kai", and the test is frequently written as the χ² test. It compares the observed frequencies across the response categories of each sample. The null hypothesis of a chi-square test is that the nominal variables have no relationship, that is, that they are independent. That means,

  • H0: There is no relationship between the nominal variables; they are independent.
  • H1: H0 is not true; the nominal variables are related.
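To make "comparing observed frequencies" concrete, here is a minimal sketch of how the statistic is built from a table of counts. The 2 x 2 counts below are hypothetical values chosen only for illustration; they are not the article's data. The cross-check uses chisq.test with correct = FALSE so that it matches the hand calculation.

```r
# Hypothetical 2 x 2 table of observed counts (illustrative values only)
observed <- matrix(
  c(30, 20, 20, 30), nrow = 2,
  dimnames = list(sampleA = c("Negative", "Positive"),
                  sampleB = c("Negative", "Positive"))
)

# Counts expected under H0 (independence):
# (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson chi-square statistic, without continuity correction
chi_sq <- sum((observed - expected)^2 / expected)

# Cross-check against chisq.test() with the correction switched off
check <- chisq.test(observed, correct = FALSE)

print(expected)
print(chi_sq)
print(check$statistic)
```

The further the observed counts are from the expected counts, the larger the statistic becomes.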


Creating or Importing data

In this step, we either import our data into R or generate an example data set.
Let’s create some nominal data:

set.seed(150)
data <- data.frame(
  sampleA = sample(c("Positive", "Positive", "Negative"), 300, replace = TRUE),
  sampleB = sample(c("Positive", "Positive", "Negative"), 300, replace = TRUE)
)

Perform the chi-square test using the chisq.test function:

test <- chisq.test(x = data$sampleA, y = data$sampleB)

Analyse the result:

> test

Pearson’s Chi-squared test with Yates’ continuity correction

data: data$sampleA and data$sampleB
X-squared = 1.7444, df = 1, p-value = 0.1866
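It can help to look at the counts behind this output. The sketch below rebuilds the same data and test so that it runs on its own. It then reads the observed and expected components of the object returned by chisq.test. The name test_uncorrected is introduced here only for illustration. It reruns the test with correct = FALSE to show the effect of the Yates correction, which chisq.test applies by default to 2 x 2 tables.

```r
# Recreate the example data and test from above
set.seed(150)
data <- data.frame(
  sampleA = sample(c("Positive", "Positive", "Negative"), 300, replace = TRUE),
  sampleB = sample(c("Positive", "Positive", "Negative"), 300, replace = TRUE)
)
test <- chisq.test(x = data$sampleA, y = data$sampleB)

# Cross-tabulated counts the test was computed from
print(test$observed)

# Counts expected if sampleA and sampleB were independent
print(test$expected)

# Same test without the Yates continuity correction
test_uncorrected <- chisq.test(data$sampleA, data$sampleB, correct = FALSE)
print(test_uncorrected$statistic)
```

The uncorrected statistic is never smaller than the corrected one, because the correction shrinks each |observed - expected| difference before squaring.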

Interpretation of Chi-square test

To interpret the chi-square test, we use the p-value. If the p-value is less than or equal to 0.05, we reject the null hypothesis and conclude that the categorical variables are related (not independent). Here the p-value is 0.1866, which is above the 5% significance level, so the null hypothesis cannot be rejected: the data provide no evidence of a relationship between sampleA and sampleB.
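The decision rule can also be written out in code. This is a minimal sketch: alpha and p_value are names chosen here for illustration, and the data are regenerated so the block runs on its own.

```r
# Recreate the example data and test from above
set.seed(150)
data <- data.frame(
  sampleA = sample(c("Positive", "Positive", "Negative"), 300, replace = TRUE),
  sampleB = sample(c("Positive", "Positive", "Negative"), 300, replace = TRUE)
)
test <- chisq.test(x = data$sampleA, y = data$sampleB)

alpha <- 0.05             # significance level
p_value <- test$p.value   # p-value stored in the test object

if (p_value <= alpha) {
  message("Reject H0: there is evidence that the variables are related")
} else {
  message("Fail to reject H0: no evidence against independence")
}
```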

Chi-Square (χ²) statistic

The larger the χ² statistic, the stronger the evidence against the null hypothesis. To determine how large it needs to be, we find the critical value from the degrees of freedom and the significance level.

In our example, the table has 2 rows and 2 columns, so we have (2 - 1) × (2 - 1) = 1 degree of freedom. Using a table of probabilities for the χ² distribution, we can see that the critical χ² value at the 5% significance level is 3.841. The null hypothesis can therefore be rejected when χ² >= 3.841. In this case the statistic is 1.7444, which is below 3.841, so the null hypothesis cannot be rejected.
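Instead of a printed table, the critical value can be computed in R with qchisq. In the sketch below, the statistic 1.7444 is copied from the output reported earlier, and the variable names are chosen for illustration only.

```r
alpha <- 0.05   # significance level
df <- 1         # degrees of freedom for a 2 x 2 table

# Upper 5% point of the chi-square distribution with 1 degree of freedom
critical_value <- qchisq(1 - alpha, df = df)

# X-squared value reported in the test output above
statistic <- 1.7444

# Reject H0 only if the statistic reaches the critical value
reject_h0 <- statistic >= critical_value

# Upper-tail probability of the statistic, i.e. the p-value
p_value <- pchisq(statistic, df = df, lower.tail = FALSE)

print(round(critical_value, 3))
print(reject_h0)
print(round(p_value, 4))
```

The critical-value comparison and the p-value comparison always lead to the same decision. They are two views of the same upper-tail probability.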
