Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

If you have qualitative variable, it is possible to verify the correlation by studying a contingency table R by C, using the Pearson’s Chi-squared test.

A casino wants to study the correlation between the modes of play and the number of winners by age group, to see if the number of winners depends on the type of game that you chose to do, in light of experience. It has the following data (number of winners / 100 player for game and age-group):

$$\begin{tabular}{c|ccc}&Age\\\hline Game&20-30&31-40&41-50\\ \hline Roulette&44&56&55\\ Black-jack& 66& 88& 23\\Poker& 15& 29& 45 \end{tabular}$$

In R, we must first build a matrix with the data collected:

table <- matrix(c(44,56,55, 66,88,23, 15,29,45), nrow=3, byrow=TRUE)

Now we can compute the chi-squared correlation coefficient:

chisq.test(table)

Pearson's Chi-squared test

data:  table
X-squared = 46.0767, df = 4, p-value = 2.374e-09

I reject the null hypothesis H0 in favor of the alternative hypothesis (p-value < 0.05): there is a strong correlation between the age of the player and his probability to win.