# Example 8.39: calculating Cramer’s V

[This article was first published on **SAS and R**, and kindly contributed to R-bloggers].

Cramer’s V is a measure of association for nominal variables. Effectively, it is the Pearson chi-square statistic rescaled to have values between 0 and 1, as follows:

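V = sqrt(X^2 / [nobs * (min(ncols, nrows) - 1)])

where X^2 is the Pearson chi-square statistic, nobs is the number of observations in the table, and ncols and nrows are the numbers of columns and rows in the table. For a 2 by 2 table, min(ncols, nrows) - 1 equals 1, so V reduces to sqrt(X^2 / nobs), which is the magnitude of the phi coefficient.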

As an example, we’ll revisit the table of homelessness vs. gender we present in Section 2.6.3.

**SAS**

In SAS, Cramer’s V is provided when the `chisq` option to the `tables` statement is used in `proc freq`.

    proc freq data = "c:\book\help.sas7bdat";
      tables female*homeless / chisq;
    run;

resulting in

    Statistics for Table of FEMALE by HOMELESS

    Statistic                     DF       Value      Prob
    ------------------------------------------------------
    Chi-Square                     1      4.3196    0.0377
    Likelihood Ratio Chi-Square    1      4.3654    0.0367
    Continuity Adj. Chi-Square     1      3.8708    0.0491
    Mantel-Haenszel Chi-Square     1      4.3101    0.0379
    Phi Coefficient                      -0.0977
    Contingency Coefficient               0.0972
    Cramer's V                           -0.0977

where (as usual) several additional values are also included. The negative value shown for Cramer’s V is odd, since the formula above is a square root: it’s unclear what rationale there is for reporting the negative root. According to the documentation, this is only a possibility for 2 by 2 tables.
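As a quick check, the magnitude reported by SAS can be reproduced by plugging the Pearson chi-square from the output into the formula. This is only a sketch: the number of observations is not stated in the text, so the value 453 used here is an assumption, chosen because it is consistent with the reported results.

```r
# Pearson chi-square copied from the SAS output
chisq = 4.3196

# ASSUMPTION: 453 observations (not stated in the text)
nobs = 453

# female by homeless is a 2 by 2 table
nrows = 2
ncols = 2

V = sqrt(chisq / (nobs * (min(ncols, nrows) - 1)))
print(round(V, 4))   # 0.0977
```

This matches the absolute value of both the phi coefficient and Cramer’s V in the output, so only the sign differs from what the formula gives.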

**R**

As far as we know, Cramer’s V is not included in base R. Of course, it is easy to assemble directly. We found one version online. However, it requires a table as input, so we’ve rewritten it here to accept vector input instead.

Here’s the function, which uses `unique()` (Section 1.4.16) to extract the values of the rows and columns and `length()` (Section 1.4.15) to find their number and the number of observations. A more bullet-proof version of the function would check to ensure the two vectors are of equal length (or allow the input in a variety of formats).

    cv.test = function(x,y) {
      CV = sqrt(chisq.test(x, y, correct=FALSE)$statistic /
        (length(x) * (min(length(unique(x)),length(unique(y))) - 1)))
      print.noquote("Cramér V / Phi:")
      return(as.numeric(CV))
    }

So we can get Cramer’s V as

    helpdata = read.csv("http://www.math.smith.edu/r/data/help.csv")
    with(helpdata, cv.test(female, homeless))
    [1] Cramér V / Phi:
    [1] 0.09765063
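The more bullet-proof version mentioned above might look something like the following sketch. The name `cv.test.safe` is hypothetical, and the example vectors are made up purely for illustration. Besides checking the lengths, it builds the table first, so that the number of observations and the table dimensions come from the same complete pairs that enter the chi-square statistic.

```r
# Hypothetical, more defensive variant of cv.test(); the name is illustrative
cv.test.safe = function(x, y) {
  if (length(x) != length(y)) {
    stop("x and y must have the same length")
  }
  # table() drops pairs with a missing value by default
  tab = table(x, y)
  # discard empty rows and columns (e.g. unused factor levels)
  tab = tab[rowSums(tab) > 0, colSums(tab) > 0, drop = FALSE]
  k = min(dim(tab))
  if (k < 2) {
    stop("each variable needs at least two observed values")
  }
  chisq = chisq.test(tab, correct = FALSE)$statistic
  as.numeric(sqrt(chisq / (sum(tab) * (k - 1))))
}

# Made-up vectors, for illustration only
x = c("a", "a", "b", "b", "a", "b", "a", "b")
y = c("u", "u", "v", "v", "u", "v", "v", "u")

# chisq.test() warns about small expected counts in this tiny example
suppressWarnings(cv.test.safe(x, y))   # 0.5
```

In the toy table every expected count is 2 and the chi-square is 2, so V = sqrt(2 / 8) = 0.5. One design difference from `cv.test()` is worth noting: `length(x)` and `unique()` count missing values, whereas `sum(tab)` and `dim(tab)` do not, so the two functions can disagree when the data contain `NA`.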
