Parametric method for the study of the correlation: the Pearson r-test

August 3, 2009
By

(This article was first published on Statistic on aiR, and kindly contributed to R-bloggers)

Suppose you want to study whether there is a correlation between 2 sets of data. To do this we compute the Pearson product-moment correlation coefficient, which is a measure of the correlation (linear dependence) between two variables X and Y; then we compute the value of a t-test to study the significance of the Pearson coefficient R. We can use this test when the data follow a Gaussian distribution.

A new test to measure IQ is subjected to 10 volunteers. You want to see if there is a correlation between the new experimental test and the classical test, in order to replace the old test with the new test. These the values:
Old test: 15, 21, 25, 26, 30, 30, 22, 29, 19, 16
New test: 55, 56, 89, 67, 84, 89, 99, 62, 83, 88


The software R has a single function, easily recalled, which gives us directly the value of the Pearson coefficient and the t-statistical test for checking the significance of the coefficient:


a = c(15, 21, 25, 26, 30, 30, 22, 29, 19, 16)
b = c(55, 56, 89, 67, 84, 89, 99, 62, 83, 88)

cor.test(a, b)

Pearson's product-moment correlation

data: a and b
t = 0.4772, df = 8, p-value = 0.646
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
-0.5174766 0.7205107
sample estimates:
cor
0.166349


The value of the coefficient of Pearson is 0.166: it is a very low value, which indicates a poor correlation between the variables.
Furthermore, the p-value is greater than 0.05; so we reject the null hypothesis: then the Pearson coefficient is significant.
So we can say that there is no correlation between the results of both tests.

To leave a comment for the author, please follow the link and comment on his blog: Statistic on aiR.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.