Hey, I Just did a Significance Test!

[This article was first published on Statistical Research » R, and kindly contributed to R-bloggers.]

I’ve seen it happen quite often. The sig test. Somebody simply needs to know the p-value, as if that one number will provide all of the information about the study that they need to know. The dataset is presented and the client/boss/colleague/etc. invariably asks “is it significant?” and “what’s the correlation?”. To quote R.A. Fisher: “To consult the statistician after an experiment is finished is often merely to ask him to conduct a post mortem examination. He can perhaps say what the experiment died of.”

So obviously in the previous example there are probably lots of problems, ranging from the person making the request not knowing what they want to the bigger problem that the study itself may have core structural flaws that undermine the integrity of any results. I want to focus on the former and share some thoughts on how to run a couple of hypothesis tests and how to select the appropriate one. This is mainly a question of parametric vs. non-parametric tests. I’ll show the test most people are probably most familiar with (the parametric one) and then show an alternative that is often more appropriate. I won’t get into the philosophy of hypothesis testing or whether .04999 is significant and .05001 is not. This simply provides some hypothesis testing options that too often get overlooked in market and/or social research. Yes, parametric tests provide more power when their assumptions hold, but not everything in this world fits nicely into one package with a bow on top.

[Figure: Box Plot]

What’s the Correlation

I’ve had this happen more often than I would care to count. I’m given a dataset and then asked what’s the “correlation”? After some back and forth I find that they want the “correlation” (with p-value) for every pair of questions on the questionnaire. So here goes:

library(e1071)  # provides rdiscrete()
raw = data.frame( replicate(2, rdiscrete(10, probs=c(1,1,1,1,1), values=c(1,2,3,4,5)) ) )

Pearson’s correlation or, more formally, the Pearson product-moment correlation coefficient. This is the correlation with which most people are familiar. Be sure to check the assumptions (i.e. homoscedasticity, linearity, normality). However, for ordinal data such as 5-point questionnaire items, Spearman’s rho or Kendall’s tau (depending on how you want to interpret the results) may in fact be the better option.

# My example gave the following
> cor(raw$X1,raw$X2, method="kendall")
[1] -0.191943
> cor(raw$X1,raw$X2, method="spearman")
[1] -0.2320708
> cor(raw$X1,raw$X2, method="pearson")
[1] -0.2573766
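
Since the request is usually for the correlation with a p-value on every pair of questions, here is a minimal sketch of one way to get both. The data frame `survey` and the question names `q1` to `q3` are made up for illustration, and I use base R's `sample()` rather than `rdiscrete()` so the snippet runs without extra packages.

```r
# Simulated 5-point answers for three hypothetical questions (q1..q3).
set.seed(42)
survey <- data.frame(
  q1 = sample(1:5, 30, replace = TRUE),
  q2 = sample(1:5, 30, replace = TRUE),
  q3 = sample(1:5, 30, replace = TRUE)
)

# Rank-based correlation for every pair of questions at once.
rho <- cor(survey, method = "spearman")
print(round(rho, 3))

# cor() returns no p-value; cor.test() does, one pair at a time.
# exact = FALSE requests the asymptotic p-value, which avoids the
# warning about exact p-values being unavailable when there are ties.
res <- cor.test(survey$q1, survey$q2, method = "spearman", exact = FALSE)
print(res$estimate)
print(res$p.value)
```

Keep in mind that running `cor.test()` on every pair is itself a multiple-testing situation, so the individual p-values should be read with some caution.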

I Need a T-Test

This is a popular hypothesis test because people want to know if something is better (or worse) than something else, so the initial choice is a t-test. For example, did Group 1 make more money than Group 2, or did more people remember seeing an ad on theater screen 1 versus theater screen 2? When our data don’t meet the t-test assumptions, we can use the Mann-Whitney U test rather than the parametric t-test.

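# Example when the data are in long ("melted") format
library(reshape2)  # provides melt(); the older reshape package works too
raw$id = seq_len(nrow(raw))  # melt() needs an id column, which raw lacks
raw2 = melt(raw, id.vars="id")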

#Mann-Whitney U Test (independent 2-groups)
wilcox.test(raw$X1,raw$X2, paired=FALSE, alternative="two.sided")
wilcox.test(raw2$value~raw2$variable, paired=FALSE, alternative="two.sided")
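
To see the two tests side by side, here is a small sketch on simulated data. The spending figures and the names `group1` and `group2` are invented for illustration; they are drawn from a right-skewed (exponential) distribution, which is the kind of shape where the normality assumption behind the t-test is doubtful.

```r
set.seed(1)
# Hypothetical right-skewed spending data for two independent groups.
group1 <- rexp(25, rate = 1 / 50)
group2 <- rexp(25, rate = 1 / 80)

# Parametric: Welch two-sample t-test (R's default for t.test()).
tt <- t.test(group1, group2)

# Non-parametric: Mann-Whitney U test, which R calls the
# Wilcoxon rank sum test.
mw <- wilcox.test(group1, group2, paired = FALSE, alternative = "two.sided")

print(c(t_test = tt$p.value, mann_whitney = mw$p.value))
```

The two p-values will not generally agree, because the tests answer slightly different questions: the t-test compares means, while the Mann-Whitney test compares ranks.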

Is There Any Association in My Table

When testing tables and counts the first go-to test is the chi-square test. But suppose you have a table like this:

2 1
3 4
n = 10  # n was not defined in the original; 10 matches the table total above
raw.chisq = data.frame( id=seq(1,n),replicate(2, rdiscrete(n, probs=c(1,1), values=c(1,2)) ) )
table(raw.chisq$X1, raw.chisq$X2)

Sure, you can just run your chi-square test and be done with it. But there is one small problem. The assumptions for a chi-square test are not met. Namely, the cell counts are way too small (a common rule of thumb asks for expected counts of at least 5 in each cell). So what’s a researcher to do? This is where Fisher’s Exact Test works well. If we can assume that the marginal totals are fixed, then we can solve the problem this way:

fisher.test(raw.chisq$X1, raw.chisq$X2)
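
Because the simulated data will rarely reproduce the exact 2 1 / 3 4 table, here is a sketch that enters those counts directly and checks the expected cell counts before running the test. The object names `tab`, `expected` and `ft` are mine.

```r
# The 2x2 table from the text, entered column by column.
tab <- matrix(c(2, 3, 1, 4), nrow = 2,
              dimnames = list(X1 = c("1", "2"), X2 = c("1", "2")))
print(tab)

# Expected counts under independence. chisq.test() warns on this table
# because the counts are small, so the warning is suppressed here.
expected <- suppressWarnings(chisq.test(tab))$expected
print(expected)

# Fisher's Exact Test does not rely on the large-sample approximation.
ft <- fisher.test(tab)
print(ft$p.value)
```

Every expected count comes out below 5 for this table, which is exactly the situation where the chi-square approximation is not trustworthy.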

Is There a Difference in My Three Groups

Yes, it’s true, you can run three t-tests on your groups (1 vs 2, 1 vs 3, 2 vs 3). But that causes not only extra work but also problems with the hypothesis test itself: every additional test raises the chance of a false positive somewhere. Plus, why do multiple tests when you can be more efficient in your testing and do just one ANOVA? And when you can’t make the assumptions for the parametric analysis of variance, you can do a non-parametric Kruskal-Wallis rank sum test.

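n = 10  # observations per group (n was not defined in the original)
raw.anova = data.frame( id=seq(1,n),replicate(3, rdiscrete(n, probs=c(1,1,1,1,1), values=c(1,2,3,4,5)) ) )
raw.anova2 = melt(raw.anova, id=c("id"))
# Non-parametric: Kruskal-Wallis rank sum test
kruskal.test(raw.anova2$value ~ raw.anova2$variable)
# Parametric counterpart for comparison: one-way ANOVA fit as a linear model
fit.aov = lm(raw.anova2$value ~ raw.anova2$variable)
anova(fit.aov)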
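
To put a rough number on the problem with three separate tests, and to show one possible follow-up once the overall test has been run, here is a sketch on made-up data. The error-rate arithmetic assumes the three tests are independent, which pairwise comparisons on shared groups are not, so treat it as an approximation. The data frame `scores` and the Holm adjustment are my choices for illustration.

```r
# Chance of at least one false positive across 3 tests at alpha = 0.05,
# assuming (for simplicity) that the tests are independent.
alpha <- 0.05
familywise <- 1 - (1 - alpha)^3
print(familywise)

# Hypothetical 5-point ratings from three groups of 20 respondents.
set.seed(7)
scores <- data.frame(
  group = factor(rep(c("A", "B", "C"), each = 20)),
  value = sample(1:5, 60, replace = TRUE)
)

# One overall test instead of three separate ones.
kw <- kruskal.test(value ~ group, data = scores)
print(kw$p.value)

# If pairwise comparisons are still wanted afterwards, adjust the
# p-values for multiple testing (Holm's method here).
pw <- pairwise.wilcox.test(scores$value, scores$group,
                           p.adjust.method = "holm", exact = FALSE)
print(pw$p.value)
```

The Kruskal-Wallis test on three groups has 2 degrees of freedom, and the pairwise step returns a small matrix of adjusted p-values, one per pair of groups.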

Ultimately, it is important to understand what you’re doing and why. Just because R, SPSS, SAS, etc. gives you a number doesn’t mean it’s correct. If you find that something is significant, make sure you understand what it is saying and what you tested. You don’t want to run an independent two-sample test only to find out that it should have been a matched pairs test.
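
As a small illustration of that last point, here is a sketch where the same two columns are tested both ways. The `before` and `after` ratings are simulated, with each respondent measured twice, so the matched pairs version is the appropriate one for this made-up design.

```r
set.seed(3)
# Hypothetical 5-point ratings from the same 20 respondents, twice.
before <- sample(1:5, 20, replace = TRUE)
after  <- pmin(before + sample(0:1, 20, replace = TRUE), 5)

# Treating the columns as independent groups (Wilcoxon rank sum test).
indep <- wilcox.test(before, after, paired = FALSE, exact = FALSE)

# Treating them as matched pairs (Wilcoxon signed rank test).
paired <- wilcox.test(before, after, paired = TRUE, exact = FALSE)

print(c(independent = indep$p.value, matched_pairs = paired$p.value))
```

Same data, same function, one argument changed, and the two calls run different tests with different p-values.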
