How to compare variances in R?
The F-test is used to test whether two populations (A and B) have the same variance.
When should the F-test be used?
Comparing two variances is useful in a variety of situations, including:
- When you plan to run a two-sample t-test and first need to check whether the variances of the two samples are equal.
- When comparing the variability of a new measurement method to that of an older one. Is the measure’s variability reduced by the new method?
Research questions and statistical hypotheses
Typical research questions are:
- Is the variance of group A ($\sigma^2_A$) equal to the variance of group B ($\sigma^2_B$)?
- Is the variance of group A ($\sigma^2_A$) lower than the variance of group B ($\sigma^2_B$)?
- Is the variance of group A ($\sigma^2_A$) higher than the variance of group B ($\sigma^2_B$)?
In statistical terms, the corresponding null hypotheses ($H_0$) are:
(1) $H_0: \sigma^2_A = \sigma^2_B$
(2) $H_0: \sigma^2_A \le \sigma^2_B$
(3) $H_0: \sigma^2_A \ge \sigma^2_B$
The corresponding alternative hypotheses ($H_a$) are:
(1) $H_a: \sigma^2_A \ne \sigma^2_B$ (different)
(2) $H_a: \sigma^2_A > \sigma^2_B$ (greater)
(3) $H_a: \sigma^2_A < \sigma^2_B$ (less)
Hypothesis 1 is tested with a two-tailed test; hypotheses 2 and 3 are tested with one-tailed tests.
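To make the two-tailed and one-tailed versions concrete, here is a minimal sketch that computes the F statistic and its p-values by hand and cross-checks them against var.test(). The vectors a and b are simulated for illustration only (they are not part of the original example), and the variable names are assumptions.

```r
# Simulated data (hypothetical): group b is drawn with a larger spread.
set.seed(123)
a <- rnorm(30, mean = 10, sd = 2)
b <- rnorm(30, mean = 10, sd = 3)

# F statistic: ratio of the two sample variances
f_stat <- var(a) / var(b)
df1 <- length(a) - 1   # numerator degrees of freedom
df2 <- length(b) - 1   # denominator degrees of freedom

# Lower-tail probability of the observed F under H0 (ratio = 1)
p_lower <- pf(f_stat, df1, df2)

# Two-tailed p-value: twice the smaller tail probability
p_two_sided <- 2 * min(p_lower, 1 - p_lower)

# One-tailed p-values
p_less    <- p_lower       # Ha: var(a) / var(b) < 1
p_greater <- 1 - p_lower   # Ha: var(a) / var(b) > 1

# Cross-check against var.test()
all.equal(p_two_sided, var.test(a, b)$p.value)
```

The F statistic is simply the ratio of the sample variances, so a value far from 1 in either direction counts as evidence against equal variances in the two-tailed case.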
The F-test requires that both samples come from normally distributed populations.
How to compare variances in R
To compare two variances, use the R function var.test() as follows:
var.test(values ~ groups, data, alternative = "two.sided")
var.test(x, y, alternative = "two.sided")
x, y: numeric vectors
alternative: the alternative hypothesis. The allowed values are "two.sided" (default), "greater", or "less".
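As a quick illustration of the two calling styles, the sketch below runs var.test() on simulated data. The vectors x and y and the data frame df are hypothetical and exist only for this example.

```r
# Hypothetical samples: y is drawn with twice the standard deviation of x
set.seed(42)
x <- rnorm(25, sd = 1)
y <- rnorm(25, sd = 2)

# Vector interface: pass the two samples directly
res_vec <- var.test(x, y, alternative = "two.sided")

# Formula interface: one column of values, one grouping column with two levels
df <- data.frame(
  values = c(x, y),
  groups = rep(c("x", "y"), each = 25)
)
res_formula <- var.test(values ~ groups, data = df, alternative = "two.sided")

# Both interfaces report the same F statistic
res_vec$statistic
res_formula$statistic
```

With the formula interface, the ratio is taken as the first level of the grouping variable over the second, so the order of the levels determines the direction of a one-sided test.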
We use R's built-in ToothGrowth data set and store it in a variable named data:
data <- ToothGrowth
To get a sense of how the data looks, we use the sample_n() function in the dplyr package to display a random sample of 10 rows.
library("dplyr")
sample_n(data, 10)
    len supp dose
1  25.5   VC  2.0
2  14.5   VC  1.0
3  14.5   OJ  1.0
4   9.7   OJ  0.5
5  16.5   VC  1.0
6  27.3   OJ  2.0
7   9.4   OJ  0.5
8  22.5   VC  1.0
9  11.2   VC  0.5
10  8.2   OJ  0.5
We want to test whether the two groups defined by the column supp (OJ and VC) have the same variance in len.
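Before running a formal test, it can help to look at the two sample variances directly. This is a small sketch using base R's tapply(); the names group_var and ratio are introduced here for illustration.

```r
data <- ToothGrowth

# Sample variance of len within each level of supp
group_var <- tapply(data$len, data$supp, var)
group_var

# Ratio of the two sample variances (OJ over VC); this is the F statistic
ratio <- unname(group_var["OJ"] / group_var["VC"])
ratio
```

The ratio computed this way should agree with the "ratio of variances" reported in the var.test() output further down.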
Preliminary test to check the F-test assumptions
The F-test is very sensitive to departures from the normality assumption. Before applying it, check that the data in each group are approximately normally distributed.
To check the normality assumption, apply the Shapiro-Wilk test. A Q-Q plot (quantile-quantile plot) can also be used to assess the normality of a variable visually.
A Q-Q plot shows the sample quantiles against the theoretical quantiles of a normal distribution; points that fall close to the reference line suggest approximate normality.
If you doubt the normality of your data, use Levene's test or the Fligner-Killeen test instead, both of which are less sensitive to departures from normality.
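The checks described above might be run as follows. This is a sketch using only base R functions (shapiro.test(), qqnorm(), qqline(), fligner.test()); applying the checks separately per supp group is a choice made for this example, and the object names are assumptions. Levene's test is not in base R, so it is left out here.

```r
data <- ToothGrowth

# Shapiro-Wilk normality test, run separately for each supp group
len_by_supp <- split(data$len, data$supp)
shapiro_res <- lapply(len_by_supp, shapiro.test)
shapiro_res$OJ$p.value
shapiro_res$VC$p.value

# Q-Q plots, one panel per group, each with a reference line
op <- par(mfrow = c(1, 2))
for (g in names(len_by_supp)) {
  qqnorm(len_by_supp[[g]], main = paste("Q-Q plot:", g))
  qqline(len_by_supp[[g]])
}
par(op)

# Fligner-Killeen test of equal variances, less sensitive to non-normality
fk_res <- fligner.test(len ~ supp, data = data)
fk_res
```

A small Shapiro-Wilk p-value in either group is a signal to rely on the Fligner-Killeen result rather than on the F-test.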
Compute the F-test:
res.ftest <- var.test(len ~ supp, data = data)
res.ftest

	F test to compare two variances

data:  len by supp
F = 0.6386, num df = 29, denom df = 29, p-value = 0.2331
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.3039488 1.3416857
sample estimates:
ratio of variances
         0.6385951
The p-value of the F-test is 0.2331, which is greater than the significance level of 0.05. We therefore cannot reject the null hypothesis: there is no significant difference between the two variances.
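The individual numbers in the printed result can also be pulled out of the object returned by var.test(), which is convenient when the decision has to be made in code. A minimal sketch follows; the threshold name alpha is an assumption introduced for illustration.

```r
res.ftest <- var.test(len ~ supp, data = ToothGrowth)

# Ratio of the two sample variances (the point estimate)
res.ftest$estimate

# p-value of the test
res.ftest$p.value

# 95 percent confidence interval for the ratio of variances
res.ftest$conf.int

# Decision at the 5% significance level
alpha <- 0.05
reject_h0 <- res.ftest$p.value < alpha
reject_h0
```

Here reject_h0 is FALSE, matching the conclusion above; the confidence interval also contains 1, which tells the same story.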