A while ago I was playing around with the JavaScript library D3.js,
and I began this visualization of how a one-way ANOVA is calculated,
which I never really finished. I wanted to make the visualization
interactive, and I did integrate some interactive elements. For
instance, if you hover over a data point it will show the residual, and
its value will be highlighted in the combined computation. The circle
diagram shows the partitioning of the sums of squares, and if you hover
over a part it will show where that variation comes from. I tried to make
the plots look like plots from the R package ggplot2.
These plots are not designed to work on mobile phones.
Let’s check the calculations in R
To see if this works, let’s compute the ANOVA as I have described it
here.

# data
grp1 <- c(1, 2, 3, 4)
grp2 <- c(5, 6, 7, 8)
grp3 <- c(9, 10, 11, 12)


# total SS
total_SS <- sum((c(grp1, grp2, grp3) - mean(c(grp1, grp2, grp3)))^2)
total_SS


# within groups SS
within_SS <- sum((c(grp1 - mean(grp1), grp2 - mean(grp2), grp3 - mean(grp3)))^2)
within_SS


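# between groups SS: group size (4) times the squared deviations
# of the group means from the grand mean
between_SS <- 4 * sum((c(mean(grp1), mean(grp2), mean(grp3)) - mean(c(grp1, grp2, grp3)))^2)
between_SS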


# check calculation
between_SS + within_SS == total_SS
[1] TRUE

We see that total_SS, between_SS and within_SS are identical to
what is shown above in the visualization.
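
The same partitioning can be written once for any number of groups, including groups of unequal size. The following is a minimal sketch, not code from the original post: `ss_partition()` is a hypothetical helper name, and it assumes the data arrive as a list with one numeric vector per group.

```r
# Hypothetical helper (not part of the original post): partition the
# total sum of squares for a list of numeric vectors, one per group.
ss_partition <- function(groups) {
  y <- unlist(groups)                  # all observations pooled
  grand_mean <- mean(y)
  group_means <- sapply(groups, mean)  # one mean per group
  group_sizes <- lengths(groups)       # one n per group

  total <- sum((y - grand_mean)^2)
  # squared deviations of each observation from its own group mean
  within <- sum(unlist(lapply(groups, function(g) (g - mean(g))^2)))
  # squared deviations of group means from the grand mean, weighted by n
  between <- sum(group_sizes * (group_means - grand_mean)^2)

  c(total = total, between = between, within = within)
}

ss <- ss_partition(list(c(1, 2, 3, 4), c(5, 6, 7, 8), c(9, 10, 11, 12)))
ss

# the between and within parts should add up to the total
isTRUE(all.equal(ss[["between"]] + ss[["within"]], ss[["total"]]))
```

For the three groups used here this should give a total SS of 143, split into 128 between groups and 15 within groups. The check uses `all.equal()` rather than `==` because with less convenient data the sums can differ by floating-point rounding.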

df1 <- 3 - 1 # number of groups - 1
df2 <- 12 - 3 # N - number of groups
F <- (between_SS / df1) / (within_SS / df2)
F


1 - pf(F, df1, df2) # p-value
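
The F-test step can also be written so that the degrees of freedom follow from the number of groups and observations. This is a small sketch under stated assumptions: the sums of squares are entered by hand from the ANOVA table (128 and 15), and the names `k`, `N`, `F_value` and `p_value` are introduced here for illustration only. It uses `F_value` because `F` is R's built-in shorthand for `FALSE`, and it asks `pf()` for the upper tail directly instead of subtracting from 1.

```r
# Sums of squares entered by hand (assumed from the ANOVA table)
between_SS <- 128
within_SS <- 15

k <- 3    # number of groups
N <- 12   # total number of observations

df1 <- k - 1   # between-groups degrees of freedom
df2 <- N - k   # within-groups degrees of freedom

# F is the ratio of the two mean squares
F_value <- (between_SS / df1) / (within_SS / df2)
F_value

# upper-tail probability of the F distribution, i.e. the p-value
p_value <- pf(F_value, df1, df2, lower.tail = FALSE)
p_value
```

Using `lower.tail = FALSE` gives the same quantity as `1 - pf(...)`, but avoids subtracting a number very close to 1 when the p-value is tiny.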

Let’s compare this to anova()

df <- data.frame(y = c(grp1, grp2, grp3))
df$group <- gl(3, 4)
anova(lm(y ~ group, df))

Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)
group      2    128  64.000    38.4 3.921e-05 ***
Residuals  9     15   1.667
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
We have identical results.
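
Rather than comparing the printed table by eye, the values can be pulled out of the `anova()` result and checked in code. The sketch below is not from the original post; it rebuilds the data so that it runs on its own, and `tab` is just an illustrative name. The object returned by `anova()` is a data frame, so its columns can be read by name.

```r
# Rebuild the data so this block runs on its own
grp1 <- c(1, 2, 3, 4)
grp2 <- c(5, 6, 7, 8)
grp3 <- c(9, 10, 11, 12)
df <- data.frame(y = c(grp1, grp2, grp3), group = gl(3, 4))

tab <- anova(lm(y ~ group, data = df))

# Columns of the ANOVA table, first row = group, second row = residuals
tab[["Sum Sq"]]       # between and within sums of squares
tab[["F value"]][1]   # F statistic for the group effect
tab[["Pr(>F)"]][1]    # corresponding p-value

# Compare with the hand-computed sums of squares (128 and 15)
isTRUE(all.equal(tab[["Sum Sq"]], c(128, 15)))
```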
To leave a comment for the author, please follow the link and comment on their blog: R Psychologist.