# Visualizing a One-Way ANOVA using D3.js

From **R Psychologist**, kindly contributed to R-bloggers.

A while ago I was playing around with the JavaScript library D3.js, and I began with this visualization of how a one-way ANOVA is calculated, which I never really finished. I wanted to make the visualization interactive, and I did integrate some interactive elements. For instance, if you hover over a data point it will show the residual, and its value will be highlighted in the combined computation. The circle diagram shows the partitioning of the sums of squares, and if you hover over a part it will show where the variation is coming from. I tried to make the plots look like plots from the R package ggplot2.

*These plots are not designed to work on mobile phones.*

## Let’s check the calculations in R

To see if this works, let’s compute the ANOVA as I have described it here.
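For reference, the three quantities computed below can be written out as follows. The notation is an assumption of this sketch rather than something taken from the visualization: $y_{ij}$ is observation $i$ in group $j$, $\bar{y}_j$ is the mean of group $j$, $\bar{y}$ is the grand mean, $n_j$ is the size of group $j$, and $k$ is the number of groups.

$$
\begin{aligned}
SS_{\text{total}}   &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y})^2 \\
SS_{\text{within}}  &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} (y_{ij} - \bar{y}_j)^2 \\
SS_{\text{between}} &= \sum_{j=1}^{k} n_j \, (\bar{y}_j - \bar{y})^2 \\
SS_{\text{total}}   &= SS_{\text{between}} + SS_{\text{within}}
\end{aligned}
$$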

```r
# data
grp1 <- c(1, 2, 3, 4)
grp2 <- c(5, 6, 7, 8)
grp3 <- c(9, 10, 11, 12)
```

```r
# total SS
total_SS <- sum((c(grp1, grp2, grp3) - mean(c(grp1, grp2, grp3)))^2)
total_SS
```

```
[1] 143
```

```r
# within groups SS
within_SS <- sum((c(grp1 - mean(grp1), grp2 - mean(grp2), grp3 - mean(grp3)))^2)
within_SS
```

```
[1] 15
```

```r
# between groups SS (4 = number of observations per group)
between_SS <- 4 * sum((c(mean(grp1), mean(grp2), mean(grp3)) - mean(c(grp1, grp2, grp3)))^2)
between_SS
```

```
[1] 128
```
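The factor of 4 in the between-groups calculation is the common group size, so that line only applies to balanced designs. Here is a minimal sketch of a more general version, assuming a hypothetical helper `between_ss()` (not part of the original post) that weights each group mean by its own group size:

```r
# Hypothetical helper, not from the original post:
# between-groups SS for groups of any (possibly unequal) size.
between_ss <- function(y, group) {
  grand_mean  <- mean(y)
  group_means <- tapply(y, group, mean)    # one mean per group
  group_sizes <- tapply(y, group, length)  # n_j for each group
  sum(group_sizes * (group_means - grand_mean)^2)
}

# Same data as above, stored as one vector plus a grouping factor
y     <- c(1:4, 5:8, 9:12)
group <- gl(3, 4)
between_ss(y, group)
```

For the balanced data used here this should reproduce the 128 shown above.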

```r
# check calculation
between_SS + within_SS == total_SS
```

```
[1] TRUE
```

We see that *total_SS*, *between_SS* and *within_SS* are identical to
what is shown above in the visualization.
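The `==` check works here because every intermediate value happens to be exactly representable as a double. With less tidy data, rounding error can make an exact comparison fail even when the partition holds. A minimal sketch of a tolerance-based check using base R's `all.equal()`, with the data redefined so the block runs on its own:

```r
grp1 <- c(1, 2, 3, 4)
grp2 <- c(5, 6, 7, 8)
grp3 <- c(9, 10, 11, 12)
y    <- c(grp1, grp2, grp3)

total_SS   <- sum((y - mean(y))^2)
within_SS  <- sum(c(grp1 - mean(grp1), grp2 - mean(grp2), grp3 - mean(grp3))^2)
between_SS <- 4 * sum((c(mean(grp1), mean(grp2), mean(grp3)) - mean(y))^2)

# all.equal() compares with a small numeric tolerance;
# isTRUE() turns its result into a plain TRUE/FALSE
isTRUE(all.equal(between_SS + within_SS, total_SS))
```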

```r
df1 <- 3 - 1   # number of groups - 1
df2 <- 12 - 3  # N - number of groups
F <- (between_SS / df1) / (within_SS / df2)
F
```

```
[1] 38.4
```

```r
1 - pf(F, df1, df2)  # p-value
```

```
[1] 3.921015e-05
```
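Subtracting from 1 is fine for a p-value of this size, but it can lose precision when the upper tail is extremely small. As a hedged alternative, `pf()` can return the upper tail directly through its `lower.tail` argument. The variable names `F_value` and `p_upper` below are made up for this sketch, and the F statistic and degrees of freedom are typed in from the results above:

```r
F_value <- 38.4  # F statistic from above
df1 <- 2         # number of groups - 1
df2 <- 9         # N - number of groups

# Upper-tail probability P(F > F_value), computed without 1 - ...
p_upper <- pf(F_value, df1, df2, lower.tail = FALSE)
p_upper
```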

Let's compare this to `anova()`:

```r
df <- data.frame(y = c(grp1, grp2, grp3))
df$group <- gl(3, 4)
anova(lm(y ~ group, df))
```

```
Analysis of Variance Table

Response: y
          Df Sum Sq Mean Sq F value    Pr(>F)    
group      2    128  64.000    38.4 3.921e-05 ***
Residuals  9     15   1.667                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

We have identical results.
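Rather than comparing by eye, the numbers can also be pulled out of the ANOVA table, which behaves like a data frame. A minimal sketch, assuming the same data as above. The names `dat` and `tab` are made up for this example:

```r
dat <- data.frame(y = c(1:4, 5:8, 9:12), group = gl(3, 4))
tab <- anova(lm(y ~ group, data = dat))

tab[["Sum Sq"]]      # between-groups SS, then within-groups (residual) SS
tab[["F value"]][1]  # F statistic for the group effect
tab[["Pr(>F)"]][1]   # corresponding p-value
```

Each extracted value can then be checked against the hand calculation with `all.equal()`.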
