Don’t teach built-in plotting to beginners (teach ggplot2)

[This article was first published on Variance Explained, and kindly contributed to R-bloggers.]

I have some experience teaching R programming (see, for instance, my online course). One of the atypical choices I make is to start by teaching Hadley Wickham’s ggplot2 package, rather than the built-in R plotting (see these videos).

Often when I mention this choice to others involved in statistics education, they treat it as a mistake to teach a third-party package first, even if they themselves use ggplot2 for their own plotting. Many teachers suggest I’m overestimating their students: “No, see, my students are beginners…”. If I push the point, they might insist I’m not understanding just how much of a beginner these students are, and emphasize that they’re looking to keep it simple and teach the basics, and that students can get to the advanced methods later. The difference between basic plotting and ggplot2 is thus framed as

built-in        ggplot2
“beginner”  vs  “expert”
“basic”     vs  “advanced”
“easy”      vs  “hard”
“simple”    vs  “complicated”

My claim is that this is precisely backwards. ggplot2 is easier to teach beginners, not harder, and makes constructing plots simpler, not more complicated.

This should not be a surprise. ggplot2 aims for abstraction, where the choices you make are the ones that matter for your visualization of the data. This is exactly what you want for beginners. R’s basic plotting, by contrast, makes you construct your graph piece by piece, often making use of more advanced constructs like loops. Besides which, the fact that something was invented first doesn’t make it more basic. The fact that these functions were written in the early 1990s doesn’t somehow guarantee that they’re suited for beginners.
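To make the "piece by piece" point concrete, here is a minimal sketch of one plot built both ways. It assumes the diamonds dataset that ships with ggplot2; the choice of variables, the legend position, and the names cuts, d, and p are illustrative rather than taken from the original post.

```r
library(ggplot2)  # provides the diamonds dataset and ggplot()

# Base graphics: draw an empty frame, then add one group at a time.
cuts <- levels(diamonds$cut)
plot(diamonds$carat, diamonds$price, type = "n",
     xlab = "carat", ylab = "price")
for (i in seq_along(cuts)) {
  d <- diamonds[diamonds$cut == cuts[i], ]  # rows for this cut only
  points(d$carat, d$price, col = i)         # i-th colour of the palette
}
# The legend has to be built by hand and kept in sync with the loop.
legend("topleft", legend = cuts, col = seq_along(cuts), pch = 1)

# ggplot2: declare the mapping once; grouping and legend follow from it.
p <- ggplot(diamonds, aes(carat, price, col = cut)) +
  geom_point()
print(p)
```

In the base version the student has to manage the loop index, the subsetting, and the legend separately. In the ggplot2 version the only decision is which variable maps to which aesthetic.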

Here are a few specific advantages of ggplot2 for beginners:

Basic Plotting

# just getting some data
library(ggplot2)
data(diamonds)

# basic plotting
plot(diamonds$carat, diamonds$price, col = diamonds$color,
    pch = as.numeric(diamonds$cut))

ggplot2

ggplot(diamonds, aes(carat, price, col = color, shape = cut)) +
    geom_point()

Why does it matter how the plot looks? Because you’re not just teaching students how to program in R, you’re teaching them that they should. Learning to program takes effort and investment, and the more compelling the figures you can create very early in the course, the more easily you can convince them it is worth the effort.

This fits into a general principle I find myself arguing over and over, which is that you should teach your students as you would have wanted to be taught. The same goes for something like knitr: even the coders I know who swear by knitr are skeptical when I teach it to beginners. But knitr is an exceptional tool for students, and it makes homework assignments in particular much easier. Why should we keep it from them?
