Base-R Is Alive and Well
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
As many readers of this blog know, I strongly believe that R learners should be taught base-R, not the tidyverse. Eventually the students may settle on using a mix of the two paradigms, but at the learning stage they will benefit from the fact that base-R is simple and more powerful. I’ve written my thoughts in a detailed essay.
One of the most powerful tools in base-R is tapply(), a workhorse of base-R. I give several examples in my essay in which it is much simpler and easier to use that function instead of the tidyverse.
Yet somehow there is a disdain for tapply() among many who use and teach Tidy. To them, the function is the epitome of “what’s wrong with” base-R. The latest example of this attitude arose in Twitter a few days ago, in which two Tidy supporters were mocking tapply(), treating it as a highly niche function with no value in ordinary daily usage of R. They strongly disagreed with my “workhorse” claim, until I showed them that in the code of ggplot2, Hadley has 7 calls to tapply(),
So I did a little investigation of well-known R packages by RStudio and others. The results, which I’ve added as a new section in my essay, are excerpted below.
——————————–
All the breathless claims that Tidy is more modern and clearer, whilc base-R is old-fashioned and unclear, fly in the face of the fact that RStudio developers, and authors of other prominent R packages, tend to write in base-R, not Tidy. And all of them use some base-R instead of the corresponding Tidy constructs.
package | *apply() calls | mutate() calls |
brms | 333 | 0 |
broom | 38 | 58 |
datapasta | 31 | 0 |
forecast | 82 | 0 |
future | 71 | 0 |
ggplot2 | 78 | 0 |
glmnet | 92 | 0 |
gt | 112 | 87 |
knitr | 73 | 0 |
naniar | 3 | 44 |
parsnip | 45 | 33 |
purrr | 10 | 0 |
rmarkdown | 0 | 0 |
RSQLite | 14 | 0 |
tensorflow | 32 | 0 |
tidymodels | 8 | 0 |
tidytext | 5 | 6 |
tsibble | 8 | 19 |
VIM | 117 | 19 |
Striking numbers to those who learned R via a tidyverse course. In particular, mutate() is one of the very first verbs one learns in a Tidy course, yet mutate() is used 0 times in most of the above packages. And even in the packages in which this function is called a lot, they also have plenty of calls to base-R *apply(), functions which Tidy is supposed to replace.
Now, why do these prominent R developers often use base-R, rather than the allegedly “modern and clearer” Tidy? Because base-R is easier.
And if it’s easier for them, it’s even further easier for R learners. In fact, an article discussed later in this essay, aggressively promoting Tidy, actually accuses students who use base-R instead of Tidy as taking the easy way out. Easier, indeed!
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.