R, academia and the democratization of statistics

December 12, 2011

(This article was first published on Quantum Forest » rblogs, and kindly contributed to R-bloggers)

I am not a statistician but I use statistics, teach some statistics and write about applications of statistics in biological problems.

Last week I was in this biostatistics conference, talking with a Ph.D. student who was surprised about this situation because I didn’t have any statistical training. I corrected “any formal training”. On the first day one of the invited speakers was musing about the growing number of “amateurs” using statistics—many times wrongly—and about what biostatisticians could offer as professional value-adding. Yes, he was talking about people like me spoiling the party.

Twenty years ago it was more difficult to say “cool, I have some data and I will run some statistical analyses” because (and you can easily see where I am going here) access to statistical software was difficult. You were among the lucky ones if you were based at a university or a large company, because you had access to SAS, SPSS, MINITAB, etc. However, you were out of luck outside of these environments, because there was no way to easily afford a personal licence, not for a hobby at least. This greatly limited the pool of people that could afford to muck around with stats.

Gratuitous picture: spiral in Sydney

Enter R and other free (sensu gratis) software that allowed us to skip the specialist, skip the university or the large organization. Do you need a formal degree in statistics to start running analyses? Do you even need to go through a university (for any degree, it doesn’t really matter) to do so? There are plenty of resources to start, download R or a matrix language, get online tutorials and books, read, read, read and ask questions in email lists or fora when you get stuck. If you are a clever cookie—and some of you clearly are one—you could easily cover as much ground as someone going through a university degree. It is probably still not enough to settle down, but it is a great start and a great improvement over the situation twenty years ago.

This description leaves three groups in trouble, trying to sort out their “value-adding” ability: academia, (bio)statisticians and software makers. What are universities offering, that’s unique enough, to justify the time and money invested by individuals and governments? If you make software, What makes it special? For how long can you rely on tradition and inertia so people don’t switch to something else? What’s so special about your (bio) statistical training to justify having one “you” in the organization? Too many questions, so better I go to sleep.

To leave a comment for the author, please follow the link and comment on their blog: Quantum Forest » rblogs.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Tags: , , ,

Comments are closed.


Mango solutions

RStudio homepage

Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training



CRC R books series

Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)