R is indispensable, because it’s reproducible

August 31, 2010

(This article was first published on Revolutions, and kindly contributed to R-bloggers)

Maria Wolters, self-styled "Science-Mum of two" and speech and language technology researcher, has a great blog post about the one tool she couldn't live without: R. Maria says R is her "favourite tool for analysing experimental results and modelling the resulting patterns of behaviour and preferences", and explains why:

R is a programming language for everything statistical. It’s free, it’s open source, and it’s being maintained by statisticians for statisticians. Its origin means that it is a pain to learn. It takes a while until one has cleared a path through the data structures, including the various conventions for extracting information from objects that store the results of painstaking statistical analyses, and I am still often baffled myself.

But the payoff is magnificent. Clear (modulo coding ability), open, replicable analyses. R is the ultimate in replicable research. If you give people your data set and your source code, they can repeat every single step of your reasoning. There are no paywalls, no limits of affordability, no packages that are indispensable for the analysis, but that your department hasn’t paid for.

This issue of "replicable analysis" is an important one. It is crucial to know that you can re-run your analysis at any time in the future (assuming you still have access to the same hardware, or at least a virtual instance of it) and verify the results, without having to worry about the software no longer being available. It also means that third parties can reproduce your results where necessary. That this kind of reproducibility really is necessary to support good science is the topic that Fritz Leisch covered in this excellent keynote speech at this year's useR! conference.

Speech and Science: The one tool I couldn’t live without

