RStudio and "Advanced R Development"

February 5, 2014

(This article was first published on R-Chart, and kindly contributed to R-bloggers)



Well, I am tuning back in after a brief hiatus (what's a year-or-two-or-four between friends?). Among other things, I just wrapped up a book for O'Reilly Media (Client-Server Web Apps with JavaScript and Java) due out in the spring. Please buy a copy or fifty if you are into that sort of thing. Another project involving R has begun, and as work has commenced I have been catching up on the latest developments in the R community.

There have been two developments that are particularly noteworthy. A challenge that I have had in the past is that R is quite a bit different from most mainstream programming languages. Most R users are not first and foremost programmers, but students, researchers, statisticians, and scientists. This is part of what makes R and the R community so amazing and unique. But it does result in a significant culture shock for programmers who have set expectations about how software should function and how developers should work. Both new developments present R in a way that is very much in tune with modern software development. Rather than viewing R in isolation, they demonstrate an awareness of R's role in relation to other technologies. They have helped me to understand and appreciate R more and will make the project more appealing to others as well.

The first is the RStudio IDE. There have been text editors and IDEs for R for a while, but RStudio really caught my eye recently. It was introduced way back in 2011 and is an open source project maintained on GitHub. One basic but particularly appealing feature is the great window management it provides. You can work at the interactive prompt as you would with the bare-bones R installation, or you can write and run R code in an editor (with a style based on a TextMate style, which makes me feel at home). Separate panes are used to display plots and help pages as well. Other panes display environment and project settings, so you have immediate visibility and a generally superior presentation compared with simply working at the command line. The IDE includes a debugger and version control integration, which appeals to my sensibilities as a software developer who cannot help but compare R with other languages.

RStudio also includes functionality geared towards producing a more final, polished artifact than simply a plot. You can generate HTML files and entire HTML presentations, as well as PDF documents from Rnw files, that incorporate executable R code, associated output and other page content. The tight coupling between actual executable code and output with a final presentation eliminates the mistakes and tedious tasks so common in presentations that include manually managed code and output. For Rnw files, the magic is done behind the scenes using Sweave or the knitr package to weave them, with pdfLaTeX or XeLaTeX as the typesetting engine. Again, such features lend credence to the idea that R is not simply a specialized tool but a more widely applicable means of communicating about data. The folks at RStudio also have a web-based offering which looks good as well, and a bunch of other interesting products and projects. These also suggest R is useful for a wider user base working with and presenting data to a broader audience.
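To make the weaving step concrete, here is a minimal sketch that uses base R's Sweave() directly rather than anything RStudio-specific. The file name, chunk label, and sample values are all made up for illustration, and this is only an approximation of what the IDE runs on your behalf.

```r
# Minimal sketch: weave a tiny Rnw document with base R's Sweave().
# The file name, chunk label, and numbers are hypothetical.
rnw <- c(
  "\\documentclass{article}",
  "\\begin{document}",
  "The mean of the sample is \\Sexpr{mean(c(2, 4, 6))}.",
  "<<demo, echo=TRUE>>=",
  "x <- c(2, 4, 6)",
  "summary(x)",
  "@",
  "\\end{document}"
)

# Work in a temporary directory so nothing is left behind in the project
workdir <- tempfile("weave-demo")
dir.create(workdir)
rnw_file <- file.path(workdir, "demo.Rnw")
writeLines(rnw, rnw_file)

# Sweave() evaluates the chunk and the \Sexpr{} expression, then writes
# a .tex file into the current working directory
old <- setwd(workdir)
tex_file <- Sweave(rnw_file, quiet = TRUE)
setwd(old)

# The woven LaTeX now contains the computed value, not the R expression
tex_lines <- readLines(file.path(workdir, tex_file))
print(grep("mean of the sample", tex_lines, value = TRUE))
```

The resulting .tex file still has to be typeset to get a PDF, for example with tools::texi2pdf() if a LaTeX installation is available. The point is that the number in the prose is computed at weave time, so the text and the analysis cannot drift apart.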

The other significant development is Hadley Wickham's in-progress book site for "Advanced R Development". I have a few hundred dollars' worth of R books on my shelves, most directed at some specific statistical, R package, or mathematical application. R is presented in several of these texts in a brief introductory chapter or in relation to other statistical software. I do have "Programming with Data" (the S book by John Chambers), which I reviewed a while back and which includes more in-depth language coverage. Suffice it to say, Hadley's book is by far the best introductory presentation of R I have seen anywhere.

He targets an audience of intermediate R developers as well as developers from other languages. In both cases, the book provides a great deal of clarification and context for working with R. His presentation on Data Structures is systematic and pragmatic. A bunch of light bulbs went off in my head as I read this, and it was helpful not only for actively programming in R but also for the more fundamental task of interpreting feedback returned when running R expressions at a prompt. Go take a look if you have not! I am sure that many R users and programmers are familiar with the material, but I have never seen it so well presented. It is a great example of clear and articulate technical communication.
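As a small illustration of the kind of prompt feedback that becomes easier to read once the data structures click, here is a minimal sketch. The values are made up and none of this is taken from the book.

```r
# Illustrative values only; none of these objects come from the book.
v     <- c(1.5, 2, 3)        # atomic vector: every element shares one type
mixed <- c(1, "a", TRUE)     # mixing types in c() coerces to a common type
l     <- list(1, "a", TRUE)  # a list lets each element keep its own type
df    <- data.frame(id = 1:2, name = c("x", "y"), stringsAsFactors = FALSE)

# typeof() reports the underlying storage type
print(typeof(v))
print(typeof(mixed))

# str() gives a compact view of how an object is put together
str(l)
str(df)

# A data frame is a list of equal-length columns under the hood
print(is.list(df))
print(typeof(df$id))
```

Running typeof() and str() on an unfamiliar object is often the quickest way to make sense of what the prompt has just printed.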

So thanks, Hadley and the team at RStudio. Your work greatly complements and enhances what is possible with R.
