It turns out that when people tell you things, you should listen. Like when Joe Rickert of Microsoft says “this is not news, please don’t repeat what I’m about to say”, you should maybe take note and keep your mouth shut.
I’m not quite sure how I missed that, but I did. So on Sunday night I wrote a blog post about what happened at the R summit. And last night Gavin Simpson (@acfagls) tweeted me to ask “what was this R Consortium that I’d mentioned in my post?”. I responded with what Joe had said: this is an organisation contributed to by some big tech companies that work with R, designed to fund R infrastructure projects. I also mentioned a conversation that I’d overheard about a possible replacement for R-Forge built on GitHub, that I guess might have been related. This was talk in a bar, so I hadn’t assumed it was top secret or likely true, and I made it clear I was only repeating gossip.
It turns out that despite me deleting the tweet and editing my blog post, gossip spreads rather quickly on Twitter (who’d have thought), and consequently the news ended up on Computerworld. It could have been worse, I could have ended up on InfoWorld.
Anyway, I spent this evening apologising to Joe Rickert and all the R Consortium members that I could find.
R infrastructure, by which I mean the tools that you use to write R code, publish it, and consume code by others, has traditionally been the responsibility of R-Core. R-Core, as well as developing R itself, maintain CRAN and the mailing lists, not to mention a good number of packages. In all my interactions with R-Core I’ve been very impressed. They are however limited by the fact that there are only 21 of them, which means that the user community outnumbers them by five orders of magnitude. There’s just a fundamental manpower bottleneck in what they can do.
In recent years, RStudio, OpenAnalytics, Revolution Analytics (now part of Microsoft) and Tibco have been working on creating better IDEs for R. (Three of those are part of R Consortium; I’m not sure whether OpenAnalytics intend to join or not.) GitHub and Bitbucket, while not R-specific, have taken over the code management side of things. A load of projects have been made to get R running in places that it was never designed to go (I’m thinking Renjin for R-in-Google-App-Engine, and the projects for running R inside Oracle/MonetDB/SQL Server databases, but there are many more).
For publishing R documents, knitr has taken over the world. As well as RStudio’s RPubs facility, O’Reilly’s Atlas software lets you write in Markdown or AsciiDoc, meaning you can knit a book. I know, I’ve done it.
The trouble is, many of these projects are run by small teams in individual companies, and there hasn’t been a way to grow them into bigger projects. The costs of finding out what users want, and of communicating between groups, were too high.
R Consortium solves this in two ways. Firstly, it involves many of the big corporate players in R. (The R Foundation also gets at least one seat, I believe.) Having all these companies paying to sit at the same table increases the chance that they’ll speak to each other. From their point of view, they save costs by not having to implement everything themselves; for everyone else, we have the benefit of these projects being made publicly available.
The other genius move is to get ideas from the community about what to build. R has suffered a little bit from the open source “if you want something, build it yourself” attitude, so having a place where you can ask other people to build things for you sounds good.
I have really high hopes for the R Consortium, and I’ll be following what they do closely. Assuming I haven’t been blacklisted by them all!*
*Please don’t let me have been blacklisted by them all.