From ORD Sessions to R-Forge in 12 hours with RProtoBuf

October 22, 2009

(This article was first published on dirk.eddelbuettel, and kindly contributed to R-bloggers)

Yesterday, via an invitation from fellow Chicago-area
Google Summer of Code mentor Borja Sotomayor,
I attended
the Second ORD Sessions.
These are happening at the HQ of Inventable where a
couple of technologists and Open Source geeks from the Chicagoland area get together and riff
on code for a few hours after work over some pizza and beer.

Sounded good, and I needed an excuse to try to mix the awesome
Protocol Buffers with my favourite data tool, R. What
are Protocol Buffers? To quote from the Google overview page referenced above:

Protocol buffers are a flexible, efficient, automated mechanism for
serializing structured data – think XML, but smaller, faster, and
simpler. You define how you want your data to be structured once, then you
can use special generated source code to easily write and read your
structured data to and from a variety of data streams and using a variety of
languages. You can even update your data structure without breaking deployed
programs that are compiled against the “old” format.

and later on that page:

Protocol buffers are now Google’s lingua franca for data – at time of
writing, there are 48,162 different message types defined in the Google code
tree across 12,183 .proto files. They’re used both in RPC systems and for
persistent storage of data in a variety of storage systems.

So three hours later, I had an implementation of the ‘addressbook
reader’ C++ example wrapped in a tiny yet complete R package that
passed R CMD check. And one lingua franca for data has met another.
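
For readers who have not seen one, here is a sketch of the kind of .proto definition that the address book example is built around. It is modelled on the addressbook.proto file from Google's tutorial (proto2 syntax); treat the exact field names and tag numbers as illustrative rather than as a verbatim copy of that file.

```proto
// Illustrative sketch modelled on the tutorial's addressbook.proto;
// names and tag numbers are assumptions, not a verbatim copy.
package tutorial;

message Person {
  required string name  = 1;   // the numbers are field tags, not default values
  required int32  id    = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME   = 1;
    WORK   = 2;
  }

  message PhoneNumber {
    required string    number = 1;
    optional PhoneType type   = 2 [default = HOME];
  }

  repeated PhoneNumber phone = 4;  // zero or more phone numbers per person
}

message AddressBook {
  repeated Person person = 1;      // the address book is a list of persons
}
```

Running the protoc compiler with --cpp_out on such a file generates the C++ classes that a reader program links against, which is the piece the R package wraps.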

So before going to bed, I quickly registered a new project at
everybody’s favourite R hosting site, and thanks to the tireless Stefan Theussl (and some
favourable timezone differences) the project was approved and the stanza
available by the time I got up. So I quickly filled the SVN repo and,
presto, we had the
RProtoBuf project at R-Forge
within 12 hours of the ORD Sessions hackfest. I will try to
follow up on RProtoBuf in a couple of days; this may lead to some changes in
my Rcpp R / C++
interface package as well.
