- An optional vectorized API for efficient R programming when dealing with small records.
- Fast C implementations of serialization to and deserialization from typedbytes.
- Other readers and writers, namely csv and text, work much better in vectorized mode.
- Further steps toward better support for structured data (more data frames and fewer lists in the API).
- More forgiving package loading, and bug fixes.
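
To illustrate why a vectorized interface helps with small records, here is a minimal sketch in plain R. It does not use rmr at all: the names `per_record` and `vectorized` are hypothetical and only stand in for a function called once per record versus one called once per batch of records.

```r
# Hypothetical stand-ins for map functions; neither is part of the rmr API.

# Called once per record: pays R function-call overhead for every value.
per_record <- function(v) v^2

# Called once per batch: a single call handles the whole vector of records.
vectorized <- function(vs) vs^2

records <- 1:100000

# Record-at-a-time processing, one function call per record.
slow <- vapply(records, per_record, numeric(1))

# Vectorized processing, one function call for all records.
fast <- vectorized(records)

# Both approaches produce the same values.
stopifnot(isTRUE(all.equal(slow, fast)))
```

Wrapping each variant in `system.time()` should show the per-record version spending most of its time on call overhead, which is the cost a vectorized mode is meant to avoid.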
Also, the documentation has gotten a major overhaul in this version, with pages of combined text, code and graphics generated automatically using the knitr package. (RHadoop lead developer Antonio Piccolboni provides some background on how knitr is used in these documentation guidelines.)
If you haven't taken a look at rmr before, this tutorial by Jeffrey Breen is a great place to get started. Otherwise, check out the wiki pages on the RHadoop GitHub site, linked below.
GitHub: RevolutionAnalytics/RHadoop