Blog Archives

dplyrXdf 0.90 now available

December 6, 2016

by Hong Ooi, Sr. Data Scientist, Microsoft

Version 0.90 of the dplyrXdf package has just been released. dplyrXdf is a package that brings dplyr pipelines and data transformation verbs to Microsoft R Server’s xdf files. This version includes several changes, mostly to address performance and efficiency concerns, which I’ll detail below.

The .outFile argument

All dplyrXdf verbs now...

glmnetUtils: quality of life enhancements for elastic net regression with glmnet

November 1, 2016

The glmnetUtils package provides a collection of tools to streamline the process of fitting elastic net models with glmnet. I wrote the package after a couple of projects where I found myself writing the same boilerplate code to convert a data frame into a predictor matrix and a response vector. In addition to providing a formula interface, it also...

Updated dplyrXdf package brings data munging with pipes to Xdf files

March 16, 2016

by Hong Ooi, Sr. Data Scientist, Microsoft

I’m pleased to announce the release of version 0.62 of the dplyrXdf package, a backend to dplyr that allows the use of pipeline syntax with Microsoft R Server’s Xdf files. This update adds a new verb (persist), fills some holes in support for dplyr verbs, and fixes various bugs.

The persist verb...

Introducing the dplyrXdf package

October 20, 2015

The dplyr package is a popular toolkit for data transformation and manipulation. Over the last year and a half, dplyr has become a hot topic in the R community, for the way in which it streamlines and simplifies many common data manipulation tasks. Out of the box, dplyr supports data frames, data tables (from the data.table package), and the...

