Win-Vector LLC announces new “big data in R” tools
Win-Vector LLC is proud to introduce two important new tool families (with documentation) in the 0.5.0 version of seplyr (also now available on CRAN):
- partition_mutate_se() / partition_mutate_qt(): these are query planners/optimizers that work over dplyr::mutate() assignments. When using big-data systems through R (such as PostgreSQL or Apache Spark) these planners can make your code faster and sequence steps to avoid critical issues (the complementary problems of too long in-mutate dependence chains, of too many mutate steps, and incidental bugs; all explained in the linked tutorials).
- if_else_device(): provides a dplyr::mutate() based simulation of per-row conditional blocks (including conditional assignment). This allows powerful imperative code (such as often seen in porting from SAS) to be directly and legibly translated into performant dplyr::mutate() data flow code that works on Spark (via Sparklyr) and databases.
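
To make the planning idea concrete, here is a minimal base R sketch of partitioning a list of assignments into blocks so that no assignment uses a value created in its own block. It is an illustration only, not seplyr's implementation: partition_steps() and apply_plan() are hypothetical helper names, and the sketch assumes each new column is assigned exactly once.

```r
# Hypothetical helpers for illustration; these are NOT part of seplyr.
# Assumption: each new column name appears once on the left-hand side.

# Group assignments into stages: an expression lands one stage after the
# latest stage that creates any column it reads.
partition_steps <- function(exprs) {
  stage <- integer(0)  # stage number assigned to each new column
  for (nm in names(exprs)) {
    used <- all.vars(parse(text = exprs[[nm]]))
    deps <- intersect(used, names(stage))  # columns created by earlier steps
    stage[[nm]] <- if (length(deps) == 0) 1L else max(stage[deps]) + 1L
  }
  split(exprs[names(stage)], stage)
}

# Apply the plan block by block; within a block every expression sees only
# the columns that existed when the block started.
apply_plan <- function(d, plan) {
  for (block in plan) {
    new_cols <- lapply(block, function(e) eval(parse(text = e), envir = d))
    d[names(new_cols)] <- new_cols
  }
  d
}

exprs <- c(
  max_a   = "pmax(a_1, a_2)",
  min_a   = "pmin(a_1, a_2)",
  range_a = "max_a - min_a",
  scaled  = "range_a / 2"
)

plan <- partition_steps(exprs)
d <- data.frame(a_1 = c(1, 5), a_2 = c(3, 2))
res <- apply_plan(d, plan)
```

Here max_a and min_a share the first block, while range_a and scaled each need a later block because they read columns created earlier. On a remote system each block would correspond to one mutate step, so the plan uses three steps rather than four.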

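The per-row conditional block idea can be sketched in base R as follows. This is an assumption-laden illustration of the general technique (compute the test once, then guard every assignment with it), not the code that if_else_device() generates; the column names x, y, label, and cond are made up for the example.

```r
d <- data.frame(x = c(-1, 2, 0, 5), y = 0, label = "unset",
                stringsAsFactors = FALSE)

# Imperative form (per row), as it might appear in ported code:
#   if (x > 0) { y <- 2 * x; label <- "positive" }
#   else       { label <- "non-positive" }

# Vectorized form: store the test, then guard each assignment with it.
d$cond  <- d$x > 0
d$y     <- ifelse(d$cond, 2 * d$x, d$y)              # "then" assignment
d$label <- ifelse(d$cond, "positive", d$label)       # "then" assignment
d$label <- ifelse(!d$cond, "non-positive", d$label)  # "else" assignment
d$cond  <- NULL                                      # drop the temporary test
```

Each guarded assignment keeps the existing value on rows where its branch does not apply, which is what lets a block of conditional assignments be written as ordinary column expressions. The same expressions could be placed inside mutate steps, at which point the ordering issues described above start to matter.
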
For “big data in R” users these two function families (plus the included support functions and examples) are simple yet game-changing. These tools were developed by Win-Vector LLC to fill gaps identified by Win-Vector and our partners when standing up production-scale R plus Apache Spark projects.
We are happy to share these tools as open source, and very interested in consulting with your teams on developing R/Spark solutions (including porting existing SAS code). For more information please reach out to Win-Vector.
To help teams get started we are supplying the following initial documentation, discussion, and examples:
- Mutate Partitioner package vignette
- if_else_device reference
- “Partition Mutate” article
- “Partitioning Mutate, Example 2” (includes if_else_device) article.