Articles by John Mount

Let’s Have Some Sympathy For The Part-time R User

August 4, 2017 | John Mount

When I started writing about methods for better "parametric programming" interfaces for dplyr for R dplyr users in December of 2016 I encountered three divisions in the audience: dplyr users who had such a need, and wanted such extensions. dplyr users who did not have such a need ("we always know ...
[Read more...]

More documentation for Win-Vector R packages

July 29, 2017 | John Mount

The Win-Vector public R packages now all have new pkgdown documentation sites! (And, a thank-you to Hadley Wickham for developing the pkgdown tool.) Please check them out (hint: vtreat is our favorite). The package sites: cdata replyr seplyr sigr vtre...
[Read more...]

Tutorial: Using seplyr to Program Over dplyr

July 22, 2017 | John Mount

seplyr is an R package that makes it easy to program over dplyr 0.7.*. To illustrate this we will work an example. Suppose you had worked out a dplyr pipeline that performed an analysis you were interested in. For an example we could take something similar to one of the examples ... [Read more...]

seplyr update

July 19, 2017 | John Mount

The development version of my new R package seplyr is performing in practical applications with dplyr 0.7.* much better than even I (the seplyr package author) expected. I think I have hit a very good set of trade-offs, and I have now spent significant time creating documentation and examples. I wish ... [Read more...]

dplyr 0.7 Made Simpler

July 15, 2017 | John Mount

I have been writing a lot (too much) on the R topics dplyr/rlang/tidyeval lately. The reason is: major changes were recently announced. If you are going to use dplyr well and correctly going forward you may need to understand some of the new issues (if you don’t ... [Read more...]

Better Grouped Summaries in dplyr

July 12, 2017 | John Mount

For R dplyr users one of the promises of the new rlang/tidyeval system is an improved ability to program over dplyr itself. In particular to add new verbs that encapsulate previously compound steps into better self-documenting atomic steps. Let’s take a look at this capability. First let’s ... [Read more...]

What is magrittr’s future in the tidyverse?

July 10, 2017 | John Mount

For many R users the magrittr pipe is a popular way to arrange computation and famously part of the tidyverse. The tidyverse itself is a rapidly evolving centrally controlled package collection. The tidyverse authors publicly appear to be interested in re-basing the tidyverse in terms of their new rlang/tidyeval ...
[Read more...]

In praise of syntactic sugar

July 7, 2017 | John Mount

There has been some talk of adding native pipe notation to R (for example here, here, and here). I think a critical aspect of such an extension would be to treat such a notation as syntactic sugar and not insist such a pipe match magrittr semantics, or worse yet give ... [Read more...]

Working With R and Big Data: Use Replyr

July 6, 2017 | John Mount

In our latest R and Big Data article we discuss replyr. Why replyr replyr stands for REmote PLYing of big data for R. Why should R users try replyr? Because it lets you take a number of common working patterns and apply them to remote data (such as databases or ... [Read more...]

Join Dependency Sorting

July 1, 2017 | John Mount

In our latest installment of “R and big data” let’s again discuss the task of left joining many tables from a data warehouse using R and a system called "a join controller" (last discussed here). One of the great advantages to specifying complicated sequences of operations in data (rather ...
[Read more...]

Using wrapr::let() with tidyeval

June 28, 2017 | John Mount

While going over some of the discussion related to my last post I came up with a really neat way to use wrapr::let() and rlang/tidyeval together. Please read on to see the situation and example. Suppose we want to parameterize over a couple of names, one denoting a ... [Read more...]

wrapr Implementation Update

June 18, 2017 | John Mount

Introduction The development version of our R helper function wrapr::let() has switched from string-based substitution to abstract syntax tree based substitution (AST based substitution, or language based substitution). I am looking for some feedback from wrapr::let() users already doing substantial work with wrapr::let(). If you are already ...
[Read more...]

Non-Standard Evaluation and Function Composition in R

June 16, 2017 | John Mount

In this article we will discuss composing standard-evaluation interfaces (SE) and composing non-standard-evaluation interfaces (NSE) in R. In R the package tidyeval/rlang is a tool for building domain specific languages intended to allow easier composition of NSE interfaces. To use it you must know some of its structure and ... [Read more...]

Use a Join Controller to Document Your Work

June 13, 2017 | John Mount

This note describes a useful replyr tool we call a "join controller" (and is part of our "R and Big Data" series, please see here for the introduction, and here for one of our big data courses). When working on real world predictive modeling tasks in production, the ability to join ...
[Read more...]

Managing intermediate results when using R/sparklyr

June 9, 2017 | John Mount

In our latest “R and big data” article we show how to manage intermediate results in non-trivial Apache Spark workflows using R, sparklyr, dplyr, and replyr. Handle management Many Sparklyr tasks involve creation of intermediate or temporary tables. This can be through dplyr::copy_to() and through dplyr::compute(). These ...
[Read more...]

Campaign Response Testing no longer published on Udemy

June 8, 2017 | John Mount

Our free video course Campaign Response Testing is no longer published on Udemy. It remains available for free on YouTube with all source code available from GitHub. I’ll try to correct bad links as I find them. Please read on for the reasons. Udemy recently unilaterally instituted a new ... [Read more...]

More on safe substitution in R

June 7, 2017 | John Mount

Let’s worry a bit about substitution in R. Substitution is very powerful, which means it can be both used and mis-used. However, that does not mean every use is unsafe or a mistake. From Advanced R : substitute: We can confirm the above code performs no substitution: a
[Read more...]

There is usually more than one way in R

June 5, 2017 | John Mount

Python has a fairly famous design principle (from “PEP 20 — The Zen of Python”): There should be one– and preferably only one –obvious way to do it. Frankly in R (especially once you add many packages) there is usually more than one way. As an example we will talk about the ... [Read more...]
