Hijacking R Functions: Changing Default Arguments

[This article was first published on TRinker's R Blog » R, and kindly contributed to R-bloggers.]

I am working on a package that gathers common regular expressions into a canned collection that users can apply without having to know regexes. The package, qdapRegex, has a bunch of functions of the form rm_xxx. The only difference between them is one default parameter: the regular expression pattern. I already had a template function, so all I really needed was to copy that template many times and change one parameter. Cutting and pasting the body of the template function over and over seemed wasteful of code and electronic space… I needed to hijack the template.

Come on, admit it: you’ve all wished you could hijack a function before. Who hasn’t wished the default for data.frame was stringsAsFactors = FALSE? Or that sum defaulted to na.rm = TRUE (OK, maybe the latter is just me)? So, for the task of efficiently hijacking a function and changing its defaults in a manageable, modular way, my mind immediately went to Hadley’s pryr package (Wickham (2014)). I remembered him hijacking functions with the partial function in his Advanced R book.

It worked, except that I couldn’t then change the newly set defaults back. For package writing this was not a good thing (maybe there was a way and I missed it).


A Function Worth Hijacking

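Here’s an example where we attempt to hijack data.frame. (The output below comes from a version of R earlier than 4.0.0, where data.frame converted strings to factors by default; since R 4.0.0 the default is stringsAsFactors = FALSE.)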

dat <- data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yuck a string as a factor

## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: Factor w/ 3 levels "a","b","c": 1 2 3

Typically we’d do something like:

.data.frame <- function(..., row.names = NULL, check.rows = FALSE, check.names = TRUE,
    stringsAsFactors = FALSE) {

    data.frame(..., row.names = row.names, check.rows = check.rows,
        check.names = check.names, stringsAsFactors = stringsAsFactors)

}

dat <- .data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yay!  strings are character

## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: chr  "a" "b" "c"

But for my qdapRegex needs this required a ton of cut and paste, which means lots of extra code in the .R files.


The First Attempt to Hijack a Function

pryr to the rescue

library(pryr)

## The hijack
.data.frame <- pryr::partial(data.frame, stringsAsFactors = FALSE)

dat <- .data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yay! strings are character

## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: chr  "a" "b" "c"

But I can’t change the default back…

.data.frame(x1 = 1:3, x2 = c("a", "b", "c"), stringsAsFactors = TRUE)

## Error: formal argument "stringsAsFactors" matched by multiple actual
## arguments
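One way to see why the override fails: a partial-application helper typically fixes the argument inside the call it builds, so supplying it again passes it twice. The sketch below is a rough base R stand-in written only for illustration (partial_sketch and no_factors are hypothetical names, and this is not pryr's actual implementation); it just reproduces the same kind of failure.

```r
# Hypothetical stand-in for a partial-application helper (NOT pryr's code):
# the fixed arguments are stored and prepended to whatever the caller passes.
partial_sketch <- function(f, ...) {
    fixed <- list(...)
    function(...) do.call(f, c(fixed, list(...)))
}

no_factors <- partial_sketch(data.frame, stringsAsFactors = FALSE)

# Works: stringsAsFactors is supplied once, by the stored arguments
str(no_factors(x1 = 1:3, x2 = c("a", "b", "c")))

# Fails: stringsAsFactors now reaches data.frame twice,
# so we capture the error message instead of stopping
res <- tryCatch(
    no_factors(x1 = 1:3, stringsAsFactors = TRUE),
    error = function(e) conditionMessage(e)
)
res
```

Because the fixed value lives in the call rather than in the function's formals, there is no default left for the user to override.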

Hijacking In Style (formals)

Doomed…

After tinkering with many not-so-reasonable solutions, I asked on stackoverflow.com. In a short time MrFlick responded most helpfully (as he often does) with an answer that used formals to change the formal arguments of a function. I should have thought of it myself, as I’d seen its use in Advanced R as well.

Here I use the answer to make a hijack function. It does exactly what I want: it takes a function and resets its formal arguments as desired.

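hijack <- function (FUN, ...) {
    .FUN <- FUN          # work on a copy so the original function is untouched
    args <- list(...)    # new defaults, supplied as name = value pairs
    ## overwrite each named formal; <<- reaches the .FUN copy in hijack's frame
    invisible(lapply(seq_along(args), function(i) {
        formals(.FUN)[[names(args)[i]]] <<- args[[i]]
    }))
    .FUN                 # return the copy with the new defaults
}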

Let’s see it in action as it changes the defaults but allows the user to still set these arguments…

.data.frame <- hijack(data.frame, stringsAsFactors = FALSE)

dat <- .data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yay! strings are character

## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: chr  "a" "b" "c"

.data.frame(x1 = 1:3, x2 = c("a", "b", "c"), stringsAsFactors = TRUE)

##   x1 x2
## 1  1  a
## 2  2  b
## 3  3  c
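To tie this back to the qdapRegex motivation, here is a minimal sketch of stamping out variants from one template. The names rm_template and rm_vowel_sketch and both patterns are hypothetical, made up for illustration, and are not qdapRegex's actual functions; hijack is repeated from above so the block runs on its own.

```r
# hijack, repeated from the article so this block is self-contained
hijack <- function (FUN, ...) {
    .FUN <- FUN
    args <- list(...)
    invisible(lapply(seq_along(args), function(i) {
        formals(.FUN)[[names(args)[i]]] <<- args[[i]]
    }))
    .FUN
}

# Hypothetical template: remove every match of `pattern` from `text`
rm_template <- function(text, pattern = "[0-9]+", replacement = "") {
    gsub(pattern, replacement, text, perl = TRUE)
}

# A variant that differs from the template only in its default pattern
rm_vowel_sketch <- hijack(rm_template, pattern = "[aeiou]")

rm_vowel_sketch("hijack")                      # uses the new default
rm_vowel_sketch("abc 123", pattern = "[0-9]+") # the default can still be overridden

# The new default is visible in the formals; the template keeps its own
formals(rm_vowel_sketch)$pattern
formals(rm_template)$pattern
```

Each variant is a one-line call rather than a pasted function body, and the template itself is left unchanged.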

Note that Dason suggested an alternative solution that suits some purposes. It is similar to the first approach I described above but requires less copying, because it uses ldots (the ellipsis, ...) to cover the parameters that we don’t want to change. This approach would look something like this:

.data.frame <- function(..., stringsAsFactors = FALSE) {

    data.frame(..., stringsAsFactors = stringsAsFactors)

}

dat <- .data.frame(x1 = 1:3, x2 = c("a", "b", "c"))
str(dat)  # yay!  strings are character

## 'data.frame':    3 obs. of  2 variables:
##  $ x1: int  1 2 3
##  $ x2: chr  "a" "b" "c"

.data.frame(x1 = 1:3, x2 = c("a", "b", "c"), stringsAsFactors = TRUE)

##   x1 x2
## 1  1  a
## 2  2  b
## 3  3  c

This is less verbose than my first approach. It was not the best solution for me, though, because I wanted to document all of the function’s arguments for the package. I believe this approach would limit me to the arguments ... and stringsAsFactors in the documentation (though I didn’t try it with CRAN checks). Depending on the situation, this approach may be ideal.
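The difference in exposed arguments can be checked directly by comparing the formals of the two kinds of function. This is a small sketch; wrapper and hijacked are illustrative names, and the exact set of data.frame arguments depends on your R version, so no output is shown.

```r
# Ellipsis wrapper: only `...` and stringsAsFactors are formal arguments
wrapper <- function(..., stringsAsFactors = FALSE) {
    data.frame(..., stringsAsFactors = stringsAsFactors)
}

# formals-based copy: every argument of data.frame is kept, one default changed
hijacked <- data.frame
formals(hijacked)[["stringsAsFactors"]] <- FALSE

names(formals(wrapper))     # just the two arguments named in the wrapper
names(formals(hijacked))    # same argument names as data.frame itself
names(formals(data.frame))
```

The wrapper hides the remaining arguments behind the ellipsis, while the formals-based copy keeps them all as named arguments.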

References


*Created using the reports package

