The three-dots construct in R

[This article was first published on Burns Statistics » R language, and kindly contributed to R-bloggers].

There is a mechanism that allows variability in the arguments given to R functions.  Technically it is an ellipsis, but it is more commonly called “…”, dots, dot-dot-dot or three-dots.

Basics

The three-dots allows:

  • an arbitrary number and variety of arguments
  • passing arguments on to other functions

Arbitrary arguments

The two prime cases are the c and list functions:

> c
function (..., recursive = FALSE)  .Primitive("c")
> list
function (...)  .Primitive("list")

Both of these allow you to give them as many arguments as you like, and you can name those arguments (which end up as names in the resulting object).

> c(a=42, 73.9, c=NA)
   a         c 
42.0 73.9   NA 
> names(.Last.value)
[1] "a" ""  "c"

Passing arguments on

An example is the apply function.

> args(apply)
function (X, MARGIN, FUN, ...) 
NULL

This has the three-dots as an argument.  It allows you to give additional arguments to the function that is being applied to your array (which is usually a matrix).

myMat <- array(rnorm(15), c(5,3))
myMat[5] <- NA  # single index: the matrix is treated as a vector, so this is element [5,1]

We can apply mean to this matrix with and without the additional argument that says to ignore missing values:

> apply(myMat, 2, mean)
[1]        NA 0.1091856 0.1390755
> apply(myMat, 2, mean, na.rm=TRUE)
[1] 0.3000754 0.1091856 0.1390755

The na.rm does not match any arguments of apply, so it is put into the three-dots.  The function being applied — mean in this case — is called with the three-dots included and hence mean is called with its na.rm argument set.
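The forwarding can be imitated in a few lines. This is only a sketch: col_apply is a hypothetical stand-in written for this illustration (it is not how apply is actually implemented), and the small matrix is made up so that the results are easy to check by hand.

```r
# Hypothetical stand-in for apply(X, 2, FUN, ...); not the real implementation
col_apply <- function(X, FUN, ...) {
    # anything that does not match X or FUN lands in ... and is handed to FUN
    vapply(seq_len(ncol(X)), function(j) FUN(X[, j], ...), numeric(1))
}

m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)   # made-up data with one NA

col_apply(m, mean)                 # the first column mean is NA
col_apply(m, mean, na.rm = TRUE)   # na.rm travels through ... to mean
```

col_apply never mentions na.rm itself; it only passes along whatever it was given.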

By the way: the colMeans function is a more efficient way of doing this operation.
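A quick check, on a small made-up matrix, that the two approaches agree:

```r
m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)   # made-up data with one NA

via_apply    <- apply(m, 2, mean, na.rm = TRUE)
via_colMeans <- colMeans(m, na.rm = TRUE)

all.equal(via_apply, via_colMeans)
```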

Recursive tangent

I had forgotten about the recursive argument to c.  One way to figure out what it does is to experiment — be a hacker in Tao Te Programming terms.

A good guess about recursive is that it has something to do with recursive objects, of which lists are the most common example.  So let’s give c an atomic vector and a list and see what happens by default:

> c(2, list(3:4))
[[1]]
[1] 2

[[2]]
[1] 3 4

The result is a list — the two items are combined as is.  See Circle 8.1.57 of The R Inferno for a more informative example.

What happens when we change the value of recursive?

> c(2, list(3:4), recursive=TRUE)
[1] 2 3 4

Now the result is an atomic vector with the list flattened.  So recursive is about the combining action that the c function does (and not the type of the resulting object).
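The same experiment can be pushed one step further with a more deeply nested list (made up here for illustration), to see that the flattening goes through every level:

```r
nested <- list(1, list(2, list(3, 4)))      # made-up nested list

shallow <- c(0, nested)                     # default: a list, inner nesting kept
flat    <- c(0, nested, recursive = TRUE)   # descends through every level

is.list(shallow)
flat
```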

Another approach to learning about the recursive argument is to read the help file:

?c

But who does that?

Abbreviating argument names

Let’s change our last call to c by a few characters:

> c(2, list(3:4), recur=TRUE)
[[1]]
[1] 2

[[2]]
[1] 3 4

$recur
[1] TRUE

Now R thinks that we want a third component in the result.  If you are either new to R or not lazy, then this will seem natural to you. The rest of us know that, in general, you can abbreviate the names of arguments as long as the abbreviation is still unique among the function’s argument names:

> mean(myMat[,1])
[1] NA
> mean(myMat[,1], na.rm=TRUE)
[1] 0.3000754
> mean(myMat[,1], na=TRUE)
[1] 0.3000754

The three-dots makes such abbreviation problematic.  The rule is that if the argument comes after the three-dots in the function definition, then it cannot be abbreviated: it has to be given by its full name, otherwise it is swallowed by the three-dots.  Such is the case with recursive in c.
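The rule can be seen with two toy functions that differ only in where the three-dots sits. The function names and the verbose argument are invented for this sketch.

```r
# Hypothetical functions: identical except for the position of ...
before_dots <- function(verbose = FALSE, ...) verbose
after_dots  <- function(..., verbose = FALSE) verbose

before_dots(verb = TRUE)     # abbreviation is matched to verbose
after_dots(verb = TRUE)      # verb falls into ..., verbose keeps its default
after_dots(verbose = TRUE)   # the full name is matched
```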

More details of argument matching are presented in Circle 8.1.20 of The R Inferno.

Programming

When you write a function using the three-dots, you always have to pass it to some function (in order for it to be useful).

Very often you pass it to a particular function that you want to be called flexibly.  For example:

function(inputs, col="red", ...)
{
   #stuff
   plot(modified.data, col=col, ...)
}

Here we want to allow the user to change the look of the plot.  We might have allowed col to fall into the three-dots as well, but we wanted to specify a value that is probably not the default.
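A runnable version of that skeleton might look like the following. The name plot_sorted and the sorting step are assumptions made purely for illustration; they stand in for whatever the real function would do.

```r
# Hypothetical wrapper: the name and the sorting step are assumptions
plot_sorted <- function(inputs, col = "red", ...) {
    modified.data <- sort(inputs)          # stand-in for the "#stuff" step
    plot(modified.data, col = col, ...)    # extra graphical arguments pass through
    invisible(modified.data)
}

if (interactive()) {
    plot_sorted(c(3, 1, 2), pch = 19, main = "Passed via the three-dots")
    plot_sorted(c(3, 1, 2), col = "blue")  # overrides the red default
}
```

Here pch and main are not arguments of plot_sorted, so they end up in the three-dots and reach plot unchanged.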

If you need more control, then you can capture the three-dots in a list and proceed:

function(inputs, ...)
{
    dots <- list(...)
    ndots <- length(dots)
    #stuff
}
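Once the three-dots is in a list, it can be inspected like any other list. A minimal sketch, where describe_dots is a hypothetical name and the summary it returns is just one possible choice:

```r
# Hypothetical function: reports how many extra arguments came in and which were named
describe_dots <- function(inputs, ...) {
    dots <- list(...)
    ndots <- length(dots)
    dot.names <- names(dots)
    if (is.null(dot.names)) dot.names <- rep("", ndots)   # no names at all
    list(n = ndots, named = dot.names[nzchar(dot.names)])
}

describe_dots(1:3, a = 1, 2, b = "x")   # three extra arguments, two of them named
describe_dots(1:3)                      # nothing in the three-dots
```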

Deep end

There can only be one three-dots in a function.  What to do when you want flexibility in two or more respects?

There are, of course, numerous strategies.  One of them is to use a single argument to serve for the secondary bit of flexibility.  An example is the heuristic optimization function genopt in the BurStMisc package.

> args(genopt)
function (fun, population, lower = -Inf, upper = Inf, 
    scale = dcontrol["eps"], add.args = NULL, 
    control = genopt.control(...), ...) 
NULL

The arguments that go into the three-dots are meant to be control values — arguments of the genopt.control function.  Additional arguments to the function being optimized go into the add.args argument (given as a list).  The function being optimized is ultimately called using do.call.
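The pattern can be sketched with a toy random search. Everything below (toy.control, toy.opt, sq.dist and their arguments) is hypothetical and far simpler than genopt; it only imitates the argument handling described above.

```r
# Hypothetical control function, playing the role of genopt.control
toy.control <- function(trials = 5, seed = 1) list(trials = trials, seed = seed)

# Hypothetical optimizer: the three-dots feeds the control values,
# add.args carries extra arguments for the function being optimized
toy.opt <- function(fun, start, add.args = NULL,
    control = toy.control(...), ...)
{
    set.seed(control$seed)   # note: this resets the global random seed
    best <- start
    best.val <- do.call(fun, c(list(best), add.args))
    for (i in seq_len(control$trials)) {
        cand <- best + rnorm(length(best))
        val <- do.call(fun, c(list(cand), add.args))
        if (val < best.val) {
            best <- cand
            best.val <- val
        }
    }
    list(par = best, value = best.val, control = control)
}

sq.dist <- function(x, target) sum((x - target)^2)

# trials goes through the three-dots to toy.control; target goes through add.args
res <- toy.opt(sq.dist, start = c(0, 0),
    add.args = list(target = c(1, 2)), trials = 50)
res$control$trials
```

The default value of control is evaluated inside the function, which is why it can refer to the three-dots.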

Circle 8.3.15 of The R Inferno has a little more along these lines.

The post The three-dots construct in R appeared first on Burns Statistics.
