[This article was first published on R – What You're Doing Is Rather Desperate, and kindly contributed to R-bloggers].

“Some R functions have an awful lot of arguments”, you think to yourself. “I wonder which has the most?”

It’s not an original thought: the same question as applied to the R base package is an exercise in the Functions chapter of the excellent Advanced R. Much of the information in this post came from there.

There are lots of R packages. We’ll limit ourselves to those packages which ship with R, and which load on startup. Which ones are they?

What packages load on starting R?
Start a new R session and type search(). Here’s the result on my machine:

search()
[1] ".GlobalEnv" "tools:rstudio" "package:stats" "package:graphics" "package:grDevices" "package:utils" "package:datasets" "package:methods" "Autoloads" "package:base"

We’re interested in the packages with priority = base. Next question:

How can I see and filter for package priority?
You don’t need dplyr for this, but it helps.

library(tidyverse)

# as_tibble() replaces the deprecated as.tibble()
installed.packages() %>%
  as_tibble() %>%
  filter(Priority == "base") %>%
  select(Package, Priority)

# A tibble: 14 x 2
Package   Priority
<chr>     <chr>
1 base      base
2 compiler  base
3 datasets  base
4 graphics  base
5 grDevices base
6 grid      base
7 methods   base
8 parallel  base
9 splines   base
10 stats     base
11 stats4    base
12 tcltk     base
13 tools     base
14 utils     base
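
As noted, dplyr is optional here. A minimal base R sketch, relying only on the priority argument of installed.packages(), returns the same package names (the variable name base_priority is illustrative):

```r
# installed.packages() can filter on priority itself; it returns a
# character matrix with one row per installed package
base_priority <- installed.packages(priority = "base")

# The same two columns as the tibble above, without loading the tidyverse
base_priority[, c("Package", "Priority"), drop = FALSE]
```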


Comparing to the output from search(), we want to look at: stats, graphics, grDevices, utils, datasets, methods and base.
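
The comparison with search() can also be done in code rather than by eye. The sketch below assumes that attached packages appear in search() with a "package:" prefix, as in the output above; the variable names are illustrative.

```r
# Names of attached packages, with the "package:" prefix stripped
attached <- sub("^package:", "", grep("^package:", search(), value = TRUE))

# Packages with priority "base"; row names of the matrix are package names
base_priority <- rownames(installed.packages(priority = "base"))

# Base-priority packages that are attached in this session
startup_pkgs <- intersect(attached, base_priority)
startup_pkgs
```

Packages attached later in the session, such as the tidyverse, drop out because they do not have priority "base".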

How can I see all the objects in a package?
Like this, for the base package. For other packages, just change base to the package name of interest.

ls("package:base")
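
As a quick illustration of swapping in other package names, this sketch counts the objects that ls() reports for a few attached packages. The counts depend on the R version, so no output is shown.

```r
# Packages to inspect; each must be attached, i.e. appear in search()
pkg_names <- c("base", "stats", "utils")

# Number of objects that ls() reports for each package
n_objects <- sapply(pkg_names, function(pkg) {
  length(ls(paste0("package:", pkg)))
})
n_objects
```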


However, not every object in a package is a function. Next question:

How do I know if an object is a function?
The simplest way is to use is.function().

is.function(ls)
[1] TRUE


What if the function name is stored as a character string, such as “ls”? Then we can use get():

is.function(get("ls"))
[1] TRUE


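But wait: what if two packages contain objects with the same name and we have loaded both of those packages? Then we specify the package too, using the pos argument. In this example, Position is a function in base, but in ggplot2 it is a ggproto object rather than a function.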

is.function(get("Position", pos = "package:base"))
[1] TRUE
is.function(get("Position", pos = "package:ggplot2"))
[1] FALSE
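
Combining ls(), get() and is.function() gives the set of functions in a package. A minimal sketch for base follows; the names objs and is_fn are illustrative.

```r
objs <- ls("package:base")

# get() each object from the package by name and test whether it is a function
is_fn <- vapply(objs,
                function(nm) is.function(get(nm, pos = "package:base")),
                logical(1))

# How many objects are functions, and which ones are not
table(is_fn)
objs[!is_fn]
```

The objects that are not functions include constants such as pi and letters.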


So far, so good. Now, to the arguments.

How do I see the arguments to a function?
Now things start to get interesting. In R, the arguments in a function definition are called formals. There is a function of the same name, formals(), which returns the arguments of a function along with any default values. You can also use formalArgs(), from the methods package, which returns a vector with just the argument names:

formalArgs(ls)
[1] "name"      "pos"       "envir"     "all.names" "pattern"   "sorted"
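
To show how formals() differs from formalArgs(), here is a small sketch using a made-up function, greet(), which is not part of any package. Note that ... counts as one formal argument.

```r
# A hypothetical function, defined only for this illustration
greet <- function(name, greeting = "Hello", ...) {
  paste(greeting, name)
}

# formals() returns a pairlist of argument names and their default values
f <- formals(greet)
f$greeting        # the default value, "Hello"

# formalArgs() (from the methods package) returns just the names
formalArgs(greet)

# Either can be used to count the arguments
length(f)
```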


But that won’t work for every function. Let’s try abs():

formalArgs(abs)
NULL


The issue here is that abs() is a primitive function, and primitives don’t have formals. Our next two questions:

How do I know if an object is a primitive?
Hopefully you guessed that one:

is.primitive(abs)
[1] TRUE


How do I see the arguments to a primitive?
You can use args(), and you can pass the output of args() to formals() or formalArgs():

args(abs)
function (x)
NULL

formalArgs(args(abs))
[1] "x"


However, there are a few objects which are primitive functions for which this doesn’t work. Let’s not worry about those.

is.primitive(`:`)
[1] TRUE

formalArgs(args(`:`))
NULL
Warning message:
In formals(fun) : argument is not a function
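
One way to avoid the warning is to check for NULL before asking for formals. The helper below, arg_names(), is a hypothetical name and only a sketch; it returns an empty character vector when no argument list is available.

```r
# Hypothetical helper: argument names for closures and primitives alike
arg_names <- function(f) {
  # Primitives have no formals, so go through args() first
  if (is.primitive(f)) f <- args(f)
  # args() returns NULL for a few primitives, such as `:`
  if (is.null(f)) return(character(0))
  nms <- names(formals(f))
  if (is.null(nms)) character(0) else nms
}

arg_names(ls)    # a closure
arg_names(abs)   # a primitive that args() can describe
arg_names(`:`)   # a primitive for which args() returns NULL
```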


So what was the original question again?
Let’s put all that together. We want to find the base packages which load on startup, list their objects, identify which are functions or primitive functions, list their arguments and count them up.

We’ll create a tibble by pasting the arguments for each function into a comma-separated string, then pulling the string apart using unnest_tokens() from the tidytext package.

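library(tidytext)
library(tidyverse)

pkgs <- installed.packages() %>%
  as_tibble() %>%
  filter(Priority == "base",
         Package %in% c("stats", "graphics", "grDevices", "utils",
                        "datasets", "methods", "base")) %>%
  select(Package) %>%
  rowwise() %>%
  # one comma-separated string of object names per package
  mutate(fnames = paste(ls(paste0("package:", Package)), collapse = ",")) %>%
  # split the string so that each object gets its own row
  unnest_tokens(fname, fnames, token = stringr::str_split,
                pattern = ",", to_lower = FALSE) %>%
  # keep only the objects that are functions
  filter(is.function(get(fname, pos = paste0("package:", Package)))) %>%
  mutate(is_primitive = ifelse(is.primitive(get(fname, pos = paste0("package:", Package))),
                               1,
                               0),
         # look each function up in its own package, so that names masked by
         # another attached package (e.g. filter) are not counted from the wrong one;
         # primitives have no formals, so they go via args()
         num_args = ifelse(is.primitive(get(fname, pos = paste0("package:", Package))),
                           length(formalArgs(args(get(fname, pos = paste0("package:", Package))))),
                           length(formalArgs(get(fname, pos = paste0("package:", Package)))))) %>%
  ungroup()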


That throws out a few warnings where, as noted, args() doesn’t work for some primitives.
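
For comparison, a similar table can be built without tidytext. This is only a sketch in base R, not the code used for the results in this post. The helpers count_args() and count_pkg() are hypothetical names, and functions for which args() returns NULL get NA instead of triggering a warning.

```r
# Packages named in the text; all must be attached, as in a default session
pkg_names <- c("stats", "graphics", "grDevices", "utils",
               "datasets", "methods", "base")

# Hypothetical helper: count arguments, using args() for primitives
count_args <- function(f) {
  if (is.primitive(f)) f <- args(f)    # NULL for a few primitives such as `:`
  if (is.null(f)) return(NA_integer_)
  length(formals(f))
}

# Hypothetical helper: one row per function in an attached package
count_pkg <- function(pkg) {
  env  <- as.environment(paste0("package:", pkg))
  objs <- mget(ls(env), envir = env)   # all objects, keyed by name
  fns  <- Filter(is.function, objs)    # keep only the functions
  data.frame(
    Package      = rep(pkg, length(fns)),
    fname        = as.character(names(fns)),
    is_primitive = vapply(fns, is.primitive, logical(1), USE.NAMES = FALSE),
    num_args     = vapply(fns, count_args, integer(1), USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

arg_counts <- do.call(rbind, lapply(pkg_names, count_pkg))

# Ten functions with the most arguments
head(arg_counts[order(-arg_counts$num_args), ], 10)
```

Packages that contain no functions, such as datasets, contribute zero rows.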

And the winner is –

pkgs %>%
  top_n(10) %>%
  arrange(desc(num_args))

Selecting by num_args
# A tibble: 10 x 4
Package  fname            is_primitive num_args
<chr>    <chr>                   <dbl>    <int>
1 graphics legend                      0       39
2 graphics stars                       0       33
3 graphics barplot.default             0       30
4 stats    termplot                    0       28
6 stats    heatmap                     0       24
7 base     scan                        0       22
8 graphics filled.contour              0       21
9 graphics hist.default                0       21
10 stats    interaction.plot            0       21


– the function legend() from the graphics package, with 39 arguments. From the base package itself, the winner is scan(), with 22 arguments.

Just to wrap up, some histograms of the number of arguments per function, by package, suggesting that the base graphics functions tend to have the most arguments.

pkgs %>%
  ggplot(aes(num_args)) +
  geom_histogram() +
  facet_wrap(~Package, scales = "free_y") +
  theme_bw() +
  labs(x = "arguments", title = "R base function arguments by package")
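
The same comparison can be made numerically. As a rough sketch, the hypothetical helper median_args() below computes the median number of formal arguments among the closures in an attached package; primitives are left out to keep it short. The values depend on the R version, so no output is shown.

```r
# Hypothetical helper: median argument count for closures in a package
median_args <- function(pkg) {
  env      <- as.environment(paste0("package:", pkg))
  fns      <- Filter(is.function, mget(ls(env), envir = env))
  closures <- Filter(Negate(is.primitive), fns)
  median(lengths(lapply(closures, formals)))
}

med <- sapply(c("base", "graphics", "stats"), median_args)
med
```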