Argument Matching Across Languages

Posted on August 5, 2023 by Jonathan Carroll in R bloggers | 0 Comments

[This article was first published on rstats on Irregularly Scheduled Programming, and kindly contributed to R-bloggers].


With Functional Programming, we write functions which take arguments and do something with or based on those arguments. You might not think there’s much to learn given that tiny description of “an argument to a function”, but the syntax and mechanics vary widely between languages and can be intricate.

Let’s say I have some function in R that takes three arguments, x, y, and z, and just prints them out in a string in that order.

r_fun <- function(x, y, z) {
  sprintf("arguments are: %s, %s, %s", x, y, z)
}

Calling this function with good practices (specifying all the argument names in full) would look like this

r_fun(x = "a", y = "b", z = "c")
## [1] "arguments are: a, b, c"

I said “in full” because by default, R will happily do partial matching, so long as it can uniquely figure out which argument you mean

long_args <- function(alphabet = "a to z", altitude = 100) {
  print(sprintf("alphabet: %s", alphabet))
  print(sprintf("altitude: %d", altitude))
}
long_args(alphabet = "[A-Z]", altitude = 50)
## [1] "alphabet: [A-Z]"
## [1] "altitude: 50"

In this case, both arguments start with "al" so it’s ambiguous up to there

long_args(al = "letters")
## Error in long_args(al = "letters"): argument 1 matches multiple formal arguments

but we only need to specify enough letters to disambiguate

long_args(alpha = "LETTERS", alt = 200)
## [1] "alphabet: LETTERS"
## [1] "altitude: 200"

Relying on this behaviour is dangerous, and it’s recommended to turn on warnings when this happens with

options(warnPartialMatchArgs = TRUE)
long_args(alpha = "LETTERS", alt = 200)
## Warning in long_args(alpha = "LETTERS", alt = 200): partial argument match of
## 'alpha' to 'alphabet'
## Warning in long_args(alpha = "LETTERS", alt = 200): partial argument match of
## 'alt' to 'altitude'
## [1] "alphabet: LETTERS"
## [1] "altitude: 200"
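
If you want partial matches to fail loudly in a test or check script rather than just warn, one option is to promote the warning to an error. This is a minimal sketch, assuming the long_args() function from above; the helper name call_strictly() is hypothetical and not part of base R, and it deliberately treats any warning raised by the call as an error.

```r
long_args <- function(alphabet = "a to z", altitude = 100) {
  print(sprintf("alphabet: %s", alphabet))
  print(sprintf("altitude: %d", altitude))
}

# hypothetical helper: evaluate a call with partial-match warnings switched on
# and turn the first warning of any kind into an error
call_strictly <- function(expr) {
  old <- options(warnPartialMatchArgs = TRUE)
  on.exit(options(old), add = TRUE)  # restore the previous setting afterwards
  tryCatch(expr, warning = function(w) stop(conditionMessage(w), call. = FALSE))
}

# full argument names: runs as usual
call_strictly(long_args(alphabet = "LETTERS", altitude = 200))

# partial argument names: now an error instead of a warning
res <- try(call_strictly(long_args(alpha = "LETTERS", alt = 200)), silent = TRUE)
inherits(res, "try-error")
```

Because expr is evaluated lazily inside the helper, the option is already set by the time the call is matched, and on.exit() puts the previous setting back.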

You don’t have to use argument names when calling the function, though - you can just rely on positional arguments

r_fun("a", "b", "c")
## [1] "arguments are: a, b, c"

and this is very commonly done, even though it’s less clear what each value refers to, and it runs the risk that the function changes its argument ordering in an updated version. It works, though.

Extensive sidenote: square-bracket matrix subsetting officially uses the (poorly? traditionally?) named arguments i and j as [i, j] but it actually entirely ignores them and uses positional arguments. The documentation (?`[`) does warn about this

“Note that these operations do not match their index arguments in the standard way: argument names are ignored and positional matching only is used. So m[j = 2, i = 1] is equivalent to m[2, 1] and not to m[1, 2].”

but it would be very easy to get bitten by it if one tried to use the names directly

m <- matrix(1:9, 3, 3, byrow = TRUE)
m
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    4    5    6
## [3,]    7    8    9
m[i = 1, j = 2]
## [1] 2
m[j = 2, i = 1]
## [1] 4

Thomas Lumley notes that

“it used to be that no primitive functions did argument matching by name. "/" and "-" and switch() and some others still don’t. I’m not sure why "[" wasn’t changed in 2.11 when a bunch of primitives got normal argument matching.”

Worse still, perhaps - the seq() function creates a sequence of values. It has the formal arguments with defaults from = 1 and to = 1 so you can calculate

seq(from = 2, to = 5)
## [1] 2 3 4 5

or you can leverage the default of from = 1

seq(to = 5)
## [1] 1 2 3 4 5

However, there are five “forms” in which you can provide arguments to this function and they behave differently. If you only specify the first argument unnamed, it treats this as to despite the first argument being from

seq(5)
## [1] 1 2 3 4 5

which is extra strange, because if you do specify to with its ostensibly default value 1, the sequence is backwards

seq(5, to = 1)
## [1] 5 4 3 2 1
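
When the code should not depend on how many values happen to be passed, base R's seq_len() and seq_along() each do exactly one of these jobs. A short sketch of the difference, where x and one_value are just illustrative data:

```r
x <- c(10, 20, 30)

seq(x)         # a vector of length 3 is treated like seq_along(x)
seq(x[1])      # a single number n is treated like 1:n
seq_along(x)   # always the indices of x
seq_len(3)     # always 1, 2, ..., n (and empty when n is 0)

# the single-value case is where iterating over seq(x) can surprise you
one_value <- 10
length(seq(one_value))        # 10 elements, not 1
length(seq_along(one_value))  # 1 element
```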

Back to our function - a feature that makes R really neat is that you can specify the named arguments in any order

r_fun(z = "c", x = "a", y = "b")
## [1] "arguments are: a, b, c"

If you don’t specify them by name, R will default to positions, so if you name just one (e.g. z) and leave the rest unnamed, R will presume you want the others in positional order

r_fun(z = "c", "a", "b")
## [1] "arguments are: a, b, c"

Where it gets really interesting is you can go back to named arguments further along and again, R will figure out that you mean the remaining unnamed argument

r_fun(z = "c", "b", x = "a")
## [1] "arguments are: a, b, c"
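
To see how R resolves a call like this without running it, match.call() can be given a function and a quoted call; it returns the call with every argument matched and named in full. A small sketch using the r_fun() defined earlier (the variable names resolved and matched are just for illustration):

```r
# r_fun as defined earlier in the post
r_fun <- function(x, y, z) {
  sprintf("arguments are: %s, %s, %s", x, y, z)
}

# match.call() applies R's argument matching to the quoted call
# and returns the call with each argument named
resolved <- match.call(r_fun, quote(r_fun(z = "c", "b", x = "a")))
print(resolved)

# the matched arguments as a named list (dropping the function name itself)
matched <- as.list(resolved)[-1]
matched$y   # the unnamed "b" has been assigned to y
```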

Things change if the function uses the ellipsis ... which captures “any other arguments” when calling the function, often to be passed on to another function. If the function signature has ... then any arguments that can’t be matched to a formal argument, whether unnamed extras or names that don’t match anything, are captured by it. This example function just combines any other arguments into a comma-separated string, if there are any (tested with the under-documented ...length() which returns the number of arguments captured via ...)

dot_f <- function(a = 1, b = 2, ...) {
  print(sprintf("named arguments: %s, %s", a, b))
  if (...length()) {
    print(sprintf("additional arguments: %s", toString(list(...))))
  }
}

You can call this with just the named arguments

dot_f(a = 3, b = 4)
## [1] "named arguments: 3, 4"

or you can add more arguments (no name required)

dot_f(a = 3, b = 4, 5)
## [1] "named arguments: 3, 4"
## [1] "additional arguments: 5"

As before, none of the names are really required, and we can add as many as we want

dot_f(3, 4, 5, 6, 7)
## [1] "named arguments: 3, 4"
## [1] "additional arguments: 5, 6, 7"

We can name them if we want

dot_f(a = 3, b = 4, blah = 5)
## [1] "named arguments: 3, 4"
## [1] "additional arguments: 5"

but here be danger, because those names can be anything and aren’t matched to the actual function, so this works (say, I misspelled an argument name a as A)

dot_f(A = 3, B = 4, 5)
## [1] "named arguments: 5, 2"
## [1] "additional arguments: 3, 4"

Notice that the additional arguments are the ones I named (not those in the function definition); the 5 has been positionally matched to a; and b has taken its default value of 2 because no other arguments were provided.

We can still mix up the ordering of positions, provided everything else matches up

dot_f(3, b = 4, 5)
## [1] "named arguments: 3, 4"
## [1] "additional arguments: 5"
dot_f(3, b = 4, 5, a = 2)
## [1] "named arguments: 2, 4"
## [1] "additional arguments: 3, 5"
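
If silently swallowing a misspelled name is a concern, a function can inspect what landed in ... and refuse anything that carries a name. This is a minimal sketch of that idea, not a recommendation for every function; strict_dot_f() is a hypothetical variant of dot_f() written for illustration.

```r
# hypothetical variant of dot_f() that rejects named arguments captured by ...
strict_dot_f <- function(a = 1, b = 2, ...) {
  extra <- list(...)
  extra_names <- names(extra)                # NULL when nothing in ... is named
  named <- extra_names[nzchar(extra_names)]  # keep only the non-empty names
  if (length(named) > 0) {
    stop("unexpected named argument(s): ", toString(named), call. = FALSE)
  }
  sprintf("named arguments: %s, %s; %d extra", a, b, length(extra))
}

strict_dot_f(3, 4, 5)
## [1] "named arguments: 3, 4; 1 extra"

# the misspelled A and B are now reported instead of being absorbed
msg <- tryCatch(strict_dot_f(A = 3, B = 4, 5), error = function(e) conditionMessage(e))
msg
## [1] "unexpected named argument(s): A, B"
```

The trade-off is that a function written this way can no longer forward named arguments to another function through ..., which is often the reason for having ... in the first place.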

The flexibility in all of this is what encouraged Joe Cheng to use R as an interface to HTML in the form of shiny; he calls R “a bizarrely good host language” and notes that other languages don’t let you do this sort of mixing up of named and positional arguments.

Okay, that’s R - weird and fun, but a lot of flexibility.

I saw this post mentioned in the #rust hashtag on Mastodon and had a look - it surprised me at first because I thought “what do you mean Rust doesn’t have named arguments?”…

I’ve become so used to the inline help from VSCode when I’m writing Rust that I didn’t realise I wasn’t using named arguments.

Here’s a function I wrote for my toy rock-paper-scissors game in Rust

fn play(a: Throw, b: Throw) -> GameResult {
    let result = match a.cmp(&b) {
        Ordering::Equal => GameResult::Tie,
        Ordering::Greater => GameResult::YouWin,
        Ordering::Less => GameResult::YouLose,
    };

    println!("{} {}", "Result:".purple().bold(), result);

    result
}

It has arguments a and b because I did a terrible job naming them - I knew exactly how I planned to use them, so bad luck to anyone else.

Calling that function further down in the code I have

let user = val.user();
let computer = Throw::computer();
play(user, computer);

BUT what I see in the editor has the argument names, unless I switch off hints (which I have bound to holding Ctrl+Alt at the moment)

Toggling inlay hints in VSCode

So, I can’t just rearrange arguments in Rust?

If I define a function with two arguments

>> fn two_args(a: f64, b: &str) -> String {
    let res = format!("all arguments: {a}, {b}");
    res
}

then I can call it

>> two_args(42.0, "forty-two")
"all arguments: 42, forty-two"

Just swapping the arguments obviously fails because 42.0 isn’t a &str and "forty-two" isn’t an f64. But there isn’t a way to say “the value for that argument is this”; I can’t use any of these

two_args(a = 42.0, b = "forty-two")
two_args(a: 42.0, b: "forty-two")

two_args(b = "forty-two", a = 42.0)
two_args(b: "forty-two", a: 42.0)

I suspect the fact that this was a surprise to me means I’m earlier in my Rust learning than I had thought - I clearly haven’t built anything that has functionality I didn’t directly need, because I haven’t had to worry about calling functions in strange ways yet.

There is one loophole… time to break out another cool toy: {rextendr}

library(rextendr)

rust_function(
  'fn two_args(a: f64, b: &str) -> String {
          let res = format!("all arguments: {a}, {b}");
          res
  }'
)

This produces an R function that takes two arguments, a and b which I can call as if it was an R function

two_args(a = 42, b = "forty-two")
## [1] "all arguments: 42, forty-two"

I can call it without argument names

two_args(42, "forty-two")
## [1] "all arguments: 42, forty-two"

and I can swap them

two_args(b = "forty-two", a = 42)
## [1] "all arguments: 42, forty-two"

This is just because the argument matching happens before the values get sent down to the Rust code - the function here is an R function that calls other code internally

two_args
## function (a, b) 
## .Call("wrap__two_args", a, b, PACKAGE = "librextendr1")
## <bytecode: 0x55d873cff7b8>

The idea for this blog post started while I was learning some TypeScript and came across this: https://github.com/gibbok/typescript-book#typescript-fundamental-comparison-rules

“Function parameters are compared by types, not by their names:”

type X = (a: number) => void;
type Y = (a: number) => void;
let x: X = (j: number) => undefined;
let y: Y = (k: number) => undefined;
y = x; // Valid
x = y; // Valid

which initially struck me as strange, and I needed to work through some examples in a live setting. On reflection, I think I see that this is exactly what I would specify in e.g. Haskell - “a function that takes a number”, not “a function with an argument named a which is a number”

x :: Float -> ()

Because technically all functions in Haskell take only a single argument (the notation Int -> Int -> Int reveals this fact nicely, but in practice the notation makes it feel like multiple arguments can be used), there is no way to “pass arguments by name”, but there is a neat way to swap the order of arguments that a function expects to receive: flip

flip :: (a -> b -> c) -> b -> a -> c

>>> flip (++) "hello" "world"
"worldhello"

-- or

>>> "hello" ++ "world"
"helloworld"

Those of you familiar with R’s S3 dispatch functionality will perhaps note that the ‘first’ argument has a special role; it controls exactly which method will be called. If we had some function which was flexible in the sense that it could take several different ‘classes’ and do something different with them, we would write that as

flexi <- function(a, b) {
  UseMethod("flexi")
}

flexi.matrix <- function(a, b) {
  paste0("a is a matrix, b = ", b)
}

flexi.data.frame <- function(a, b) {
  paste0("a is a data.frame, b = ", b)
}

flexi.default <- function(a, b) {
  paste0("a is something else, b = ", b)
}

Now, depending on whether a is a matrix, a data.frame, or something else, one of the ‘methods’ will be called

flexi(a = matrix(), b = 7)
## [1] "a is a matrix, b = 7"
flexi(a = data.frame(), b = 8)
## [1] "a is a data.frame, b = 8"
flexi(a = 1, b = 9)
## [1] "a is something else, b = 9"

even if we swap the order of the arguments in the call

flexi(b = 3, a = matrix())
## [1] "a is a matrix, b = 3"

S4 dispatch goes even further and dispatches based on more than just the class of the first argument. Stuart Lee has a great guide on S4. The point is, you can do something different depending on what you pass to multiple arguments

s4flexi(matrix(), data.frame(), 7)
s4flexi(matrix(), data.frame(), list())
s4flexi(matrix(), data.frame(), NULL)
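
The s4flexi() calls above are shown without a definition, so here is a minimal sketch of what dispatch on more than one argument can look like in S4. The generic pair_info() and its methods are hypothetical and use two arguments with simple classes to keep things short; they are not the definition behind s4flexi().

```r
library(methods)  # already attached in interactive R; explicit for scripts

# hypothetical generic that dispatches on the classes of both arguments
setGeneric("pair_info", function(a, b) standardGeneric("pair_info"))

setMethod("pair_info", signature("numeric", "character"),
          function(a, b) "a is numeric, b is character")

setMethod("pair_info", signature("character", "numeric"),
          function(a, b) "a is character, b is numeric")

# fallback used when no more specific method matches
setMethod("pair_info", signature("ANY", "ANY"),
          function(a, b) "some other combination")

pair_info(1, "x")
## [1] "a is numeric, b is character"
pair_info("x", 1)
## [1] "a is character, b is numeric"
pair_info(TRUE, NULL)
## [1] "some other combination"
```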

Julia has some of the most interesting argument parsing. I love the Haskell-like function declarations - so little boilerplate! We define some function f that takes two arguments

f(a, b) = a + b
## f (generic function with 1 method)
f(4, 5)
## 9

Similar to the Rust situation, though - these aren’t named outside of the function body, so we can’t refer to them either in that order or reversed

f(a = 4, b = 5)
MethodError: no method matching f(; a=4, b=5)
Closest candidates are:
  f(!Matched::Any, !Matched::Any) at none:3 got unsupported keyword arguments "a", "b"

The reason is that Julia uses a Python-esque keyword argument syntax, where positional arguments appear first, followed by any keyword arguments after a ;, so we can define a keyword version as

f(; a, b) = a + b
## f (generic function with 2 methods)
f(a = 4, b = 6)
## 10

Julia is optionally typed, which means we can be flippant with the types here, or we can be very specific - we can specify that a should be an integer and b should be a string. Keyword arguments don’t take part in method dispatch, so this definition replaces the keyword-only method we already defined rather than adding a new one (note that the method count stays at 2). In this case, I want to return a string with the two values

f(; a::Int, b::String) = "$a; $b"
## f (generic function with 2 methods)
f(a = 42, b = "life, universe, everything")
## "42; life, universe, everything"

Since these are now named, we can swap them

f(b = "L, U, E", a = 42)
## "42; L, U, E"

but what’s even more powerful is we can define a general method, and add type-specific methods for whatever combination of argument types we want; the first of these returns an integer, while the other two return strings

g(a, b) = a + b
## g (generic function with 1 method)
g(a::Int, b::String) = "unnamed int, string: $a; $b"
## g (generic function with 2 methods)
g(a::String, b::Int) = "unnamed string, int: $a; $b"
## g (generic function with 3 methods)

Then, depending on what types we provide in each argument, a different method is called

g(3, 2)
## 5
g("abc", 123)
## "unnamed string, int: abc; 123"
g(123, "abc")
## "unnamed int, string: 123; abc"

Similar to S4, but so easy to declare and use! Of course, this doesn’t work if we want these to be named, since Julia only dispatches on positional arguments; keyword arguments are ignored when choosing a method.

As I’m slowly learning APL, I’ve found it interesting that there’s a well-known approach of writing “point-free” (“tacit”) functions which don’t specify arguments at all.

Last of all, I’ve had the pleasure of dealing with C this week including passing a pointer to some object into a function, in which case the value outside of the function is updated. That’s a whole other post I’m working on.

How does your favourite language use arguments? Let me know! I can be found on Mastodon or use the comments below.

devtools::session_info()

## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.2 (2021-11-01)
##  os       Pop!_OS 22.04 LTS
##  system   x86_64, linux-gnu
##  ui       X11
##  language (EN)
##  collate  en_AU.UTF-8
##  ctype    en_AU.UTF-8
##  tz       Australia/Adelaide
##  date     2023-08-06
##  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  package     * version date (UTC) lib source
##  assertthat    0.2.1   2019-03-21 [3] CRAN (R 4.0.1)
##  blogdown      1.17    2023-05-16 [1] CRAN (R 4.1.2)
##  bookdown      0.29    2022-09-12 [1] CRAN (R 4.1.2)
##  brio          1.1.3   2021-11-30 [1] CRAN (R 4.1.2)
##  bslib         0.4.1   2022-11-02 [3] CRAN (R 4.2.2)
##  cachem        1.0.6   2021-08-19 [3] CRAN (R 4.2.0)
##  callr         3.7.3   2022-11-02 [3] CRAN (R 4.2.2)
##  cli           3.4.1   2022-09-23 [3] CRAN (R 4.2.1)
##  crayon        1.5.2   2022-09-29 [3] CRAN (R 4.2.1)
##  DBI           1.1.3   2022-06-18 [3] CRAN (R 4.2.1)
##  devtools      2.4.5   2022-10-11 [1] CRAN (R 4.1.2)
##  digest        0.6.30  2022-10-18 [3] CRAN (R 4.2.1)
##  dplyr         1.0.10  2022-09-01 [3] CRAN (R 4.2.1)
##  ellipsis      0.3.2   2021-04-29 [3] CRAN (R 4.1.1)
##  evaluate      0.18    2022-11-07 [3] CRAN (R 4.2.2)
##  fansi         1.0.3   2022-03-24 [3] CRAN (R 4.2.0)
##  fastmap       1.1.0   2021-01-25 [3] CRAN (R 4.2.0)
##  fs            1.5.2   2021-12-08 [3] CRAN (R 4.1.2)
##  generics      0.1.3   2022-07-05 [3] CRAN (R 4.2.1)
##  glue          1.6.2   2022-02-24 [3] CRAN (R 4.2.0)
##  htmltools     0.5.3   2022-07-18 [3] CRAN (R 4.2.1)
##  htmlwidgets   1.5.4   2021-09-08 [1] CRAN (R 4.1.2)
##  httpuv        1.6.6   2022-09-08 [1] CRAN (R 4.1.2)
##  jquerylib     0.1.4   2021-04-26 [3] CRAN (R 4.1.2)
##  jsonlite      1.8.3   2022-10-21 [3] CRAN (R 4.2.1)
##  JuliaCall     0.17.5  2022-09-08 [1] CRAN (R 4.1.2)
##  knitr         1.40    2022-08-24 [3] CRAN (R 4.2.1)
##  later         1.3.0   2021-08-18 [1] CRAN (R 4.1.2)
##  lifecycle     1.0.3   2022-10-07 [3] CRAN (R 4.2.1)
##  magrittr      2.0.3   2022-03-30 [3] CRAN (R 4.2.0)
##  memoise       2.0.1   2021-11-26 [3] CRAN (R 4.2.0)
##  mime          0.12    2021-09-28 [3] CRAN (R 4.2.0)
##  miniUI        0.1.1.1 2018-05-18 [1] CRAN (R 4.1.2)
##  pillar        1.8.1   2022-08-19 [3] CRAN (R 4.2.1)
##  pkgbuild      1.4.0   2022-11-27 [1] CRAN (R 4.1.2)
##  pkgconfig     2.0.3   2019-09-22 [3] CRAN (R 4.0.1)
##  pkgload       1.3.0   2022-06-27 [1] CRAN (R 4.1.2)
##  prettyunits   1.1.1   2020-01-24 [3] CRAN (R 4.0.1)
##  processx      3.8.0   2022-10-26 [3] CRAN (R 4.2.1)
##  profvis       0.3.7   2020-11-02 [1] CRAN (R 4.1.2)
##  promises      1.2.0.1 2021-02-11 [1] CRAN (R 4.1.2)
##  ps            1.7.2   2022-10-26 [3] CRAN (R 4.2.2)
##  purrr         1.0.1   2023-01-10 [1] CRAN (R 4.1.2)
##  R6            2.5.1   2021-08-19 [3] CRAN (R 4.2.0)
##  Rcpp          1.0.9   2022-07-08 [1] CRAN (R 4.1.2)
##  remotes       2.4.2   2021-11-30 [1] CRAN (R 4.1.2)
##  rextendr    * 0.3.0   2023-05-30 [1] CRAN (R 4.1.2)
##  rlang         1.0.6   2022-09-24 [1] CRAN (R 4.1.2)
##  rmarkdown     2.18    2022-11-09 [3] CRAN (R 4.2.2)
##  rprojroot     2.0.3   2022-04-02 [1] CRAN (R 4.1.2)
##  rstudioapi    0.14    2022-08-22 [3] CRAN (R 4.2.1)
##  sass          0.4.2   2022-07-16 [3] CRAN (R 4.2.1)
##  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.2)
##  shiny         1.7.2   2022-07-19 [1] CRAN (R 4.1.2)
##  stringi       1.7.8   2022-07-11 [3] CRAN (R 4.2.1)
##  stringr       1.5.0   2022-12-02 [1] CRAN (R 4.1.2)
##  tibble        3.1.8   2022-07-22 [3] CRAN (R 4.2.2)
##  tidyselect    1.2.0   2022-10-10 [3] CRAN (R 4.2.1)
##  urlchecker    1.0.1   2021-11-30 [1] CRAN (R 4.1.2)
##  usethis       2.1.6   2022-05-25 [1] CRAN (R 4.1.2)
##  utf8          1.2.2   2021-07-24 [3] CRAN (R 4.2.0)
##  vctrs         0.5.2   2023-01-23 [1] CRAN (R 4.1.2)
##  withr         2.5.0   2022-03-03 [3] CRAN (R 4.2.0)
##  xfun          0.34    2022-10-18 [3] CRAN (R 4.2.1)
##  xtable        1.8-4   2019-04-21 [1] CRAN (R 4.1.2)
##  yaml          2.3.6   2022-10-18 [3] CRAN (R 4.2.1)
## 
##  [1] /home/jono/R/x86_64-pc-linux-gnu-library/4.1
##  [2] /usr/local/lib/R/site-library
##  [3] /usr/lib/R/site-library
##  [4] /usr/lib/R/library
## 
## ──────────────────────────────────────────────────────────────────────────────
