# Peeking inside R functions

This article was first published on **Mark M. Fredrickson**, and kindly contributed to R-bloggers.

R, like all good programming languages, treats functions as first-class objects. Users can create functions, pass them as arguments, and have them returned as the result of other computations. You may be familiar with passing functions as arguments if you have used the `apply` family of functions (e.g. `apply`, `sapply`, `lapply`, `mapply`). For example, to get the median of each column of a data frame:

```
> data(airquality)
> apply(airquality, 2, median)
  Ozone Solar.R    Wind    Temp   Month     Day
     NA      NA     9.7    79.0     7.0    16.0
```

In this example, since some of the columns have `NA` values, the reported medians are also `NA`. We can amend the above example to drop missing values and demonstrate creating our own function to pass to `apply`:

```
> apply(airquality, 2, function(column) {
+     median(column, na.rm = TRUE)
+ })
  Ozone Solar.R    Wind    Temp   Month     Day
   31.5   205.0     9.7    79.0     7.0    16.0
```
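As an aside, `apply` forwards any extra arguments to the function it calls, so for a simple case like this one the anonymous function is optional. A minimal sketch using only base R (the name `col_medians` is just for illustration):

```r
# apply() passes additional named arguments through to FUN,
# so na.rm = TRUE reaches median() for every column.
data(airquality)
col_medians <- apply(airquality, 2, median, na.rm = TRUE)
print(col_medians)
```

Writing the function out by hand, as above, becomes worthwhile once the per-column computation is more than a single call.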

First-class functions are useful in many scenarios. We can use them like objects to hold information. Here is a contrived example that creates functions that increment their argument by a set amount. Observe that each function gets its own value of `n`, which it uses when called:

```
> adder <- function(n) {
+     function(i) {
+         n + i
+     }
+ }
> f1 <- adder(7)
> f2 <- adder(3)
> f1(10)
[1] 17
> f2(10)
[1] 13
```

Another feature of R is that functions carry their source code around with them. If we ever want to know what `f1` does, we can just ask R to print out the source:

```
> f1
function (i)
{
    n + i
}
<environment: 0xcdc600>
```
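The pieces of that printout are also available programmatically. The sketch below uses the base R functions `formals`, `body`, and `all.vars` to list the names that appear in the function body but are not arguments; the variable names `arg_names`, `fn_body`, and `free_vars` are only illustrative.

```r
adder <- function(n) {
  function(i) {
    n + i
  }
}
f1 <- adder(7)

arg_names <- names(formals(f1))  # names of the declared arguments
fn_body   <- body(f1)            # the unevaluated body of the function

# Names used in the body that are not arguments must come from
# somewhere else, such as the environment the function was created in.
free_vars <- setdiff(all.vars(fn_body), arg_names)

print(arg_names)
print(fn_body)
print(free_vars)
```

This is a rough heuristic rather than a full analysis: it only looks at variable names, and does not account for names that are assigned inside the body.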

While the source will show us that a variable named `n` is used, it does not tell us anything about the value of `n`. We know that the values of `n` in the two functions are 7 and 3, respectively, but if functions are created programmatically, say as part of a loop, we might not know what these values are. Luckily, functions also expose their *environments*, the set of variable names and values from the surrounding scope (the `adder` function in the above example). While R does not print out the contents of these environments by default, we can use a simple helper function to peek inside the function scope:

```
> fnpeek <- function(f, name = NULL) {
+     env <- environment(f)
+     if (is.null(name)) {
+         return(ls(envir = env))
+     }
+     if (name %in% ls(envir = env)) {
+         return(get(name, env))
+     }
+     return(NULL)
+ }
> fnpeek(f1)
[1] "n"
> fnpeek(f1, "n")
[1] 7
```
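For a quick look at everything in the enclosing scope at once, the environment can also be converted to a named list. This is a sketch of an alternative, not a replacement for `fnpeek`; the names `peek1` and `peek2` are made up for the example.

```r
adder <- function(n) {
  function(i) {
    n + i
  }
}
f1 <- adder(7)
f2 <- adder(3)

# Convert each closure's enclosing environment to a named list,
# giving every visible name together with its value.
peek1 <- as.list(environment(f1))
peek2 <- as.list(environment(f2))

str(peek1)
str(peek2)
```

Like `ls`, `as.list` skips names that begin with a dot unless asked otherwise (`all.names = TRUE`).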

If you do not have one already, go make a `~/.Rprofile` file and stick this function in there. You will use it. I promise. I recently used it to diagnose a problem that had been bugging me for some time. The problem concerned creating a series of functions. Using the `adder` example above:

```
> adders <- lapply(1:5, adder)
> sapply(adders, function(f) {
+     f(10)
+ })
[1] 15 15 15 15 15
```

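The output should be `11 12 13 14 15`, but instead every function returns 15. Each function does get its own `n`, but that `n` has not yet been evaluated when the function is created; all five end up looking at the same loop variable, which has been overwritten by the time they are called. (This output is from the R version noted at the end of the post. Since R 3.2.0, `lapply` and the other apply functions force the arguments they pass along, so the `lapply` call above now gives `11 12 13 14 15`; the explicit loop below still shows the problem.) For our purposes, the `lapply` call behaved like: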

```
> adders <- vector(mode = "list", length = 5)
> for (i in 1:5) {
+     adders[[i]] <- adder(i)
+ }
> sapply(adders, function(f) {
+     f(10)
+ })
[1] 15 15 15 15 15
```
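The same effect can be reproduced without a loop. R evaluates function arguments lazily, so `n` takes its value the first time it is used, not when `adder` is called. The sketch below reuses `adder` with two illustrative names, `late` and `early`, that are not part of the original example.

```r
adder <- function(n) {
  function(i) {
    n + i
  }
}

x <- 1
late <- adder(x)          # n has not been evaluated yet
x <- 100                  # x changes before the first call
late_result <- late(10)   # n is evaluated now and sees x = 100

y <- 1
early <- adder(y)
first_call <- early(10)   # first use of n, while y is still 1
y <- 100
second_call <- early(10)  # n was already fixed by the first call

print(c(late_result, first_call, second_call))
```

In the loop above, none of the functions is called until the loop has finished, so every `n` is evaluated when `i` already holds its final value.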

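In each iteration, the `i` variable is overwritten with a new value. R evaluates function arguments lazily: `adder(i)` receives a promise to evaluate `i`, and that promise is not forced until `n` is first used. Here that happens only when the functions are finally called, after the loop has finished and `i` is 5, so all the functions effectively share the same value of `n` in the function body. This is a consequence of lazy evaluation rather than of call by reference; once an argument has been evaluated, R behaves like call by value. Usually the delay is not a problem, but in loops, evaluating the argument immediately would have been the correct behavior. Luckily, the workaround is relatively simple: force the evaluation of `n` by assigning it to itself in the local environment of the outer function.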

```
> safe.adder <- function(n) {
+     n <- n
+     function(i) {
+         n + i
+     }
+ }
> safe.adders <- lapply(1:5, safe.adder)
> sapply(safe.adders, function(f) {
+     f(10)
+ })
[1] 11 12 13 14 15
```
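The `n <- n` line works because the assignment evaluates `n` on the spot. Base R also provides `force()`, which does the same job and states the intent more directly. A sketch, using the hypothetical name `forced.adder` and an explicit loop so that the result does not depend on how `lapply` treats its arguments:

```r
forced.adder <- function(n) {
  force(n)  # evaluate n now, while the caller's variable has this value
  function(i) {
    n + i
  }
}

adders <- vector(mode = "list", length = 5)
for (k in 1:5) {
  adders[[k]] <- forced.adder(k)
}

results <- sapply(adders, function(f) {
  f(10)
})
print(results)
```

Each function now keeps the value that `k` had when it was created, even though none of them is called until after the loop ends.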

While not ideal, at least this workaround is relatively simple (especially compared to my last solution) and gets us all the benefits we would expect of first class functions.

*The version of R used in this post was 2.11.1 (2010-05-31)*
