Lesser known purrr tricks

(This article was first published on Econometrics and Free Software, and kindly contributed to R-bloggers)

purrr is a package that extends R’s functional programming capabilities. It brings a lot of new stuff to the table and in this post I show you some of the most useful (at least to me) functions included in purrr.

Getting rid of loops with map()

library(purrr)

numbers <- list(11, 12, 13, 14)

map_dbl(numbers, sqrt)
## [1] 3.316625 3.464102 3.605551 3.741657

You might wonder why this might be preferred to a for loop? It’s a lot less verbose, and you do not need to initialise any kind of structure to hold the result. If you google “create empty list in R” you will see that this is very common. However, with the map() family of functions, there is no need for an initial structure. map_dbl() returns an atomic list of real numbers, but if you use map() you will get a list back. Try them all out!

Map conditionally

map_if()

# Create a helper function that returns TRUE if a number is even
is_even <- function(x){
  !as.logical(x %% 2)
}

map_if(numbers, is_even, sqrt)
## [[1]]
## [1] 11
## 
## [[2]]
## [1] 3.464102
## 
## [[3]]
## [1] 13
## 
## [[4]]
## [1] 3.741657

map_at()

map_at(numbers, c(1,3), sqrt)
## [[1]]
## [1] 3.316625
## 
## [[2]]
## [1] 12
## 
## [[3]]
## [1] 3.605551
## 
## [[4]]
## [1] 14

map_if() and map_at() have a further argument than map(); in the case of map_if(), a predicate function ( a function that returns TRUE or FALSE) and a vector of positions for map_at(). This allows you to map your function only when certain conditions are met, which is also something that a lot of people google for.

Map a function with multiple arguments

numbers2 <- list(1, 2, 3, 4)

map2(numbers, numbers2, `+`)
## [[1]]
## [1] 12
## 
## [[2]]
## [1] 14
## 
## [[3]]
## [1] 16
## 
## [[4]]
## [1] 18

You can map two lists to a function which takes two arguments using map_2(). You can even map an arbitrary number of lists to any function using pmap().

By the way, try this in: `+`(1,3) and see what happens.

Don’t stop execution of your function if something goes wrong

possible_sqrt <- possibly(sqrt, otherwise = NA_real_)

numbers_with_error <- list(1, 2, 3, "spam", 4)

map(numbers_with_error, possible_sqrt)
## [[1]]
## [1] 1
## 
## [[2]]
## [1] 1.414214
## 
## [[3]]
## [1] 1.732051
## 
## [[4]]
## [1] NA
## 
## [[5]]
## [1] 2

Another very common issue is to keep running your loop even when something goes wrong. In most cases the loop simply stops at the error, but you would like it to continue and see where it failed. Try to google “skip error in a loop” or some variation of it and you’ll see that a lot of people really just want that. This is possible by combining map() and possibly(). Most solutions involve the use of tryCatch() which I personally do not find very easy to use.

Don’t stop execution of your function if something goes wrong and capture the error

safe_sqrt <- safely(sqrt, otherwise = NA_real_)

map(numbers_with_error, safe_sqrt)
## [[1]]
## [[1]]$result
## [1] 1
## 
## [[1]]$error
## NULL
## 
## 
## [[2]]
## [[2]]$result
## [1] 1.414214
## 
## [[2]]$error
## NULL
## 
## 
## [[3]]
## [[3]]$result
## [1] 1.732051
## 
## [[3]]$error
## NULL
## 
## 
## [[4]]
## [[4]]$result
## [1] NA
## 
## [[4]]$error
## 
## 
## 
## [[5]]
## [[5]]$result
## [1] 2
## 
## [[5]]$error
## NULL

safely() is very similar to possibly() but it returns a list of lists. An element is thus a list of the result and the accompagnying error message. If there is no error, the error component is NULL if there is an error, it returns the error message.

Transpose a list

safe_result_list <- map(numbers_with_error, safe_sqrt)

transpose(safe_result_list)
## $result
## $result[[1]]
## [1] 1
## 
## $result[[2]]
## [1] 1.414214
## 
## $result[[3]]
## [1] 1.732051
## 
## $result[[4]]
## [1] NA
## 
## $result[[5]]
## [1] 2
## 
## 
## $error
## $error[[1]]
## NULL
## 
## $error[[2]]
## NULL
## 
## $error[[3]]
## NULL
## 
## $error[[4]]
## 
## 
## $error[[5]]
## NULL

Here we transposed the above list. This means that we still have a list of lists, but where the first list holds all the results (which you can then access with safe_result_list$result) and the second list holds all the errors (which you can access with safe_result_list$error). This can be quite useful!

Apply a function to a lower depth of a list

transposed_list <- transpose(safe_result_list)

transposed_list %>% 
    at_depth(2, is_null)
## $result
## $result[[1]]
## [1] FALSE
## 
## $result[[2]]
## [1] FALSE
## 
## $result[[3]]
## [1] FALSE
## 
## $result[[4]]
## [1] FALSE
## 
## $result[[5]]
## [1] FALSE
## 
## 
## $error
## $error[[1]]
## [1] TRUE
## 
## $error[[2]]
## [1] TRUE
## 
## $error[[3]]
## [1] TRUE
## 
## $error[[4]]
## [1] FALSE
## 
## $error[[5]]
## [1] TRUE

Sometimes working with lists of lists can be tricky, especially when we want to apply a function to the sub-lists. This is easily done with at_depth()!

Set names of list elements

name_element <- c("sqrt()", "ok?")

set_names(transposed_list, name_element)
## $`sqrt()`
## $`sqrt()`[[1]]
## [1] 1
## 
## $`sqrt()`[[2]]
## [1] 1.414214
## 
## $`sqrt()`[[3]]
## [1] 1.732051
## 
## $`sqrt()`[[4]]
## [1] NA
## 
## $`sqrt()`[[5]]
## [1] 2
## 
## 
## $`ok?`
## $`ok?`[[1]]
## NULL
## 
## $`ok?`[[2]]
## NULL
## 
## $`ok?`[[3]]
## NULL
## 
## $`ok?`[[4]]
## 
## 
## $`ok?`[[5]]
## NULL

Reduce a list to a single value

reduce(numbers, `*`)
## [1] 24024

reduce() applies the function * iteratively to the list of numbers. There’s also accumulate():

accumulate(numbers, `*`)
## [1]    11   132  1716 24024

which keeps the intermediary results.

This function is very general, and you can reduce anything:

Matrices:

mat1 <- matrix(rnorm(10), nrow = 2)
mat2 <- matrix(rnorm(10), nrow = 2)
mat3 <- matrix(rnorm(10), nrow = 2)
list_mat <- list(mat1, mat2, mat3)

reduce(list_mat, `+`)
##             [,1]       [,2]       [,3]     [,4]      [,5]
## [1,] -2.48530177  1.0110049  0.4450388 1.280802 1.3413979
## [2,]  0.07596679 -0.6872268 -0.6579242 1.615237 0.8231933

even data frames:

df1 <- as.data.frame(mat1)
df2 <- as.data.frame(mat2)
df3 <- as.data.frame(mat3)

list_df <- list(df1, df2, df3)

reduce(list_df, dplyr::full_join)
## Joining, by = c("V1", "V2", "V3", "V4", "V5")
## Joining, by = c("V1", "V2", "V3", "V4", "V5")
##           V1         V2          V3          V4         V5
## 1 -0.6264538 -0.8356286  0.32950777  0.48742905  0.5757814
## 2  0.1836433  1.5952808 -0.82046838  0.73832471 -0.3053884
## 3 -0.8969145  1.5878453 -0.08025176  0.70795473  1.9844739
## 4  0.1848492 -1.1303757  0.13242028 -0.23969802 -0.1387870
## 5 -0.9619334  0.2587882  0.19578283  0.08541773 -1.2188574
## 6 -0.2925257 -1.1521319  0.03012394  1.11661021  1.2673687

Hope you enjoyed this list of useful functions! If you enjoy the content of my blog, you can follow me on twitter.

To leave a comment for the author, please follow the link and comment on their blog: Econometrics and Free Software.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Search R-bloggers


Sponsors

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)