The Fix Is In: Finding infix functions inside contributed R package “utilities” files
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Regular readers will recall the “utility belt” post from back in April of this year. This is a follow-up to a request asking for a list of all the % infix functions in those files.
We’re going to:
- collect up all of the sources
- parse them
- find all the definitions of % infix functions
- write them to a file
We’ll start by grabbing the data from the previous post and looking at it as a refresher:
    library(stringi)
    library(tidyverse)

    utils
Note that we expected the file source to come in handy at a later date and also anticipated the need to revisit that post, so the R data file [←direct link to RDS] includes a file_src column.
Now, let’s find all the source files with at least one infix definition, collect them together and parse them so we can do more code spelunking:
    filter(utils, stri_detect_fixed(file_src, "`%")) %>% # only find sources with infix definitions
      pull(file_src) %>%
      paste0(collapse="\n\n") %>%
      parse(text = ., keep.source=TRUE) -> infix_src

    str(infix_src, 1)
    ## length 1364 expression(dplyr::`%>%`, `%||%`
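To make the filter-then-parse step concrete, here is a minimal sketch on two made-up source strings. The `toy_src` vector is a hypothetical stand-in for the `file_src` column, and base R's `grepl(fixed = TRUE)` is swapped in for `stri_detect_fixed()` so the example needs no packages.

```r
# Two made-up source strings standing in for the file_src column (hypothetical data)
toy_src <- c(
  "`%||%` <- function(x, y) if (is.null(x)) y else x",
  "add_one <- function(x) x + 1"
)

# Keep only sources that contain a backtick followed by a percent sign
has_infix <- grepl("`%", toy_src, fixed = TRUE)

# Glue the survivors together and parse them, keeping source references
toy_parsed <- parse(
  text = paste0(toy_src[has_infix], collapse = "\n\n"),
  keep.source = TRUE
)

length(toy_parsed) # one top-level expression survives the filter
```

Keeping `keep.source = TRUE` matters because the token-level parse data is only recorded when source references are retained.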
We can now take all of that lovely parsed source and tokenize it to work with the discrete elements in a very tidy manner:
    infix_parsed
    ##    line1  col1 line2  col2    id parent token       terminal text
    ##  7     5     1     5    49    51      0 expr        FALSE    ""
    ##  8     5     1     5     6    16     18 SYMBOL      TRUE     `%||%`
    ##  9     5     1     5     6    18     51 expr        FALSE    ""
    ## 10     5     8     5     9    17     51 LEFT_ASSIGN TRUE     <-
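For readers who want to see the token table on something small, the sketch below runs base R's `getParseData()` on a single made-up definition. It is an illustration of the tokenizing idea under that assumption, not necessarily the exact call used to build `infix_parsed`; the `toy_expr` and `toy_tokens` names are hypothetical.

```r
# Parse one made-up infix definition, keeping source references
toy_expr <- parse(
  text = "`%||%` <- function(x, y) if (is.null(x)) y else x",
  keep.source = TRUE
)

# One row per token, with position, id, parent, token type and text
toy_tokens <- getParseData(toy_expr)

# Show only the terminal tokens (the ones that carry actual source text)
toy_tokens[toy_tokens$terminal, c("token", "text")]
```

Non-terminal `expr` rows have empty text, which is why the terminal rows are the ones worth reading when eyeballing the output.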
We just need to find a sequence of tokens that make up a function definition, then whittle those down to ones that look like our % infix names:
pat
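One way to do that kind of sequence search is a simple sliding-window match over the token column. The sketch below is an assumption-laden illustration, not the post's own code: the helper `find_token_runs()`, the pattern `fn_pat`, and the toy input are all hypothetical, and the pattern is inferred from the token order visible in the parse data (SYMBOL, expr, LEFT_ASSIGN, then the function's expr).

```r
# Return the start index of every run in `tokens` that matches `pat` exactly
find_token_runs <- function(tokens, pat) {
  n <- length(tokens) - length(pat) + 1
  if (n < 1) return(integer(0))
  starts <- seq_len(n)
  hits <- vapply(
    starts,
    function(i) all(tokens[i:(i + length(pat) - 1)] == pat),
    logical(1)
  )
  starts[hits]
}

# Assumed token sequence for `name <- function(...) ...`
fn_pat <- c("SYMBOL", "expr", "LEFT_ASSIGN", "expr", "FUNCTION")

# Toy input: one infix definition and one ordinary function
toy_tokens <- getParseData(parse(
  text = c(
    "`%||%` <- function(x, y) if (is.null(x)) y else x",
    "add_one <- function(x) x + 1"
  ),
  keep.source = TRUE
))

fn_starts <- find_token_runs(toy_tokens$token, fn_pat)

# Whittle down to names wrapped in backticks and percent signs
is_infix <- grepl("^`%.*%`$", toy_tokens$text[fn_starts])
toy_infix_defs <- fn_starts[is_infix]

toy_tokens$text[toy_infix_defs]
```

With indices like these, the row at offset 3 from each match is the `expr` holding the function definition, which is the same offset the file-writing step uses.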
Now, write it out to a file so we can peruse the infix functions:
    # nuke a file and fill it with the function definition
    cat("", sep="", file="infix_functions.R")

    walk2(
      getParseText(infix_parsed, infix_parsed$id[infix_defs]), # extract the infix name
      getParseText(infix_parsed, infix_parsed$id[infix_defs + 3]), # extract the function definition body
      ~{
        cat(.x, "
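As a self-contained illustration of the write-out step, the sketch below appends name and body pairs to a file with base R. Everything here is an assumption for demonstration: the `toy_names` and `toy_bodies` vectors are made up, a temporary file replaces `infix_functions.R`, a plain `for` loop stands in for `walk2()`, and the ` <- ` separator is a guess at how a name and body would be joined.

```r
# Made-up infix names and function bodies (hypothetical data)
toy_names <- c("`%||%`", "`%nin%`")
toy_bodies <- c(
  "function(x, y) if (is.null(x)) y else x",
  "function(x, table) !(x %in% table)"
)

# Write to a temporary file so nothing in the working directory is touched
out_file <- tempfile(fileext = ".R")

# Start from an empty file, then append one definition per name/body pair
cat("", file = out_file)
for (i in seq_along(toy_names)) {
  cat(toy_names[i], " <- ", toy_bodies[i], "\n\n",
      sep = "", file = out_file, append = TRUE)
}

# The file should parse back into one expression per definition
length(parse(file = out_file))
```

Re-parsing the output is a cheap sanity check that each extracted definition is still syntactically valid on its own.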
There are 106 of them so you can find the extracted ones in this gist.
Here’s an overview of what you can expect to find:
    # A tibble: 39 x 2
       name                 n
     1 `%||%`              47
     2 `%+%`                7
     3 `%AND%`              4
     4 `%notin%`            4
     5 `%:::%`              3
     6 `%==%`               3
     7 `%!=%`               2
     8 `%*diag%`            2
     9 `%diag*%`            2
    10 `%nin%`              2
    11 `%OR%`               2
    12 `%::%`               1
    13 `%??%`               1
    14 `%.%`                1
    15 `%@%`                1
    16 `%&&%`               1
    17 `%&%`                1
    18 `%+&%`               1
    19 `%++%`               1
    20 `%+|%`               1
    21 `%%`                 1
    23 `%~~%`               1
    24 `%assert_class%`     1
    25 `%contains%`         1
    26 `%din%`              1
    27 `%fin%`              1
    28 `%identical%`        1
    29 `%In%`               1
    30 `%inr%`              1
    31 `%M%`                1
    32 `%notchin%`          1
    33 `%or%`               1
    34 `%p%`                1
    35 `%pin%`              1
    36 `%R%`                1
    37 `%s%`                1
    38 `%sub_in%`           1
    39 `%sub_nin%`          1
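A tally like the one above can be produced from a vector of extracted names. The sketch below uses base R's `table()` on made-up names; the `toy_names` vector is hypothetical and the approach is only one of several that would give the same counts.

```r
# Made-up vector of extracted infix names (hypothetical data)
toy_names <- c("`%||%`", "`%+%`", "`%||%`", "`%nin%`", "`%||%`")

# Count occurrences of each name, most frequent first
name_counts <- sort(table(toy_names), decreasing = TRUE)

name_counts
```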
FIN
If any of those are useful, feel free to PR them in to https://github.com/hrbrmstr/freebase/blob/master/inst/templates/infix-helpers.R (and add yourself to the DESCRIPTION if you do).
Hopefully this provided some further inspiration to continue to use R not only as your language of choice but also as a fun data source.