Searching an R Function’s Source Code

May 1, 2014
By

(This article was first published on librestats » R, and kindly contributed to R-bloggers)

This is not nearly as interesting as it might first sound, but every function in R contains R code; this is true of core R code as well as extension packages. Sometimes the R code is just a very shallow wrapper around some compiled code, such as in sum() and is.null(). Other times, as in lm.fit(), there is a vast expanse of R code.

It’s easy enough to print this source code; simply type in the function name without any parentheses or arguments. A nice way to search through that output from inside of R is to use capture.output() and then use standard regex utilities like grep(). Any standard printing to the R terminal (done via Rprintf) will be captured like a readLines() call, either inside R itself or to a file, depending on function arguments. This is R’s version of redirecting stdout with >, and here the usual caveats apply; i.e., errors and warnings are not captured:

example <- capture.output(print("asdf"))
example
#[1] "[1] \"asdf\""

example <- capture.output(warning("asdf"))
#Warning message:
#In eval(expr, envir, enclos) : asdf
example
#character(0)

example <- capture.output(stop("asdf"))
#Error in eval(expr, envir, enclos) : asdf
example
#character(0)

But otherwise, it behaves exactly like you might expect:

x <- matrix(1:30, nrow=10)
y <- capture.output(x)

y
# [1] "      [,1] [,2] [,3]" " [1,]    1   11   21" " [2,]    2   12   22"
# [4] " [3,]    3   13   23" " [4,]    4   14   24" " [5,]    5   15   25"
# [7] " [6,]    6   16   26" " [7,]    7   17   27" " [8,]    8   18   28"
#[10] " [9,]    9   19   29" "[10,]   10   20   30"

cat(paste(y, "\n"))
#      [,1] [,2] [,3] 
#  [1,]    1   11   21 
#  [2,]    2   12   22 
#  [3,]    3   13   23 
#  [4,]    4   14   24 
#  [5,]    5   15   25 
#  [6,]    6   16   26 
#  [7,]    7   17   27 
#  [8,]    8   18   28 
#  [9,]    9   19   29 
# [10,]   10   20   30 

Clearly this utility makes our original problem completely trivial. For example, say we are interested in the cov() function:

capture.output(cov)
#[1] "function (x, y = NULL, use = \"everything\", method = c(\"pearson\", "     
#[2] "    \"kendall\", \"spearman\")) "                      
#[3] "{"                                                                           
#[4] "    na.method <- pmatch(use, c(\"all.obs\", \"complete.obs\", \"pairwise.complete.obs\", " 
# and so on...

Maybe we want to see all the .Call() lines:

x <- capture.output(cov)
x[grep(x=x, pattern="[.]Call")]
#[1] "        .Call(C_cov, x, y, na.method, method == \"kendall\")"          
#[2] "            .Call(C_cov, Rank(na.omit(x)), NULL, na.method, method == "
#[3] "            .Call(C_cov, Rank(dropNA(x, nas)), Rank(dropNA(y, "        
#[4] "        .Call(C_cov, x, y, na.method, method == \"kendall\")" 

And we can quickly turn this into a useful function using a bit more of R’s expressive sneakiness:

stopper <- function(fun)
{
  stop(paste("in match_src() : function fun='", fun, "' not found", sep=""), call.=FALSE)
}

match_src <- function(fun, pattern, ignore.case=FALSE, perl=FALSE, value=FALSE, fixed=FALSE, useBytes=FALSE, invert=FALSE, remove.comments=TRUE)
{
  ### This is really too complicated, I apologize
  err <- try(test <- is.character(fun), silent=TRUE)
   
  if (inherits(x=err, what="try-error"))
    stopper(fun=deparse(substitute(fun)))
  else if (test)
  {
    err <- try(fun <- eval(as.symbol(fun)), silent=TRUE)
     
    if (inherits(x=err, what="try-error"))
      stopper(fun=fun)
  }
  err <- try(expr=src <- capture.output(fun), silent=TRUE)
  
  if (inherits(x=err, what="try-error"))
    stopper(fun=deparse(substitute(fun)))
   
   
  # Remove comments
  if (remove.comments) # test
  {
    src <- sub(src, pattern="#.*", replacement="")
    
    num.empty <- which(src == "")
    if (length(num.empty) > 0)
      src <- src[-num.empty]
     
    src <- sub(x=src, pattern="[ \t]+$", replacement="")
  }
  
  
  ### Get matches and scrub
  matches <- grep(x=src, pattern=pattern, ignore.case=ignore.case, perl=perl, value=value, fixed=fixed, useBytes=useBytes, invert=invert)
   
  src <- src[matches]
   
  # remove leading and trailing whitespace
  src <- sub(x=src, pattern="^[ \t]+|[ \t]+$", replacement="")
   
  return( src )
}

With example outputs:

match_src(match_src, pattern="comment")
#[1] "function(fun, pattern, ignore.case=FALSE, perl=FALSE, value=FALSE, fixed=FALSE, useBytes=FALSE, invert=FALSE, remove.comments=TRUE)"
#[2] "if (remove.comments)"                                                                                                               

match_src(match_src, pattern="comment", remove.comments=FALSE)
#[1] "function(fun, pattern, ignore.case=FALSE, perl=FALSE, value=FALSE, fixed=FALSE, useBytes=FALSE, invert=FALSE, remove.comments=TRUE)"
#[2] "# Remove comments"                                                                                                                  
#[3] "if (remove.comments) # test"

match_src("match_src", pattern="comment")
#[1] "function(fun, pattern, ignore.case=FALSE, perl=FALSE, value=FALSE, fixed=FALSE, useBytes=FALSE, invert=FALSE, remove.comments=TRUE)"
#[2] "if (remove.comments)"                                                                                                               

match_src(match_srcs, pattern="comment")
#Error: in match_src() : function fun='match_srcs' not found

And here’s everything in a github gist if that’s more your style.

To leave a comment for the author, please follow the link and comment on their blog: librestats » R.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...



If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.

Sponsors

Mango solutions



RStudio homepage



Zero Inflated Models and Generalized Linear Mixed Models with R

Quantide: statistical consulting and training



http://www.eoda.de







ODSC

ODSC

CRC R books series











Contact us if you wish to help support R-bloggers, and place your banner here.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)