Probably the most useful R function I’ve ever written

[This article was first published on R language – Burns Statistics, and kindly contributed to R-bloggers.]

The function in question is scriptSearch. I’m not much for superlatives — “most” and “best” imply one dimension, but we live in a multi-dimensional world. I’m making an exception.

The statistic I have in mind for this use of “useful” is the waiting time between calls to the function divided by the human time saved by the call. The smaller that ratio, the more useful the function.

I wrote a version of this for a company where I do consulting. Few of my working days there pass without at least one call to it. A single use of scriptSearch can easily save half an hour compared to what I would have done before I had the function.

scriptSearch

The two main inputs are:

  • a string to search for
  • a directory to search in

By default it only looks in R scripts in the directory (and its subdirectories).
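
The general idea can be sketched in a few lines of base R. This is not the BurStMisc implementation: simpleScriptSearch, its arguments and the demonstration files are hypothetical names used only for illustration, and the sketch assumes that “R scripts” means files ending in .R or .r.

```r
# Hypothetical helper, not the BurStMisc function: list the R scripts under
# 'path' and report the lines in each that match 'pattern'.
simpleScriptSearch <- function(pattern, path = ".", subdirs = TRUE) {
  # assumption: R scripts are the files whose names end in .R or .r
  files <- list.files(path, pattern = "\\.[rR]$", recursive = subdirs,
                      full.names = TRUE)
  hits <- lapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    grep(pattern, lines, value = TRUE)   # 'pattern' is a regular expression
  })
  names(hits) <- files
  hits[lengths(hits) > 0]                # keep only the files with a match
}

# Small self-contained demonstration in a temporary directory
demoDir <- file.path(tempdir(), "scriptFarm")
dir.create(file.path(demoDir, "sub"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("blimblam <- function(x) x + 1", "y <- 2"),
           file.path(demoDir, "define.R"))
writeLines("z <- blimblam(3)", file.path(demoDir, "sub", "use.R"))
writeLines("blimblam is mentioned here", file.path(demoDir, "notes.txt"))

found <- simpleScriptSearch("blimblam", demoDir)
print(found)
```

The text file is ignored because its name does not end in .R, and passing subdirs = FALSE would restrict the search to the top-level directory.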

Examples of directories to search are:

  • directory holding a large collection of R scripts
  • directory holding the source for local R packages
  • personal directory with lots of subdirectories containing R scripts and functions

Examples of uses are:

  • where is blimblam defined?
  • where are all the uses of splishsplash in the local packages (because I want to change its arguments)?
  • a few weeks ago I created a pdf called factor_history, where is the code that produced that?

These uses might be done with something like:

  • scriptSearch("blimblam *<-", "path/to/scriptFarm", sub=FALSE)
  • scriptSearch("splishsplash", "path/to/Rsource")
  • scriptSearch("factor_history", "..")

You may be confused by the asterisk in the first call. The string to search for can be a regular expression. Here the asterisk applies to the space just before it and means “zero or more spaces”, so the search finds assignments whether or not there is a space between the object name and the assignment arrow.
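
The effect of that pattern can be checked directly with base R's grepl. The candidate lines below are made up for illustration.

```r
# The pattern from the first call: the name, zero or more spaces, then "<-"
pattern <- "blimblam *<-"

candidates <- c("blimblam <- function(x) x",   # one space before the arrow
                "blimblam<- 3",                # no space
                "blimblam   <- 3",             # several spaces
                "y <- blimblam(2)")            # a use, not an assignment

matches <- grepl(pattern, candidates)
print(data.frame(line = candidates, matches = matches))
```

The first three lines match and the last does not, since there the name is followed by a parenthesis rather than by an assignment arrow.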

BurStMisc

scriptSearch was the main motivation for updating the BurStMisc package to version 1.1.  The package is on CRAN.

ntile

The ntile function is also new to BurStMisc.  It returns equally-sized ordered groups from a numeric vector — for example, quintiles or deciles.
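
One way to form such groups is through ranks. The sketch below is not the BurStMisc code: simpleNtile and its arguments are hypothetical, and it assumes a numeric vector with no missing values.

```r
# Hypothetical helper, not BurStMisc::ntile: assign each value to one of
# 'ngroups' ordered groups of (nearly) equal size, using ranks.
simpleNtile <- function(x, ngroups) {
  r <- rank(x, ties.method = "first")      # ranks 1..n, ties broken by position
  ceiling(r * ngroups / length(x))         # group 1 holds the smallest values
}

set.seed(42)
x <- rnorm(20)
groups <- simpleNtile(x, 5)                # quintiles
print(table(groups))
print(split(round(x, 2), groups))
```

When the length of the vector is not a multiple of the number of groups, the group sizes in this sketch differ by at most one.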

A more primitive version of the function appeared in a blog post called “Miles of iles”.  There is some discussion there of alternative functions.

writeExpectTest

While I was preparing the update to BurStMisc, I found that automating the writing of some tests using the testthat package was both warranted and feasible.  The writeExpectTest function is the result.
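
The kind of automation involved might look like the following sketch, which turns an expression and its current value into the text of a testthat expectation. It is an illustration under assumptions, not the BurStMisc function: sketchExpectTest is a hypothetical name, and only expect_equal is taken from testthat.

```r
# Hypothetical helper, not BurStMisc::writeExpectTest: build the text of a
# testthat expectation from an expression and the value it currently returns.
sketchExpectTest <- function(expr) {
  call <- substitute(expr)                       # the unevaluated expression
  value <- eval(call, parent.frame())            # what it returns right now
  paste0("expect_equal(",
         paste(deparse(call), collapse = " "), ", ",
         paste(deparse(value), collapse = " "), ")")
}

testLine <- sketchExpectTest(sum(1:10))
cat(testLine, "\n")
```

The generated line can be pasted into a test file. Since deparse prints doubles to a limited number of digits, such generated tests rely on the numerical tolerance of expect_equal rather than on exact equality.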

corner

The generally useful function that was already in BurStMisc is corner.  This allows you to see a few rows and columns of a large matrix or data frame — or higher dimensional array.
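
For the two-dimensional case the idea reduces to careful subscripting. The sketch below is not the BurStMisc code: peekCorner and its n argument are hypothetical, and it handles only matrices and data frames, not higher dimensional arrays.

```r
# Hypothetical helper, not BurStMisc::corner: show the first 'n' rows and
# columns of a matrix or data frame.
peekCorner <- function(x, n = 5) {
  rows <- seq_len(min(n, nrow(x)))
  cols <- seq_len(min(n, ncol(x)))
  x[rows, cols, drop = FALSE]               # drop = FALSE keeps two dimensions
}

bigMatrix <- matrix(seq_len(1000 * 200), nrow = 1000, ncol = 200)
topLeft <- peekCorner(bigMatrix)
print(topLeft)
print(peekCorner(mtcars, 3))                # data frames work the same way
```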

Epilogue

I want to spread the news
That if it feels this good getting used
You just keep on using me

— from “Use Me”  by Bill Withers
