An example of monkey patching a package

August 1, 2013

(This article was first published on R2D2, and kindly contributed to R-bloggers)

  • 2013-07-11

Scope

This article is about R package development.

Motivation

In the same spirit as my previous post, A dirty hack for importing packages that use Depends, I wanted to use an earlier version of the excellent gdata package in one of my packages, but as an Import instead of a Depend.

At that time gdata had a bug that prevented certain functions from being used through Import. I would like to show you the hack I used to make it work. (Gregory Warnes has now fixed gdata by releasing a new version, and I am very grateful for his responsiveness and efficiency, and for allowing me to use gdata as an illustration.)

Analysis of the problem

gdata used path.package("gdata") to locate its files at runtime. But path.package can only locate a package that is attached (to the search path), and that is exactly what we want to avoid.
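The difference between the two functions can be seen with a small sketch. It uses the tools package, which ships with R, as a stand-in for gdata (an assumption made only so that the example runs without gdata installed); the variable names are illustrative.

```r
# Load a namespace WITHOUT attaching it to the search path.
pkg <- "tools"   # stand-in for gdata; normally loaded on demand, not attached
invisible(loadNamespace(pkg))

attached <- paste0("package:", pkg) %in% search()

# find.package() searches the loaded namespaces and the library paths,
# so it works whether or not the package is attached.
found_dir <- find.package(pkg)

# path.package() only knows about attached packages; it signals an error
# when none of the requested packages is attached.
via_path <- tryCatch(path.package(pkg), error = function(e) NA_character_)

cat("attached:     ", attached, "\n")
cat("find.package: ", found_dir, "\n")
cat("path.package: ", via_path, "\n")
```

If the package happens to be attached in your session, both calls return a directory; the interesting case is the unattached one.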

The idea of the hack

The function that locates a package directory when the package is not attached is find.package. What we would like is to somehow modify the gdata package at runtime, replacing its calls to path.package with calls to find.package.

When executing a call to a gdata function, R locates the path.package symbol by looking first in gdata's namespace, then in gdata's imports environment, then in the base namespace, and finally along the search path.
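This lookup chain can be printed by walking parent.env() from a namespace up to the empty environment. The sketch below uses the stats namespace as a stand-in for gdata (an assumption, so that it runs on any R installation); the same loop applies to any loaded namespace.

```r
# Walk the chain of enclosing environments, starting from a namespace.
ns <- getNamespace("stats")   # stand-in for getNamespace("gdata")

chain <- character()
env <- ns
repeat {
    chain <- c(chain, environmentName(env))
    if (identical(env, emptyenv())) break
    env <- parent.env(env)
}

# The namespace comes first, then its imports environment, then the base
# namespace, then the global environment and the attached packages.
print(chain)
```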

Because all these namespaces are sealed, and because the prime directive is not to modify the search path, we cannot alter or add a definition in those environments. But we can insert a new environment somewhere in the chain between gdata's namespace and the base environment, and have it provide an alternate definition of path.package.

Illustration

We are going to insert a new environment between gdata's imports environment and its parent. This environment will contain a symbol named path.package whose definition is that of find.package (i.e. path.package will point to find.package).

[Figure: the environment chain before the patch]

[Figure: the environment chain after the patch]
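The same before/after picture can be reproduced on a toy chain of ordinary environments. In this sketch, ns_like, imports_like, shim and shout are hypothetical names, and toupper/tolower play the roles of path.package/find.package; no real namespace is touched.

```r
# Toy chain: ns_like -> imports_like -> base environment
imports_like <- new.env(parent = baseenv())
ns_like <- new.env(parent = imports_like)

# A "package function" whose free variable toupper is resolved
# through the chain above.
shout <- function(x) toupper(x)
environment(shout) <- ns_like

before <- shout("Hello")   # toupper is found in base: "HELLO"

# Insert a shim between imports_like and its parent, in which the
# name toupper points to tolower instead.
shim <- new.env(parent = parent.env(imports_like))
assign("toupper", tolower, envir = shim)
parent.env(imports_like) <- shim

after <- shout("Hello")    # toupper is now found in the shim: "hello"
```

The function itself is never modified; only the place where its symbol is looked up changes.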

Implementation

To achieve this, we will provide a custom .onLoad function in MyPkg, which is executed when the package is loaded. We also define a special symbol to avoid repeated insertion of a new environment in the case that our package is unloaded then reloaded.

.onLoad <- function(libname, pkgname) {

    # Monkey patch so that gdata uses find.package instead of path.package
    gdata_imports <- parent.env(getNamespace('gdata'))
    current <- parent.env(gdata_imports)

    HACK <- '__hack__'
    # insert the new env between gdata's imports and the base namespace,
    # but only if not already done (inherits=FALSE: look in this env only)
    if (! exists(HACK, envir=current, inherits=FALSE) ) {
        # make a new env, with path.package pointing to find.package
        env <- new.env(parent=current)
        assign(HACK, TRUE, envir=env)
        assign('path.package', find.package, envir=env)   ### define path.package=find.package
        parent.env(gdata_imports) <- env                  ### insert our custom env
    }
}
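The role of the marker symbol can be checked in isolation. The following sketch wraps the same logic in a hypothetical helper, insert_shim_once (not part of gdata or of the original package), and applies it twice to a toy environment rather than to a real namespace.

```r
# Insert a shim environment above `child`, unless one carrying the
# marker is already there. Returns TRUE if a shim was inserted.
insert_shim_once <- function(child, bindings, marker = "__hack__") {
    current <- parent.env(child)
    if (exists(marker, envir = current, inherits = FALSE)) {
        return(invisible(FALSE))   # already patched: do nothing
    }
    shim <- list2env(bindings, envir = new.env(parent = current))
    assign(marker, TRUE, envir = shim)
    parent.env(child) <- shim
    invisible(TRUE)
}

imports_like <- new.env(parent = baseenv())   # toy stand-in for gdata's imports

first  <- insert_shim_once(imports_like, list(path.package = find.package))
second <- insert_shim_once(imports_like, list(path.package = find.package))

cat("first call inserted a shim: ", first, "\n")
cat("second call inserted a shim:", second, "\n")
```

The second call is what happens when the package is unloaded and then reloaded: the marker is found and the chain is left alone.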

Conclusion

This is an example of the kind of trick we can use in pursuit of trustworthy computation. Once again, this trick is no longer needed with the current version of gdata.

Karl Forner @ Quartz Bio (with help from Gregory Warnes)
