Running UBSAN tests via clang with Rocker

[This article was first published on Thinking inside the box , and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Every now and then we get reports from CRAN about our packages failing a test there. A challenging one concerns UBSAN, or Undefined Behaviour Sanitizer. For background on UBSAN, see this RedHat blog post for gcc and this one from LLVM about clang.

I had written briefly about this before in a blog post introducing the sanitizers package for tests, as well as the corresponding package page for sanitizers, which clearly predates our follow-up Rocker.org repo / project described in this initial announcement and when we became the official R container for Docker.

Rocker had support for SAN testing, but UBSAN was not working yet. So following a recent CRAN report against our RcppAnnoy package, I was unable to replicate the error and asked for help on r-devel in this thread.

Martyn Plummer and Jan van der Laan kindly sent their configurations in the same thread and off-list; Jeff Horner did so too following an initial tweet offering help. None of these worked for me, but further trials eventually lead me to the (already mentioned above) RedHat blog post with its mention of -fno-sanitize-recover to actually have an error abort a test. Which, coupled with the settings used by Martyn, were what worked for me: clang-3.5 -fsanitize=undefined -fno-sanitize=float-divide-by-zero,vptr,function -fno-sanitize-recover.

This is now part of the updated Dockerfile of the R-devel-SAN-Clang repo behind the r-devel-ubsan-clang. It contains these settings, as well a new support script check.r for littler—which enables testing right out the box.

Here is a complete example:

docker                              # run Docker (any recent version, I use 1.2.0)
  run                               # launch a container 
    --rm                            # remove Docker temporary objects when dome
    -ti                             # use a terminal and interactive mode 
    -v $(pwd):/mnt                  # mount the current directory as /mnt in the container
    rocker/r-devel-ubsan-clang      # using the rocker/r-devel-ubsan-clang container
  check.r                           # launch the check.r command from littler (in the container)
    --setwd /mnt                    # with a setwd() to the /mnt directory
    --install-deps                  # installing all package dependencies before the test
    RcppAnnoy_0.0.5.tar.gz          # and test this tarball

I know. It is a mouthful. But it really is merely the standard practice of running Docker to launch a single command. And while I frequently make this the /bin/bash command (hence the -ti options I always use) to work and explore interactively, here we do one better thanks to the (pretty useful so far) check.r script I wrote over the last two days.

check.r does about the same as R CMD check. If you look inside check you will see a call to a (non-exported) function from the (R base-internal) tools package. We call the same function here. But to make things more interesting we also first install the package we test to really ensure we have all build-dependencies from CRAN met. (And we plan to extend check.r to support additional apt-get calls in case other libraries etc are needed.) We use the dependencies=TRUE option to have R smartly install Suggests: as well, but only one level (see help(install.packages) for details. With that prerequisite out of the way, the test can proceed as if we had done R CMD check (and additional R CMD INSTALL as well). The result for this (known-bad) package:

edd@max:~/git$ docker run --rm -ti -v $(pwd):/mnt rocker/r-devel-ubsan-clang check.r --setwd /mnt --install-deps RcppAnnoy_0.0.5.tar.gz 
also installing the dependencies ‘Rcpp’, ‘BH’, ‘RUnit’

trying URL 'http://cran.rstudio.com/src/contrib/Rcpp_0.11.3.tar.gz'
Content type 'application/x-gzip' length 2169583 bytes (2.1 MB)
opened URL
==================================================
downloaded 2.1 MB

trying URL 'http://cran.rstudio.com/src/contrib/BH_1.55.0-3.tar.gz'
Content type 'application/x-gzip' length 7860141 bytes (7.5 MB)
opened URL
==================================================
downloaded 7.5 MB

trying URL 'http://cran.rstudio.com/src/contrib/RUnit_0.4.28.tar.gz'
Content type 'application/x-gzip' length 322486 bytes (314 KB)
opened URL
==================================================
downloaded 314 KB

trying URL 'http://cran.rstudio.com/src/contrib/RcppAnnoy_0.0.4.tar.gz'
Content type 'application/x-gzip' length 25777 bytes (25 KB)
opened URL
==================================================
downloaded 25 KB

* installing *source* package ‘Rcpp’ ...
** package ‘Rcpp’ successfully unpacked and MD5 sums checked
** libs
clang++-3.5 -fsanitize=undefined -fno-sanitize=float-divide-by-zero,vptr,function -fno-sanitize-recover -I/usr/local/lib/R/include -DNDEBUG -I../inst/include/ -I/usr/local/include    -fpic  -pipe -Wall -pedantic -
g  -c Date.cpp -o Date.o

[...]
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘runUnitTests.R’
 ERROR
Running the tests in ‘tests/runUnitTests.R’ failed.
Last 13 lines of output:
  +     if (getErrors(tests)$nFail > 0) {
  +         stop("TEST FAILED!")
  +     }
  +     if (getErrors(tests)$nErr > 0) {
  +         stop("TEST HAD ERRORS!")
  +     }
  +     if (getErrors(tests)$nTestFunc < 1) {
  +         stop("NO TEST FUNCTIONS RUN!")
  +     }
  + }
  
  
  Executing test function test01getNNsByVector  ... ../inst/include/annoylib.h:532:40: runtime error: index 3 out of bounds for type 'int const[2]'
* checking PDF version of manual ... OK
* DONE

Status: 1 ERROR, 2 WARNINGs, 1 NOTE
See
  ‘/tmp/RcppAnnoy/..Rcheck/00check.log’
for details.
root@a7687c014e55:/tmp/RcppAnnoy# 

The log shows that thanks to check.r, we first download and the install the required packages Rcpp, BH, RUnit and RcppAnnoy itself (in the CRAN release). Rcpp is installed first, we then cut out the middle until we get to ... the failure we set out to confirm.

Now having a tool to confirm the error, we can work on improved code.

One such fix currently under inspection in a non-release version 0.0.5.1 then passes with the exact same invocation (but pointing at RcppAnnoy_0.0.5.1.tar.gz):

edd@max:~/git$ docker run --rm -ti -v $(pwd):/mnt rocker/r-devel-ubsan-clang check.r --setwd /mnt --install-deps RcppAnnoy_0.0.5.1.tar.gz
also installing the dependencies ‘Rcpp’, ‘BH’, ‘RUnit’
[...]
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘runUnitTests.R’
 OK
* checking PDF version of manual ... OK
* DONE

Status: 1 WARNING
See
  ‘/mnt/RcppAnnoy.Rcheck/00check.log’
for details.

edd@max:~/git$

This proceeds the same way from the same pristine, clean container for testing. It first installs the four required packages, and the proceeds to test the new and improved tarball. Which passes the test which failed above with no issues. Good.

So we now have an "appliance" container anybody can download from free from the Docker hub, and deploy as we did here in order to have fully automated, one-command setup for testing for UBSAN errors.

UBSAN is a very powerful tool. We are only beginning to deploy it. There are many more useful configuration settings. I would love to hear from anyone who would like to work on building this out via the R-devel-SAN-Clang GitHub repo. Improvements to the littler scripts are similarly welcome (and I plan on releasing an updated littler package "soon").

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

To leave a comment for the author, please follow the link and comment on their blog: Thinking inside the box .

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)