GSoC 2015: Tracking changes in performance metrics of R Code

[This article was first published on Tech and Mortals » R, and kindly contributed to R-bloggers.]

This blog post comes more than a week later than I had originally intended. But well, better late than never!

This is the first in a series of posts I will write as I work through the summer on my project as a part of Google Summer of Code 2015. I will be working for the R Project for Statistical Computing under the guidance of my mentors, Toby Dylan Hocking and Hadley Wickham. What follows is an overview of the project.


Project page on GitHub: https://github.com/rstats-gsoc/gsoc2015/wiki/Test-timings-on-Travis

I chose the project title “Tracking changes in performance metrics of R Code”, which aptly summarizes the package I will be working on. The idea of this project is to provide a package that makes it easy for R package developers to track quantitative performance metrics of their code over time. It focuses on reporting how a package’s performance metrics, most importantly time and memory, change across successive development versions. It will integrate with the git version control system and the Travis CI build system to collect and visualize these metrics, among other related functions.

The package will obtain the metrics as follows:

  • Store the current version of the test suite file(s) in the package directory.
  • Use git checkout to revert to the previous version, run the test(s) against that version, and record the performance metrics (time, space or both) in the process.
  • Repeat the previous step for a specified number of commits (say, 20).

At each step, care is taken so that the latest version isn’t lost, and the package is restored to the latest version once the process is complete. The details of how the performance metrics are obtained will be explored in a later blog post.
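
To make the steps above a little more concrete, here is a minimal shell sketch of the checkout-and-time loop. It is not the package's implementation: the function name time_over_commits, the tests/ directory and the CSV output are assumptions made purely for illustration. It records only wall-clock time at one-second resolution (no memory) and assumes the working tree is clean before it starts.

```shell
#!/usr/bin/env bash
# Sketch only: time a test command against the last N commits of a git repo.
# time_over_commits, the tests/ path and the CSV layout are illustrative
# assumptions, not the package's actual interface.

time_over_commits() {
  local n_commits="$1" start_ref backup commit t0 t1
  shift   # the remaining arguments form the test command to run

  # Remember where we started so the latest version can be restored.
  start_ref="$(git symbolic-ref --quiet --short HEAD || git rev-parse HEAD)"

  # Step 1: keep a copy of the current test files outside the work tree.
  backup="$(mktemp -d)"
  cp -R tests "$backup/tests"

  echo "commit,seconds"
  # Steps 2 and 3: walk back through history, newest commit first.
  for commit in $(git rev-list --max-count="$n_commits" HEAD); do
    git checkout --quiet "$commit"

    # Run the current tests against the older code.
    rm -rf tests
    cp -R "$backup/tests" tests

    t0="$(date +%s)"
    "$@" >/dev/null 2>&1 || true   # a failing test run is still timed
    t1="$(date +%s)"
    echo "${commit},$((t1 - t0))"

    # Undo the copied tests so the next checkout starts from a clean tree.
    git checkout --quiet -- .
    git clean -q -fd tests
  done

  # Return to the latest version and tidy up.
  git checkout --quiet "$start_ref"
  rm -rf "$backup"
}
```

A call such as `time_over_commits 20 Rscript -e 'devtools::test()'` would then print one line per commit. The choice of test command is again only an example.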

Toby Hocking, one of my project mentors, has implemented the testthatQuantity package as a proof of concept.
testthatQuantity on GitHub: https://github.com/tdhock/testthatQuantity

The plot below shows the times for two tests of the Animint package over various versions, as measured using the testthatQuantity package.

[Figure: Animint metrics]

Another key aspect of the package is its integration with r-travis. Given the appropriate setup and the presence of a gh-pages branch in the package’s repository, the package user will be able to push the metric plots created after a successful build on Travis CI to a GitHub page. This will be done through a custom shell script, which can be generated by one of the package’s functions. For those familiar with the workings of Travis, the script will be run under the after_success commands in .travis.yml. The shell script will, in turn, run the R script that generates the required plots before pushing them to the GitHub page.
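
As a rough idea of what such a script might look like, here is a hedged shell sketch. It is not the script the package will generate: the function name publish_plots, the GH_TOKEN secret, the make_plots.R script and the plots/ directory are all hypothetical names chosen for illustration. TRAVIS_REPO_SLUG is an environment variable that Travis CI sets itself. The sketch assumes the gh-pages branch already exists on the remote.

```shell
#!/usr/bin/env bash
# Sketch only: publish generated plots to a gh-pages branch.
# publish_plots, GH_TOKEN, make_plots.R and plots/ are illustrative
# assumptions; TRAVIS_REPO_SLUG is set by Travis CI itself.

publish_plots() {
  local remote_url="$1" plot_dir="$2" work rc
  shift 2   # the remaining arguments form the plot-generating command

  # Generate the plots first (on Travis this would be the R script).
  "$@" || return 1

  # Check out only the gh-pages branch into a scratch directory.
  work="$(mktemp -d)"
  git clone --quiet --branch gh-pages --single-branch "$remote_url" "$work" || return 1
  cp -R "$plot_dir"/. "$work"/

  (
    cd "$work" || exit 1
    git config user.name "Travis CI"
    git config user.email "travis@example.com"
    git add -A
    # Commit and push only when the plots actually changed.
    if ! git diff --cached --quiet; then
      git commit --quiet -m "Update performance plots"
      git push --quiet origin gh-pages
    fi
  )
  rc=$?
  rm -rf "$work"
  return "$rc"
}

# Possible call from a script listed under after_success in .travis.yml:
# publish_plots "https://${GH_TOKEN}@github.com/${TRAVIS_REPO_SLUG}.git" \
#   plots Rscript make_plots.R
```

Taking the plot command as an argument keeps the publishing step independent of how the plots are produced, which also makes it easy to try out against a local bare repository before wiring it into Travis.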

Sounds good, eh? Needless to say, I am quite excited and looking forward to completing the project successfully! I will post more of the specifics as the project progresses.

