socialR: Reproducible Research & Notebook integration with R

December 10, 2010

(This article was first published on Carl Boettiger » R, and kindly contributed to R-bloggers)

I’ve created an R package that uses social media tools for reproducible research.  The goal of the package is this: whenever I run code, output figures are automatically added to my figure repository (Flickr) and linked to the timestamped version of the code that produced them in the code repository.  Figures should be tagged by project and embedded selectively or automatically into this lab notebook.  The basic workflow of the notebook is laid out in a diagram of my notebook as presented at Science Online, 2011 (see the other slides in my entry on this).

To do this, I use a few simple R functions that wrap the system command-line programs git, flickr_upload, and hpc-autotweets to enable monitoring of my simulations through social media. The package has its own git repository here.  This is a rather custom build, made for rapid deployment on my own machines, and it depends largely on Linux tools external to R, so it may not be easy for others to deploy.  See my earlier post, Making R Twitter, for examples and backstory.

Basic Features

All of these tasks are run by wrapping any plot command in my function “social_plot()”:

  • Push the running code version to Github.
  • Grab the git commit hash to reference this version of the code.
  • Push figures to Flickr as they complete.  Tag images appropriately and provide a link to the code (version-stable, on Github) that produced them in the description.
  • Tweet notification of a figure upload, parameter values, links to code, and timestamp.
  • Tweet when an error occurs.
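
As a rough illustration of how such a wrapper could fit together, here is a minimal sketch. It is not the package's actual code: the helper names (git_commit_hash, flickr_args) and every argument of this social_plot() are assumptions made for the example. The --tag and --description options are the ones I understand flickr_upload to accept on its command line. Because the command line of hpc-autotweets is not described in this post, the tweet step is represented by a generic notify function, and the commit-and-push step is left out.

```r
# Sketch only; names and arguments are assumptions, not the socialR API.

# Full commit hash of the repository at `repo`.
git_commit_hash <- function(repo = ".") {
  system2("git", c("-C", shQuote(repo), "rev-parse", "HEAD"), stdout = TRUE)
}

# Build the argument vector for flickr_upload without running it.
flickr_args <- function(file, tags = character(), description = "") {
  tag_args <- unlist(lapply(tags, function(tag) c("--tag", shQuote(tag))))
  c(tag_args, "--description", shQuote(description), shQuote(file))
}

# Wrap a plot command: draw it to a file, upload it, and send a notification.
social_plot <- function(plot_expr, file = "figure.png", tags = character(),
                        repo = ".", hash = git_commit_hash(repo),
                        device = grDevices::png, upload = TRUE,
                        notify = message) {
  device(file)
  ok <- tryCatch({
    force(plot_expr)  # lazy evaluation: the plot is drawn only now, on `device`
    TRUE
  }, error = function(e) {
    notify("Error while plotting: ", conditionMessage(e))
    FALSE
  }, finally = grDevices::dev.off())
  if (!ok) return(invisible(NULL))

  if (upload) {
    description <- paste("Code version:", hash)
    system2("flickr_upload", flickr_args(file, tags, description))
  }
  notify("Figure ", file, " produced from commit ", hash)
  invisible(file)
}

# Example call (needs git and a configured flickr_upload):
# social_plot(plot(rnorm(100)), file = "sim.png", tags = c("myproject"))
```

The plot command is passed unevaluated thanks to R's lazy evaluation, so the graphics device can be opened before anything is drawn and closed again even if the plot fails.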

Setup / Install

  1. Create a flickr account (need not be unique for the computer).
  2. Create a twitter account (preferably separate one for the machine).
  3. Install flickr_upload:
    sudo apt-get install libflickr-upload-perl
  4. Install tweepy:
    easy_install tweepy
    (See link for more detailed instructions.)
  5. Configure flickr_upload credentials.
  6. Configure OAuth for tweepy.
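
Since everything depends on external programs, it can help to check from within R that they are installed before starting a long simulation. The snippet below is a small sketch of such a check. The function name missing_tools is made up for this example, and it assumes the executables are named exactly as the programs are written above (git, flickr_upload, hpc-autotweets).

```r
# Sketch only: report which of the external tools are not on the PATH.
required_tools <- c("git", "flickr_upload", "hpc-autotweets")

missing_tools <- function(tools = required_tools) {
  paths <- Sys.which(tools)        # "" for any tool that cannot be found
  names(paths)[!nzchar(paths)]
}

not_found <- missing_tools()
if (length(not_found) > 0) {
  message("Not found on PATH: ", paste(not_found, collapse = ", "))
}
```

This only confirms that the programs can be found; it does not verify that the Flickr or Twitter credentials are configured.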

Future modifications

The current program relies entirely on external command-line tools, and there is probably no easy way to make the package self-contained and cross-platform.  Still, a good bit of functionality can be added:

  • Add option to include the git log message.
  • Smart/more informative git commit messages
  • Add option/default to use truncated git commit ID numbers
  • Make Flickr description actually link directly to code.
  • Make Twitter statements include URLs/actual links (to code, files)
  • Identify machine credentials?
  • Documentation still needed
  • Should verify if the git version is current
  • Grab a DOI for the object (i.e. using EZID from UC3?)
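
Several of the git-related items above map onto standard git commands. The following is a sketch of how they might look from R, not code from the package: all function names are hypothetical, the seven-character truncation is an arbitrary choice, and "current" is read here as "no uncommitted changes in the working tree", which is only one possible interpretation. The URL pattern is the usual Github form for a file at a given commit; the user, repository and path in the example are placeholders.

```r
# Sketch only; function names are hypothetical.

# Run a git command in `repo` and return its output lines.
git_out <- function(args, repo = ".") {
  system2("git", c("-C", shQuote(repo), args), stdout = TRUE)
}

# Abbreviated commit ID as chosen by git itself.
short_hash <- function(repo = ".") {
  git_out(c("rev-parse", "--short", "HEAD"), repo)
}

# Subject line of the most recent commit.
last_commit_message <- function(repo = ".") {
  git_out(c("log", "-1", "--format=%s"), repo)
}

# TRUE when `git status --porcelain` reports no uncommitted changes.
worktree_is_clean <- function(repo = ".") {
  length(git_out(c("status", "--porcelain"), repo)) == 0
}

# Truncate a full hash by hand, without calling git.
truncate_hash <- function(hash, n = 7) substr(hash, 1, n)

# Link to one file at one specific commit on Github.
github_blob_url <- function(user, repo_name, hash, path) {
  sprintf("https://github.com/%s/%s/blob/%s/%s", user, repo_name, hash, path)
}

# Example (placeholders, not a real repository):
# github_blob_url("someuser", "somerepo", short_hash(), "R/simulation.R")
```

A link built this way could go into both the Flickr description and the tweet, so that each figure points at the exact version of the code that made it.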
