Groan – my first R package

[This article was first published on Odd Hypothesis, and kindly contributed to R-bloggers.]

Being one of two R experts at my current job, I figured I should be familiar with package development. Frankly, I’ve been procrastinating on this topic since I started using R in 2007 – I was doing just fine with source(), and the section of the R manual on package development fell into the category of TL;DR.

What finally drove me to learn the esoteric details of R packaging were the following two events at work:

  1. A coworker sent out an email announcing a new R analysis script which included a few algorithms I wrote and passed on. It was accompanied by documentation in a README.txt file and employed console menus and user dialogs to ease use. Otherwise, details of what the code was doing were left to code comments.

  2. After being out of the office and disconnected for a day, I read an email thread between coworkers asking about the correct set of parameters to use in a function I wrote. Fortunately, they figured it out thanks to my excessive commenting habits.

Lesson learned: I no longer write R code just for myself, and using code comments as documentation isn’t cutting it anymore. I needed an efficient way to:

  • distribute code
  • provide documentation that uses the built-in help system

Unlike Matlab or Python, R does not have an effective way to attach simple documentation to code (functions, objects, etc.) outside of a package. There is this post on StackOverflow, but I expect that such functionality should be built into the environment, not hacked on.

Long-winded introduction over, I finally dove in. Thankfully, my entry wasn’t too rough. A couple of months ago I read a couple of posts submitted to R-bloggers regarding “easy” package development using roxygen2, devtools, and RStudio (my R IDE of choice).
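For readers who have not seen roxygen2 before, here is a minimal sketch of what documenting a function with it can look like. The function doubling_time() is a hypothetical example and is not part of groan. The `#'` comment block is what roxygen2 turns into a help page, so the documentation lives next to the code and still shows up in R's help system.

```r
#' Doubling time from a specific growth rate
#'
#' Hypothetical example function (not part of groan), used only to
#' illustrate roxygen2-style documentation.
#'
#' @param mu Numeric vector of specific growth rates (per hour); must be positive.
#' @return Numeric vector of doubling times (hours).
#' @examples
#' doubling_time(log(2))  # doubling time of 1 hour
#' @export
doubling_time <- function(mu) {
  stopifnot(is.numeric(mu), all(mu > 0))
  log(2) / mu
}

# Inside a package directory, the usual devtools workflow is roughly:
#   devtools::document()  # build man/*.Rd files from the #' comments
#   devtools::install()   # install the package locally
#   ?doubling_time        # the help page is then available
```

The devtools calls are left as comments because they only make sense when run from within a package source directory.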

My problem to solve: get parameters from biological growth curves
My package: groan

    Package: groan
    Type: Package
    Title: Utilities for biological GROwth curve ANalysis
    Version: 1.0
    Date: 2013-05-14
    Author: W. Lee Pang
    Description: groan is a set of tools to assist in the analysis of biological
        growth curves.  It provides functions to smooth input data and extract key
        parameters such as specific growth rate (mu), carrying capacity (A), and lag
        time (tau).

When working with microorganisms, a common task is determining a culture’s specific growth rate, which tells you, for example, how many times the population will double in an hour. While not a hard task, it can be tedious if numerous cultures are involved or if the underlying data are noisy.
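To make the quantity concrete, here is a rough base-R sketch of the textbook way to estimate a specific growth rate by hand: fit a straight line to log-transformed exponential-phase data and read off the slope. This is not how groan does it internally, and the data below are synthetic values assumed purely for illustration.

```r
# Synthetic exponential-phase data (assumed values, for illustration only):
# optical density sampled every 0.5 h from a culture with mu = 0.7 per hour.
time_h <- seq(0, 4, by = 0.5)
od     <- 0.05 * exp(0.7 * time_h)

# During exponential growth, log(OD) is linear in time and the slope is mu.
fit <- lm(log(od) ~ time_h)
mu  <- unname(coef(fit)["time_h"])

# Convert the specific growth rate to a doubling time (hours)
# and to the number of doublings per hour.
t_double         <- log(2) / mu
doublings_per_hr <- mu / log(2)
```

Doing this for one clean curve is easy. Picking the exponential window by eye for dozens of noisy curves is where the tedium comes in.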

groan is essentially the R package I wish I had as a grad student and postdoc, but was otherwise too occupied to write.

Yes, the name groan is a pun:

  • “grown” : as in yay the cells grew
  • “groan” : as in ugh, I have to process yet another growth curve

Humor aside, groan reduces a CSV file containing data points from multiple growth curves to a table of growth parameters and a summary plot in under 10 lines of code.

From this …

Via …

    library(groan)

    # load the raw growth curves and prepare them for groan
    Y = read.csv('path/to/your/data.csv', stringsAsFactors=FALSE)
    Y = groan.init(Y)
    Y.s = groan.smooth(Y, adaptive=TRUE, method='loess')   # smooth the raw curves

    # specific growth rate curves: compute, smooth, then fit
    U = groan.mu(Y.s)
    U.s = groan.smooth(U, adaptive=TRUE, method='loess')
    U.f = groan.fit(U.s, method='pulse')

    # table of growth parameters
    stats = data.frame(mumax = max(U.f),
                       t.lag = groan.tlag(U.f),
                       gen   = groan.generations(U.f))

    plot(Y)   # plot thumbnail grid of raw growth curves

To …

For more information, examples, or to test out the code yourself, head to the groan repository on GitHub.

