R Package Tutorial

November 7, 2018

(This article was first published on R on YIHAN WU, and kindly contributed to R-bloggers)

Edited by Rob Colautti. Originally posted on https://colauttilab.github.io/biol812.html on March 15th.

Most of the general content can be found in Hadley Wickham’s R Packages book available for free online. It goes into detail on almost everything you would need to know to make a package.

For a quick refresher, see Hilary Parker’s post on a “cat” function.

Install packages first and then read on.

install.packages(c("devtools", "roxygen2", "testthat", "knitr"))

The objectives:
1) Make a basic package in RStudio with one function.
2) Write documentation for the function.
3) Install the package and add input checking.

Introduction

You start with a piece of code, and the best practice is to comment in the code to explain what the code does.

# take x, square it and add one to it
y <- x^2 + 1

When you use that piece of code multiple times, it’s easier to make it into a function and call it.

square_plus<-function(x){
  y <-x^2 + 1
  return(y)
}

When you have one or several functions that you use frequently, you should make a package so that you can load all the functions easily and quickly, and also so you can share the functions.

square_plus<-function(x){
  y <-x^2 + 1
  return(y)
}

cube_plus<-function(x){
  y <-x^3 + 1
  return(y)
}

quartic_plus<-function(x){
  y <-x^4 + 1
  return(y)
}
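
Before packaging, it helps to confirm that the functions behave as expected when called. A minimal sketch, repeating the definitions above so that it runs on its own; because arithmetic in R is vectorized, the same functions also accept a numeric vector.

```r
square_plus <- function(x) {
  y <- x^2 + 1
  return(y)
}

cube_plus <- function(x) {
  y <- x^3 + 1
  return(y)
}

quartic_plus <- function(x) {
  y <- x^4 + 1
  return(y)
}

square_plus(3)           # 3^2 + 1 = 10
cube_plus(2)             # 2^3 + 1 = 9
quartic_plus(2)          # 2^4 + 1 = 17
square_plus(c(1, 2, 3))  # element-wise: 2 5 10
```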

To start a package in RStudio, go to File -> New Project -> New Directory -> R Package using devtools. You can also use the “R Package” option, but then delete the NAMESPACE file it creates, as roxygen2 will generate that file later. Give the package a name and then click Create.

RStudio should load and there will be a file structure with several files and two folders, “R” and “man”.

The “R” folder is for code, and there is a hello.R file in it.

The “man” folder is for manual pages, the documents that show up when you use ?some_function.
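
To make the layout concrete, here is a rough sketch that builds the same bare structure by hand with base R only. The package name mypackage, the temporary location, and the DESCRIPTION values are placeholders for illustration; the RStudio wizard also sets up project files that this sketch leaves out.

```r
# Hypothetical package name and location, for illustration only
pkg_dir <- file.path(tempdir(), "mypackage")

# The two folders described above: R/ for code, man/ for manual pages
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(pkg_dir, "man"), showWarnings = FALSE)

# A bare-bones DESCRIPTION file; the real fields are filled in later
writeLines(
  c("Package: mypackage",
    "Title: What the Package Does",
    "Version: 0.0.0.9000"),
  file.path(pkg_dir, "DESCRIPTION")
)

# List what was created at the top level of the package
list.files(pkg_dir)
```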

Make a test function

Today, we are going to make a function to get public references from the Crossref API. Crossref is one of the registration agencies for Digital Object Identifiers (DOIs) and is the one most frequently used for scientific journals. Crossref holds “metadata” on digital objects, such as the type of object, authors, and dates.

We can access this information through the Crossref page.

For example, Primack and Miller-Rushing’s 2011 paper.

https://search.crossref.org/?q=Broadening+the+study+of+phenology+and+climate+change

Clicking on the “Actions” button and then “Metadata as JSON” brings up a json file including citation information, and also citations for the papers referenced in the paper. The DOI for this work is “10.1111/j.1469-8137.2011.03773.x”

We can get this file in R using the DOI and the API. (See the Crossref API here.) Doing this gives us a list of the works cited. We often read a paper and then go through its references, especially for literature reviews or meta-analyses, so this will likely save some time.

# load jsonlite to parse JSON files
library(jsonlite)
url<-"https://api.crossref.org/works/10.1111/j.1469-8137.2011.03773.x"
result<-fromJSON(url)

result is a list. result$message$reference is a data frame with 17 references.
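
The same nesting can be explored offline. The sketch below parses a small made-up JSON string that only mimics the shape of a Crossref response; the keys and values are invented for illustration and are not real API output. It shows how fromJSON turns an array of objects into a data frame, filling fields that are missing from some entries with NA.

```r
library(jsonlite)

# Toy JSON that only mimics the nesting of a Crossref response;
# every key and value here is made up for illustration
toy_json <- '{
  "status": "ok",
  "message": {
    "DOI": "10.1000/example",
    "reference": [
      {"key": "ref1", "year": "2009"},
      {"key": "ref2", "year": "2010", "DOI": "10.1000/cited"}
    ]
  }
}'

toy_result <- fromJSON(toy_json)

class(toy_result)                            # a list
names(toy_result$message)                    # includes "reference"
is.data.frame(toy_result$message$reference)  # array of objects -> data frame
nrow(toy_result$message$reference)           # one row per reference
```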

We can extract this.

references<-as.data.frame(result$message$reference)

This can easily be written to CSV or other formats.
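
As a rough sketch of the CSV step, the code below writes a toy stand-in data frame to a temporary file and reads it back. The data frame toy_references and the output path are hypothetical. If the real data frame contains nested (list) columns, they would probably need flattening before write.csv can handle them, which is worth checking with str().

```r
# Toy stand-in for the references data frame; the values are made up
toy_references <- data.frame(
  key  = c("ref1", "ref2"),
  year = c("2009", "2010"),
  stringsAsFactors = FALSE
)

# Hypothetical output path in the session's temporary directory
out_file <- file.path(tempdir(), "references.csv")
write.csv(toy_references, out_file, row.names = FALSE)

# Read the file back to confirm the round trip
read_back <- read.csv(out_file, stringsAsFactors = FALSE)
nrow(read_back)
```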

But we can also turn this into a function that works for any DOI.

get_work_references<-function(DOI){
  url<-paste0("https://api.crossref.org/works/",DOI)
  result<-fromJSON(url)
  return(as.data.frame(result$message$reference))
}

We can save the script above in the R folder. Objective 1 is done.

Adding documentation

We can source the file, and then run it. But we still lack documentation for this function, and there isn’t a library to load.

The first piece of documentation is the DESCRIPTION file.

There are several fields to fill in. The package name is already filled in. Add a title (e.g., This Package Gets References).

Change the Author field to Authors@R and then add yourself as the author and creator.

Authors@R: person("First Name", "Last Name", email="first.last@example.com", role=c("aut", "cre"))

# Two authors
Authors@R: c(person("First Name", "Last Name", email="first.last@example.com", role=c("aut", "cre")),
             person("Second person name", "second person last name", email="second.person@example.com", role="aut"))
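
The Authors@R field holds ordinary R code, so it can be tried in the console before it goes into DESCRIPTION. A minimal sketch with a placeholder name and address; note that the role codes are quoted strings, since an unquoted cre would be looked up as an object and fail.

```r
# Placeholder name and address, for illustration only
me <- utils::person(
  given  = "First Name",
  family = "Last Name",
  email  = "first.last@example.com",
  role   = c("aut", "cre")
)

# Printing shows how the entry will be rendered
print(me)

# Both roles should be listed
me$role
```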

Write a description: Interfacing with Crossref’s API to get citation information using a DOI. This package uses jsonlite and contains only one function.

Use one of the public licenses (GPL-3, MIT etc.)

Save the DESCRIPTION file.

We will now add documentation to make manual pages.

The package to use is roxygen2. Previously, people wrote man pages by hand as .Rd files in a LaTeX-like markup, but with roxygen2 we can write the documentation in the script, and the man pages are generated automatically.

First thing to do: take out any library(*) commands and use packagename::function() for any functions from other libraries. Read R Packages – R code for more details on why.

Roxygen2 commands start with #'.

We need to add details such as the description, usage, and arguments.

The first line automatically becomes the title field; keep it to one line. The paragraph that follows goes into the description. The usage field is generated automatically. Use @param tags for arguments (only one in this case). A description that is too long for one line can continue on the next line. Use @return to describe the expected output and @examples to write example code, which is run when the package is checked. We also want an @export tag so that the function is available for use when the library is loaded.

The script would look something like this:

#' Takes a DOI and returns references for the object.
#'
#' This function queries the Crossref API to obtain a data frame of references for the DOI. We use the paste0 function from base and the fromJSON function from jsonlite.
#'
#' @param DOI String. Digital object identifier.
#'
#' @return data frame of references.
#' @examples
#' get_work_references("10.1111/j.1469-8137.2011.03773.x")
#' @export

get_work_references<-function(DOI){
  url<-paste0("https://api.crossref.org/works/",DOI)
  result<-jsonlite::fromJSON(url)
  return(as.data.frame(result$message$reference))
}

Save the file and now use devtools::document().

We will now have a NAMESPACE file and a new file within the man folder. The NAMESPACE file lists the functions that are exported, that is, made available to users when the library is loaded.

Open get_work_references.Rd and then click preview to see how it looks.

However, our man page is a bit dull, and lacks the links most pages have. We have to add the links using code. For example, linking the paste0 function will be \code{\link[base]{paste0}}.
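
Applied to the description paragraph of the earlier script, the linked version might look like the sketch below. The link to fromJSON follows the same pattern with jsonlite as the package name; the wrapped comment lines are just for readability.

```r
#' Takes a DOI and returns references for the object.
#'
#' This function queries the Crossref API to obtain a data frame of
#' references for the DOI. We use the \code{\link[base]{paste0}} function
#' from base and the \code{\link[jsonlite]{fromJSON}} function from jsonlite.
#'
#' @param DOI String. Digital object identifier.
#'
#' @return data frame of references.
#' @export
get_work_references <- function(DOI) {
  url <- paste0("https://api.crossref.org/works/", DOI)
  result <- jsonlite::fromJSON(url)
  return(as.data.frame(result$message$reference))
}
```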

Use document() again. Now the functions are in monospace font. The actual links only appear when the package is built.

We can use the “CHECK” button on the “Build Pane” to check for any issues in the package.

We did not declare the jsonlite package as a dependency. To do this, go back to the DESCRIPTION file and add:

Imports:
    jsonlite

Another CHECK will tell you that the package curl is required. Add this to the imports as well.

Installing the package

Once you pass the check, click “Install and Restart” to install the package. The package should be in your “Packages” pane.

Doing ?get_work_references will bring up the help page with working links. We can successfully run the example. But if input is not a character, the function doesn’t work.

Input checking

You cannot account for every scenario in which the function might fail, but you often know that certain arguments must be in a specific form.

You can add checks for inputs within the function.

For example, the DOI should be a character string. We can add a test for the input and stop the function with an error if the input isn’t a character string.

get_work_references<-function(DOI){
  if (!is.character(DOI)) stop(" 'DOI' must be a character string")
  url<-paste0("https://api.crossref.org/works/",DOI)
  result<-jsonlite::fromJSON(url)
  return(as.data.frame(result$message$reference))
}
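
The check can be exercised without touching the network, because stop() fires before any request is made. A minimal sketch using tryCatch to capture the error message; the variable name msg is only for illustration.

```r
get_work_references <- function(DOI) {
  if (!is.character(DOI)) stop("'DOI' must be a character string")
  url <- paste0("https://api.crossref.org/works/", DOI)
  result <- jsonlite::fromJSON(url)
  return(as.data.frame(result$message$reference))
}

# A numeric input fails the check, so the function stops before the API call
msg <- tryCatch(
  get_work_references(123),
  error = function(e) conditionMessage(e)
)
msg  # "'DOI' must be a character string"
```

Inside a package, the same expectation could live in a testthat test file, for example as testthat::expect_error(get_work_references(123), "character string").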

Reinstall and you have a working package!

References:

Hadley Wickham, ‘R packages’ http://r-pkgs.had.co.nz/

Hilary Parker, ‘Writing an R package from scratch’, https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/
