
Adding CITATION to your R package

[This article was first published on BioCode's Notes, and kindly contributed to R-bloggers.]
Original post from Robin’s Blog:

Software is very important in science – but good software takes time and effort that could be used to do other work instead. I believe that it is important to do this work – but to make it worthwhile, people need to get credit for their work, and in academia that means citations. However, it is often very difficult to find out how to cite a piece of software – sometimes it is hidden away somewhere in the manual or on the web-page, but often it requires sending an email to the author asking them how they want it cited. The effort that this requires means that many people don’t bother to cite the software they use, and thus the authors don’t get the credit that they need. We need to change this, so that software – which underlies a huge amount of important scientific work – gets the recognition it deserves.

As with many things relating to software sustainability in science, the R project does this very well: if you want to find out how to cite the R software itself you simply run the command:
citation()
If you want to find out how to cite a package you simply run:
citation("PACKAGENAME")
For example:
> citation('ggplot2')
To cite ggplot2 in publications, please use:
  H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York,
  2009.
A BibTeX entry for LaTeX users is
  @Book{,
    author = {Hadley Wickham},
    title = {ggplot2: elegant graphics for data analysis},
    publisher = {Springer New York},
    year = {2009},
    isbn = {978-0-387-98140-6},
    url = {http://had.co.nz/ggplot2/book},
  }
In this case the citation was given by the author of the package, in R code, in a file called (surprise, surprise) CITATION inside the package directory. R can even generate a citation automatically if the author hasn’t provided one (and it does this far better if you use the person class in your DESCRIPTION file). Note also that the function provides a handy BibTeX entry for those who use LaTeX, making the citation even easier to use and reducing the effort involved in citing software properly.
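The BibTeX entry can also be pulled out programmatically rather than copied from the console. The following is a minimal sketch, assuming only base R: it uses the base package stats (which reports how to cite R itself), and the output file name is an arbitrary choice for illustration.

```r
# citation() returns an object that inherits from class "bibentry";
# toBibtex() converts it to BibTeX lines.
cit <- citation("stats")   # a base package, so this reports how to cite R itself
bib <- toBibtex(cit)
print(bib)

# Optionally save the entry for use in a LaTeX document
# (the file name here is an arbitrary choice).
bib_file <- file.path(tempdir(), "r-citation.bib")
writeLines(bib, bib_file)
```

Replacing "stats" with the name of any installed package should give that package's entry instead.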

How to add the reference to your package or function in R:

An installed file named CITATION will be used by the citation() function. The important tip is that, to be installed, it needs to be in the inst subdirectory of the package sources.

The CITATION file is parsed as R code (in the package’s declared encoding, or in ASCII if none is declared). If no such file is present, citation auto-generates citation information from the package DESCRIPTION metadata, and an example of what that would look like as a CITATION file can be seen in recommended package nlme (see below): recommended packages boot, cluster and mgcv have further examples.

A CITATION file will contain calls to function bibentry.

Here is that for nlme:


year <- sub("-.*", "", meta$Date)
note <- sprintf("R package version %s", meta$Version)

bibentry(bibtype = "Manual",
         title = "{nlme}: Linear and Nonlinear Mixed Effects Models",
         author = c(person("Jose", "Pinheiro"),
                    person("Douglas", "Bates"),
                    person("Saikat", "DebRoy"),
                    person("Deepayan", "Sarkar"),
                    person("R Core Team")),
         year = year,
         note = note,
         url = "http://CRAN.R-project.org/package=nlme")

Note the way that information that may need to be updated is picked up from the DESCRIPTION file – it is tempting to hardcode such information, but it normally then gets outdated. 
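To try this pattern without building a package, a CITATION file can be written to a temporary source tree and read back with utils::readCitationFile(). This is a sketch under stated assumptions: the package name "mypkg", the author "Jane Doe", and the version and date values are all hypothetical, and the meta list stands in for the fields R would normally take from DESCRIPTION.

```r
# Write a small CITATION file for a hypothetical package "mypkg" into a
# temporary source tree (inst/CITATION), then read it back.
pkg_src <- file.path(tempdir(), "mypkg")
dir.create(file.path(pkg_src, "inst"), recursive = TRUE, showWarnings = FALSE)

citation_code <- c(
  'year <- sub("-.*", "", meta$Date)',
  'note <- sprintf("R package version %s", meta$Version)',
  'bibentry(bibtype = "Manual",',
  '         title   = "{mypkg}: A Hypothetical Example Package",',
  '         author  = person("Jane", "Doe"),',
  '         year    = year,',
  '         note    = note)'
)
citation_file <- file.path(pkg_src, "inst", "CITATION")
writeLines(citation_code, citation_file)

# Stand-in for the metadata normally read from DESCRIPTION (hypothetical values).
meta <- list(Package = "mypkg", Version = "0.1.0", Date = "2024-05-01")

# Evaluate the CITATION file with `meta` available, as the file expects.
cit <- utils::readCitationFile(citation_file, meta = meta)
print(cit)
```

Because the year and version come from meta, bumping the Version or Date fields is enough to keep the citation current.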

In case a bibentry contains LaTeX markup (e.g., for accented characters or mathematical symbols), it may be necessary to provide a text representation to be used for printing via the textVersion argument to bibentry. E.g., earlier versions of nlme additionally used

textVersion = paste0("Jose Pinheiro, Douglas Bates, Saikat DebRoy,",
                     "Deepayan Sarkar and the R Core Team (",
                     year,
                     "). nlme: Linear and Nonlinear Mixed Effects Models. ",
                     note, ".")
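As a rough illustration of how textVersion fits into a complete entry, here is a sketch with a hypothetical package and author (neither comes from nlme). The author name carries LaTeX markup for an accent, and textVersion supplies a plain-text form alongside it.

```r
# Hypothetical entry: the author field carries LaTeX markup for an accent,
# so a plain-text rendering is supplied via textVersion.
entry <- bibentry(
  bibtype = "Manual",
  title   = "{mypkg}: A Hypothetical Example Package",
  author  = person("Jos{\\'e}", "Example"),
  year    = "2024",
  note    = "R package version 0.1.0",
  textVersion = paste0(
    "Jose Example (2024). ",
    "mypkg: A Hypothetical Example Package. ",
    "R package version 0.1.0."
  )
)

entry$textVersion   # the plain-text form supplied above
toBibtex(entry)     # the BibTeX form keeps the LaTeX markup
```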


The CITATION file should itself produce no output when source-d.

Good luck!!






