Citing R or SAS

July 2, 2012
(This article was first published on SAS and R, and kindly contributed to R-bloggers)

One of us recently read a colleague's first draft of a paper, in which she had written: "All analyses were done in R 2.14.0." with no accompanying citation. We assume we're preaching to the converted when we say that the enormous amount of work that goes into R needs to be recognized as often as possible, and that R's creators deserve to reap some credit for their labors. In contrast to SAS, after all, most work on R is not compensated with a paycheck. As a reminder, the citation() function prints the preferred citation for R itself, and that is the reference to use when citing R.
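As a quick illustration, here is a minimal sketch of retrieving the citation for R itself. It uses only functions from base R and the utils package; the variable name r_cite is ours, and the exact text printed will depend on the R version installed.

```r
# citation() with no arguments returns the preferred citation for R itself.
r_cite <- citation()

# Printing shows a human-readable reference followed by a BibTeX entry.
print(r_cite)

# toBibtex() returns just the BibTeX entry, ready to paste into a .bib file.
r_bibtex <- toBibtex(r_cite)
print(r_bibtex)

# The version string to quote in the text of the paper.
print(R.version.string)
```

The version reported by R.version.string is the one that belongs in a sentence such as "All analyses were done in R ...", alongside the reference that citation() supplies.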

The project in question had used a negative binomial regression function from the MASS package, but our colleague had omitted any reference to it. In this case, a citation would provide both credit to the authors and a useful guide to anyone wanting to replicate our approach. It would also allow readers to consider whether changes in the package might affect the results observed. A call to citation(package="MASS") will provide the preferred citation here. (Any package name can be inserted, of course, though some authors may not have provided a full citation.)
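The same call works for a package. The sketch below assumes MASS is installed (it ships with most R distributions as a recommended package); the variable names are ours. It also records the package version, since that is what lets a reader judge whether later changes to the package could matter.

```r
# Preferred citation for the MASS package, as supplied by its authors.
mass_cite <- citation(package = "MASS")

# style = "text" prints a plain reference suitable for a bibliography.
print(mass_cite, style = "text")

# BibTeX form, for reference managers.
mass_bibtex <- toBibtex(mass_cite)
print(mass_bibtex)

# Installed version of the package, worth reporting next to the citation.
mass_version <- packageVersion("MASS")
print(mass_version)
```

For packages whose authors have not supplied a citation, citation() falls back to one generated automatically from the package's description, which may be less complete.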

Similarly, while SAS authors are rarely identified by name and presumably get a salary from SAS, it's preferable to identify the version of the software and where it can be obtained. In medical research this is usually done by an in-text reference. For example: "Analyses were performed in SAS 9.3 (SAS Institute, Cary NC)."

For complex analyses, it is also best to mention the SAS procedure used. As with the R package, this can help readers plan similar analyses, and may inform interpretation.

So the analysis section of a paper that used both packages might end with the following statement: "Analyses were performed in R 2.14.2 [1] using the MASS package [2] glm.nb() function for negative binomial regression and in SAS 9.3 (SAS Institute, Cary NC) using the MCMC procedure for negative binomial mixture models." The references to [1] and [2] would be found using the citation() function.
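One way to keep the version numbers in such a statement in sync with the software actually used is to build the R half of the sentence from the running session. This is only a sketch: it assumes MASS is installed, the variable names are ours, and the inclusion of the package version in parentheses is our addition rather than part of the example wording above.

```r
# Version of R currently running, e.g. "2.14.2".
r_version <- as.character(getRversion())

# Version of the installed MASS package.
mass_version <- as.character(packageVersion("MASS"))

# Assemble the R portion of the methods statement.
methods_sentence <- sprintf(
  paste(
    "Analyses were performed in R %s [1] using the MASS package",
    "(version %s) [2] glm.nb() function for negative binomial regression."
  ),
  r_version, mass_version
)

cat(methods_sentence, "\n")
```

The SAS half of the statement still has to be written by hand, since the SAS version is not known to the R session.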

An unrelated note about aggregators: We love aggregators! Aggregators collect blogs that have similar coverage for the convenience of readers, and for blog authors they offer a way to reach new audiences. SAS and R is aggregated by R-bloggers, PROC-X, and statsblogs with our permission, and by at least two other aggregating services which have never contacted us. If you read this on an aggregator that does not credit the blogs it incorporates, please come visit us at SAS and R. We answer comments there and offer direct subscriptions if you like our content. In addition, no one is allowed to profit by this work under our license; if you see advertisements on this page, the aggregator is violating the terms by which we publish our work.
