Writing papers about packages

[This article was first published on R on msperlin, and kindly contributed to R-bloggers.]

Back in 2007 I wrote a Matlab package for estimating regime switching models. I was just starting to learn to code and this project was my way of doing it. After publishing it on FEX (the Matlab File Exchange site), I received so many repeated questions by email that I eventually realized it would be easier to write a manual for people to read. Some time and effort would be spent writing it, but less than I was spending replying to the same questions over and over.

This manual about the code became, by far, my most cited paper in Google Scholar. It is not even published, just a permanent working paper. When attending conferences and seminars, I was always surprised to hear that, at that time, people knew me as the Matlab regime switching guy.

Moving forward a few years, I switched from Matlab to R, and I continue to invest a lot of time writing papers about packages and publishing them in standard scientific journals. I can testify to the greater contribution and impact of research papers about code. I strongly believe that this format will become more popular in the years to come. The new generation of researchers is far more aware of code than the previous one. In that sense, nothing beats R and CRAN in the diversity and depth of packages.

I frequently review papers on this topic and I see common mistakes that researchers make when writing them. Here are some tips for those who wish to pursue this type of publication:

  • A problem must be clearly stated: Every paper is a solution to a problem. This is also true for a paper about code. Identify it and make it painfully clear how the code solves it. Simply put, do your homework.

  • The paper is NOT an extended manual: Don’t write a paper that simply lists the package’s functions. We already have that from CRAN or other code repositories.

  • Make sure you know what’s available: How did people do it before? Is there a competing package? How does your code improve on it?

  • A bibliometric study is mandatory: Same as the previous point. Looking at previously published research papers, can you find out how they handled the problem your code solves?

  • Not everyone uses R, so make it easier for people to use your software: Make sure you keep the code simple and accessible. Explain what R is and why one should use it. Case in point, not everyone knows what a tibble is.

  • Think about your example of usage: You should always add a reproducible example of usage. This is what everyone will try! Make sure it is a simple example, not too deep in the literature. Something everyone can understand. Your code should also be accessible and reproducible.
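
As a rough illustration of the last tip, a reproducible example might look like the sketch below. It is only an assumption of what such an example could contain, not taken from any particular paper or package: it uses base R and the built-in `mtcars` data set, a plain data.frame rather than a tibble, a fixed seed, and a final call that records the R session.

```r
# Minimal reproducible example (illustrative sketch, not from a specific package).
# Only base R and the built-in `mtcars` data set are used, so nothing extra
# needs to be installed to run it.

set.seed(42)  # fix the random seed so the resampling below repeats exactly

# A plain data.frame (not a tibble), familiar to readers who are new to R
df <- mtcars[, c("mpg", "wt")]

# Fit a simple linear model: fuel efficiency as a function of car weight
fit <- lm(mpg ~ wt, data = df)
print(coef(fit))

# A small bootstrap of the slope, showing that random steps stay reproducible
boot_slopes <- replicate(200, {
  idx <- sample(nrow(df), replace = TRUE)       # resample rows with replacement
  coef(lm(mpg ~ wt, data = df[idx, ]))[["wt"]]  # slope for this resample
})
print(quantile(boot_slopes, c(0.025, 0.975)))

# Record the R version and loaded packages so others can match the environment
print(sessionInfo())
```

Swapping `mtcars` for a small data set from the paper's own field would keep the example just as short while making the use case clearer to the target audience.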

It is a lot of work to publish a research paper about code, but it is all worth it! In my experience, the impact is much greater than that of a standard research paper. Your academic career will certainly move forward with it.
