Working in CRAN’s World

[This article was first published on R – Win Vector LLC, and kindly contributed to R-bloggers].

Part of the deal of having a package up on CRAN is: at any time one may be sent an automated email like the following.

Dear maintainer,

Please see the problems shown on
URL.

Please correct before TODAY+14DAYS to safely retain your package on CRAN.

The CRAN Team

If this automated email from a bulk sender bounces, goes to SPAM, or isn’t responded to quickly: your package will be archived or removed from CRAN. We’ve received these emails, and always acted on them quickly, out of fear.

The referred-to check results are often not reproducible. For example, our most recent scare (which hasn’t yet triggered the email, and for which we submitted a work-around before complaining here) was just “SUMMARY: processing the following file failed”, without details beyond the name of the failing file.

This process is what enforces a lower bound on CRAN package quality. Packages at least run their own examples and tests without failure on the current version of R and current version of dependent packages. One must appreciate: even maintaining this technical level of quality is so hard that many repository systems do not attempt even this. All parties deeply understand we are not checking correctness, merely the absence of obvious incorrectness. The fact that maintaining even this can be hard is humbling.

In practice, the policy can be very demoralizing for package maintainers. This is how I feel as a package maintainer, and it is what I have heard from a number of package maintainers I respect.

Common triggers of the above removal include any and all of:

  • An external URL in your documentation becoming stale, being temporarily unavailable, or even being redirected.
  • The next version of R changing an API or behavior.
  • Any package you depend on or suggest changing its API or behavior in a test, example, or vignette used by your package.
  • Any package you depend on, or even suggest, having its own error on CRAN.
  • Any package you depend on, or even suggest, having itself been removed from CRAN.
  • Faults in the CRAN build infrastructure itself.
  • CRAN adding a new machine architecture to the checks.
  • CRAN’s check facility not having required dependent tools installed.
  • Errors in your own package.
  • Use of external resources by your package.
  • Too many tests in your package.
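As a small illustration of the first trigger, here is a minimal sketch of how one might list the external URLs in documentation text so they can be audited before CRAN finds a stale one. The function name `extract_urls`, the sample text, and the example.org/example.com addresses are all hypothetical, and the regular expression is a rough heuristic rather than a full URL parser. Actually testing whether each URL still resolves needs network access and is left out.

```r
# Hypothetical helper: pull http(s) URLs out of documentation text.
# The pattern is a rough heuristic, not a complete URL grammar.
extract_urls <- function(lines) {
  pattern <- "https?://[^[:space:]<>\"'{}()]+"
  hits <- unlist(regmatches(lines, gregexpr(pattern, lines)))
  # Drop sentence punctuation that got glued onto the end of a URL.
  hits <- sub("[.,;:!?]+$", "", hits)
  unique(hits)
}

# Made-up documentation lines (Rd-style and plain text).
doc_lines <- c(
  "Docs: \\url{https://example.org/a}.",
  "Mirror at <http://example.com/b>, and again https://example.org/a.",
  "No link on this line."
)

urls <- extract_urls(doc_lines)
print(urls)
```

The resulting vector is only a to-do list: each entry still has to be checked by hand or by a tool of your choice.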

To fix a failing package one must at least (while on the clock):

  • Reproduce the problem. Possibly with versions and machines not available to you, rhub, or CRAN’s public check service.
  • Fix or patch around the issue.
  • Re-check on multiple architectures and versions of R. This applies even to packages that are “pure R” and should benefit from R itself claiming stable semantics across different architectures.
  • Re-check dependencies, so they don’t get broken.
  • Fix any unrelated issues the re-check exposes. This includes version changes on underlying packages.
  • Not incorporate too many non-fix related changes or improvements.
  • Re-submit the package and track whether it is progressing through the CRAN queue, or is stuck.
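For the "re-check dependencies" step, a first task is simply knowing which packages depend on yours. The following is a minimal sketch using `tools::package_dependencies()` with `reverse = TRUE`. To keep it runnable offline it uses a tiny made-up package database; every package name in it (`mypkg`, `pkgB`, and so on) is hypothetical, as is the helper name `reverse_deps`. Against the real CRAN index one would presumably pass the matrix returned by `available.packages()` instead.

```r
# Toy stand-in for the matrix returned by available.packages().
# All package names below are hypothetical.
db <- rbind(
  c(Package = "mypkg", Depends = NA,      Imports = NA,             Suggests = NA),
  c(Package = "pkgB",  Depends = "mypkg", Imports = NA,             Suggests = NA),
  c(Package = "pkgC",  Depends = NA,      Imports = "mypkg, stats", Suggests = NA),
  c(Package = "pkgD",  Depends = NA,      Imports = "pkgC",         Suggests = "mypkg"),
  c(Package = "pkgE",  Depends = NA,      Imports = NA,             Suggests = NA)
)

# Hypothetical helper: direct reverse dependencies of one package.
reverse_deps <- function(pkg, db, which) {
  res <- tools::package_dependencies(pkg, db = db, which = which,
                                     reverse = TRUE)[[pkg]]
  sort(unique(as.character(res)))
}

# Packages that need mypkg in order to install and load.
strong_revdeps <- reverse_deps("mypkg", db, c("Depends", "Imports"))

# Packages that also only suggest mypkg; these can still fail their checks.
all_revdeps <- reverse_deps("mypkg", db, c("Depends", "Imports", "Suggests"))

print(strong_revdeps)
print(all_revdeps)
```

The second list is the one that matters for the situation described above, since a package that merely suggests yours can still break in its tests, examples, or vignettes.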

As with most power-imbalanced situations, the weaker partner starts developing humiliating coping behaviors. For example: over-checking CRAN status when one does have free time, and deleting risky (hence important!) tests and documentation. And there is always the sneaking suspicion that CRAN enforcement may not actually apply to any number of “too big to fail” packages (obviously, that may just be stress or paranoia).

A natural question is: why complain? One agreed to this as a submitter, and all package submitters are subject to this.

I’d say, I didn’t so much agree to it as submit to the requirement for the benefit of having our packages distributed by CRAN. A package being on CRAN is seen as a minimum requirement for many users. So not being up on CRAN can be considered damning.

I actually feel very sick and conflicted criticizing CRAN, as they are in my opinion/experience one of the more non-partisan organizations in the R ecosystem. That isn’t to say things are good, but it is easy to see worse alternatives. It is hard to say if one should worry about the fire from the relative comfort of the frying pan. I feel I am shooting myself in the foot complaining about CRAN, as I do not want CRAN undermined. However, silence is painful. I did not enjoy writing this note.

From a community viewpoint, there are some things I would like to see. I would like to see more engagement from CRAN and R-core (two different organizations) with the larger community. I subscribe to, and contribute to, the R mailing lists. However, I don’t see a lot of solicitation of opinions or RFC (request for comment) style interaction. I get “CRAN is volunteers”. However, some of us are also volunteers.

On the technical side: a lot of the problem is confounding different semantics and severity of errors. The meaning of a test failure is different than the meaning of a documentation glitch.

But, as I said, nobody asked me.

