Follow-up to Counting CRAN Package Depends, Imports and LinkingTo

August 16, 2012

(This article was first published on Thinking inside the box, and kindly contributed to R-bloggers)

A few days ago, I blogged about visualizing CRAN dependency ranks, which
turned out to be a somewhat popular post. David Smith followed up at the REvo
blog, suggesting that packages already shipping with R (as indicated by their
‘Recommended’ priority) be excluded. Good idea!

So here is an updated version, where we limit the display to the top twenty
packages counted by reverse ‘Depends:’, excluding those already shipping
with R such as MASS or Matrix.

[Figure: CRAN package chart of Reverse Depends relations excluding Recommended packages]

The mvtnorm package is still out in front by a wide margin, but we can note
that (cough, cough) our Rcpp package for seamless R and C++ integration is now
tied for second with the coda package for MCMC analysis. Also of note is the
fact that CRAN keeps growing relentlessly and moved from 3969 packages to
3981 packages in the space of these few days…

Lastly, I have been asked about the code and/or data behind this. It is
really pretty simple, as the main data.frame can be had from CRAN
(where I also found the initial few lines to load it). After that, one only
needs a little bit of subsetting as shown below. I look forward to seeing
other people riff on this data set.

## Initial db download, adapted from R Core's code

## this function is essentially the same as R Core's
getDB <- function() {
    contrib.url(getOption("repos")["CRAN"], "source") # trigger chooseCRANmirror() if required
    description <- sprintf("%s/web/packages/packages.rds", getOption("repos")["CRAN"])
    con <- if(substring(description, 1L, 7L) == "file://") {
        file(description, "rb")
    } else {
        url(description, "rb")
    }
    on.exit(close(con))               # make sure the connection is closed again
    db <- readRDS(gzcon(con))
    rownames(db) <- db[,"Package"]
    db
}

db <- getDB()

## count packages
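getCounts <- function(db, col) {
    ## NA stays NA, otherwise count the comma-separated entries in the field
    foo <- sapply(db[,col],
                  function(s) { if (is.na(s)) NA else length(strsplit(s, ",")[[1]]) } )
    names(foo) <- NULL
    foo
}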

## build a data.frame with the number of entries for reverse depends, reverse imports,
## reverse linkingto and reverse suggests; also keep Recommended status
ddall <- data.frame(pkg=db[,1],
                    RDepends=getCounts(db, "Reverse depends"),
                    RImports=getCounts(db, "Reverse imports"),
                    RLinkingTo=getCounts(db, "Reverse linking to"),
                    RSuggests=getCounts(db, "Reverse suggests"),
                    Recommended=db[,"Priority"]=="recommended",  # NA when no priority is set
                    stringsAsFactors=FALSE)

## Subset to non-Recommended packages as in David Smith's follow-up post
dd <- subset(ddall, is.na(ddall[,"Recommended"]) | ddall[,"Recommended"] != TRUE)

labeltxt <- paste("Analysis as of", format(Sys.Date(), "%d %b %Y"),
                  "covering", nrow(db), "total CRAN packages")

cutOff <- 20
doPNG <- FALSE                          # set to TRUE to write the charts to PNG files

if (doPNG) png("/tmp/CRAN_ReverseDepends.png", width=600, height=600)
z <- dd[head(order(dd[,2], decreasing=TRUE), cutOff),c(1,2)]
dotchart(z[,2], labels=z[,1], cex=1, pch=19,
         main="CRAN Packages sorted by Reverse Depends:",
         sub=paste("Limited to top", cutOff, "packages, excluding 'Recommended' ones shipped with R"),
         xlab=labeltxt)
if (doPNG) dev.off()

if (doPNG) png("/tmp/CRAN_ReverseImports.png", width=600, height=600)
z <- dd[head(order(dd[,3], decreasing=TRUE), cutOff),c(1,3)]
dotchart(z[,2], labels=z[,1], cex=1, pch=19,
         main="CRAN Packages sorted by Reverse Imports:",
         sub=paste("Limited to top", cutOff, "packages, excluding 'Recommended' ones shipped with R"),
         xlab=labeltxt)
if (doPNG) dev.off()

# no cutOff but rather a na.omit
if (doPNG) png("/tmp/CRAN_ReverseLinkingTo.png", width=600, height=600)
z <- na.omit(dd[head(order(dd[,4], decreasing=TRUE), 30),c(1,4)])
dotchart(z[,2], labels=z[,1], pch=19,
         main="CRAN Packages sorted by Reverse LinkingTo:",
         xlab=labeltxt)
if (doPNG) dev.off()
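
For anyone who wants to try the counting and subsetting steps without downloading the CRAN db first, here is a minimal sketch on made-up data. The package names, the `toy` matrix and the helper `countEntries` are hypothetical stand-ins for illustration only; the logic is meant to mirror the script above (NA stays NA, otherwise count the comma-separated entries, then drop rows flagged as Recommended).

```r
## Toy stand-in for the CRAN db: a character matrix with made-up entries
toy <- cbind(Package = c("pkgA", "pkgB", "pkgC"),
             "Reverse depends" = c("x, y, z", NA, "w"),
             Priority = c(NA, "recommended", NA))

## Hypothetical helper following the same idea as getCounts():
## NA stays NA, otherwise count the comma-separated entries
countEntries <- function(db, col) {
    unname(sapply(db[, col],
                  function(s) if (is.na(s)) NA_integer_ else length(strsplit(s, ",")[[1]])))
}

toyCounts <- data.frame(pkg = toy[, "Package"],
                        RDepends = countEntries(toy, "Reverse depends"),
                        Recommended = toy[, "Priority"] == "recommended",
                        stringsAsFactors = FALSE)

## Keep rows whose Recommended flag is NA or not TRUE
toyKept <- subset(toyCounts, is.na(Recommended) | Recommended != TRUE)
print(toyKept)
```

The `is.na()` test in the subset matters because packages without a priority carry NA in that column, and a bare `!= TRUE` comparison would drop them.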
