Follow-up to Counting CRAN Package Depends, Imports and LinkingTo
A few days ago, I blogged about visualizing CRAN dependency ranks, which turned out to be a somewhat popular post. David Smith followed up at the REvo blog, suggesting to exclude packages already shipping with R (which is indicated by their ‘Recommended’ priority). Good idea!
So here is an updated version, where we limit the display to the top twenty packages counted by reverse ‘Depends:’, excluding those already shipping with R such as MASS, lattice, survival, Matrix, or nlme.

The mvtnorm package is still ahead by a wide margin, but we can note that (cough, cough) our Rcpp package for seamless R and C++ integration is now tied for second with the coda package for MCMC analysis.
Also of note is the fact that CRAN keeps growing relentlessly and moved from
3969 packages to 3981 packages in the space of these few days…
Lastly, I have been asked about the code and/or data behind this. It is really pretty simple, as the main data.frame can be had from CRAN (where I also found the initial few lines to load it). After that, one only needs a little bit of subsetting as shown below. I look forward to seeing other people riff on this data set.
#!/usr/bin/r
##
## Initial db download from http://developer.r-project.org/CRAN/Scripts/depends.R and adapted

require("tools")

## this function is essentially the same as R Core's from the URL
## http://developer.r-project.org/CRAN/Scripts/depends.R
getDB <- function() {
    contrib.url(getOption("repos")["CRAN"], "source") # trigger chooseCRANmirror() if required
    description <- sprintf("%s/web/packages/packages.rds",
                           getOption("repos")["CRAN"])
    con <- if(substring(description, 1L, 7L) == "file://") {
        file(description, "rb")
    } else {
        url(description, "rb")
    }
    on.exit(close(con))
    db <- readRDS(gzcon(con))
    rownames(db) <- db[,"Package"]
    db
}

db <- getDB()

## count packages: number of comma-separated entries per field, NA if the field is empty
getCounts <- function(db, col) {
    sapply(db[,col],
           function(s) { if (is.na(s)) NA else length(strsplit(s, ",")[[1]]) } )
}

## build a data.frame with the number of entries for reverse depends, reverse imports,
## reverse linkingto and reverse suggests; also keep Recommended status
ddall <- data.frame(pkg=db[,1],
                    RDepends=getCounts(db, "Reverse depends"),
                    RImports=getCounts(db, "Reverse imports"),
                    RLinkingTo=getCounts(db, "Reverse linking to"),
                    RSuggests=getCounts(db, "Reverse suggests"),
                    Recommended=db[,"Priority"]=="recommended"
                    )

## Subset to non-Recommended packages as in David Smith's follow-up post
dd <- subset(ddall, is.na(ddall[,"Recommended"]) | ddall[,"Recommended"] != TRUE)

labeltxt <- paste("Analysis as of", format(Sys.Date(), "%d %b %Y"),
                  "covering", nrow(db), "total CRAN packages")

cutOff <- 20
doPNG <- TRUE

if (doPNG) png("/tmp/CRAN_ReverseDepends.png", width=600, height=600)
z <- dd[head(order(dd[,2], decreasing=TRUE), cutOff),c(1,2)]
dotchart(z[,2], labels=z[,1], cex=1, pch=19,
         main="CRAN Packages sorted by Reverse Depends:",
         sub=paste("Limited to top", cutOff, "packages, excluding 'Recommended' ones shipped with R"),
         xlab=labeltxt)
if (doPNG) dev.off()

if (doPNG) png("/tmp/CRAN_ReverseImports.png", width=600, height=600)
z <- dd[head(order(dd[,3], decreasing=TRUE), cutOff),c(1,3)]
dotchart(z[,2], labels=z[,1], cex=1, pch=19,
         main="CRAN Packages sorted by Reverse Imports:",
         sub=paste("Limited to top", cutOff, "packages, excluding 'Recommended' ones shipped with R"),
         xlab=labeltxt)
if (doPNG) dev.off()

# no cutOff but rather a na.omit
if (doPNG) png("/tmp/CRAN_ReverseLinkingTo.png", width=600, height=600)
z <- na.omit(dd[head(order(dd[,4], decreasing=TRUE), 30),c(1,4)])
dotchart(z[,2], labels=z[,1], pch=19,
         main="CRAN Packages sorted by Reverse LinkingTo:",
         xlab=labeltxt)
if (doPNG) dev.off()
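
For anyone who wants to try the counting and subsetting step without hitting a CRAN mirror, here is a minimal sketch on a made-up three-package matrix. The package names, the toy matrix and the helper countEntries() are assumptions for illustration only and are not part of the script above; the logic is meant to mirror getCounts() and the non-Recommended subset, nothing more.

```r
## Toy stand-in for the CRAN package db: a character matrix with one row per
## package (hypothetical package names and reverse-dependency strings)
toy <- matrix(c("pkgA", "pkgB, pkgC, pkgD", NA,
                "pkgB", NA,                 "recommended",
                "pkgC", "pkgA",             NA),
              ncol = 3, byrow = TRUE,
              dimnames = list(NULL, c("Package", "Reverse depends", "Priority")))
rownames(toy) <- toy[, "Package"]

## Count the comma-separated entries in each field; an empty (NA) field stays NA
countEntries <- function(x) {
    vapply(x,
           function(s) if (is.na(s)) NA_integer_ else length(strsplit(s, ",")[[1]]),
           integer(1))
}

counts <- countEntries(toy[, "Reverse depends"])

## Keep packages that are not flagged 'recommended'; the comparison is NA for
## packages without a Priority entry, hence the explicit is.na() test
isRec <- toy[, "Priority"] == "recommended"
keep <- is.na(isRec) | !isRec

## Rank the remaining packages; sort() drops NA counts by default
ranked <- sort(counts[keep], decreasing = TRUE)
print(ranked)
```

Using vapply() rather than sapply() here is just a design choice to guarantee an integer vector even for odd input; the real script works fine with sapply().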