Dependencies of popular R packages

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

With the growing popularity of R, online forums for asking questions about it have become more popular too. One of the most popular sites is StackOverflow, where more than 60 thousand questions have been asked and tagged as related to R.

On the same page, you can also find related tags. Among the top 15 tags associated with R, several are also packages you can find on CRAN:

  • ggplot2
  • data.table
  • plyr
  • knitr
  • shiny
  • xts
  • lattice

It is very easy to install these packages directly from CRAN using the R function install.packages(), but this will also install all of these packages' dependencies.

This leads to the question: How can one determine all these dependencies?

It is possible to do this by calling the function available.packages() and then querying the resulting object.
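As a rough sketch of that base-R route, the matrix returned by available.packages() can be passed to tools::package_dependencies(). To keep the example runnable offline, the db object below is a small hand-made stand-in: the package names pkgA to pkgD and their dependency fields are invented for illustration and are not real CRAN metadata. In practice you would presumably use db <- available.packages() instead.

```r
# Toy stand-in for the matrix returned by available.packages().
# ASSUMPTION: pkgA..pkgD and their dependency fields are made up.
db <- rbind(
  c(Package = "pkgA", Depends = "R (>= 3.0.0), pkgB", Imports = "pkgC",
    LinkingTo = NA, Suggests = "pkgD"),
  c(Package = "pkgB", Depends = NA, Imports = "pkgC (>= 1.0)",
    LinkingTo = NA, Suggests = NA),
  c(Package = "pkgC", Depends = NA, Imports = NA,
    LinkingTo = NA, Suggests = NA),
  c(Package = "pkgD", Depends = NA, Imports = NA,
    LinkingTo = NA, Suggests = NA)
)
rownames(db) <- db[, "Package"]

# Hard dependencies only, followed recursively through the toy database
hard <- tools::package_dependencies(
  "pkgA", db = db,
  which = c("Depends", "Imports", "LinkingTo"),
  recursive = TRUE
)

# Direct dependencies of pkgA, this time including Suggests
soft <- tools::package_dependencies(
  "pkgA", db = db,
  which = c("Depends", "Imports", "LinkingTo", "Suggests"),
  recursive = FALSE
)

print(hard)
print(soft)
```

Version constraints and the "R (>= ...)" entry are stripped by package_dependencies(), so only package names remain in the result.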

But it is easier to answer this question using the functions in a new package, called miniCRAN, that I am working on. I have designed miniCRAN to allow you to create a mini version of CRAN behind a corporate firewall. You can use some of the functions in miniCRAN to list packages and their dependencies, in particular:

  • pkgAvail()
  • pkgDep()
  • makeDepGraph()

I illustrate these functions in the following scripts.

Start by loading miniCRAN and retrieving the available packages on CRAN. Use the function pkgAvail() to do this:

library(miniCRAN)
pkgdata <- pkgAvail(repos = c(CRAN="http://cran.revolutionanalytics.com"), 
                    type="source")
head(pkgdata[, c("Depends", "Suggests")])
##             Depends                                  Suggests             
## A3          "R (>= 2.15.0), xtable, pbapply"         "randomForest, e1071"
## abc         "R (>= 2.10), nnet, quantreg, MASS"      NA                   
## abcdeFBA    "Rglpk,rgl,corrplot,lattice,R (>= 2.10)" "LIM,sybil"          
## ABCExtremes "SpatialExtremes, combinat"              NA                   
## ABCoptim    NA                                       NA                   
## ABCp2       "MASS"                                   NA
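The Depends and Suggests columns are plain comma-separated strings, sometimes with version constraints and an R version requirement mixed in. The following minimal sketch shows one way to turn such a string into a vector of package names. The helper parse_deps() is hypothetical (it is not part of miniCRAN or base R), and the input strings are copied from the output above.

```r
# Depends strings copied from the pkgAvail() output shown above
depends <- c(
  A3       = "R (>= 2.15.0), xtable, pbapply",
  abcdeFBA = "Rglpk,rgl,corrplot,lattice,R (>= 2.10)",
  ABCoptim = NA
)

# parse_deps() is a hypothetical helper written for this illustration
parse_deps <- function(x) {
  if (is.na(x)) return(character(0))        # no dependencies listed
  parts <- strsplit(x, ",", fixed = TRUE)[[1]]
  parts <- sub("\\(.*$", "", parts)         # drop constraints such as "(>= 2.10)"
  parts <- trimws(parts)                    # drop surrounding whitespace
  parts[nzchar(parts) & parts != "R"]       # drop empties and the R requirement
}

parsed <- lapply(depends, parse_deps)
print(parsed)
```

This only handles a single package's direct dependencies; following them recursively is what pkgDep() does for you.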

 

Next, use the function pkgDep() to get the dependencies of the 7 packages that appear among the popular tags on StackOverflow. Setting suggests=TRUE also includes the packages they suggest:

tags <- c("ggplot2", "data.table", "plyr", "knitr", 
          "shiny", "xts", "lattice")
pkgList <- pkgDep(tags, availPkgs=pkgdata, suggests=TRUE)
pkgList
##  [1] "abind"        "bit64"        "bitops"       "Cairo"       
##  [5] "caTools"      "chron"        "codetools"    "colorspace"  
##  [9] "data.table"   "dichromat"    "digest"       "evaluate"    
## [13] "fastmatch"    "foreach"      "formatR"      "fts"         
## [17] "ggplot2"      "gtable"       "hexbin"       "highr"       
## [21] "Hmisc"        "htmltools"    "httpuv"       "iterators"   
## [25] "itertools"    "its"          "KernSmooth"   "knitr"       
## [29] "labeling"     "lattice"      "mapproj"      "maps"        
## [33] "maptools"     "markdown"     "MASS"         "mgcv"        
## [37] "mime"         "multcomp"     "munsell"      "nlme"        
## [41] "plyr"         "proto"        "quantreg"     "RColorBrewer"
## [45] "Rcpp"         "RCurl"        "reshape"      "reshape2"    
## [49] "rgl"          "RJSONIO"      "scales"       "shiny"       
## [53] "stringr"      "testit"       "testthat"     "timeDate"    
## [57] "timeSeries"   "tis"          "tseries"      "XML"         
## [61] "xtable"       "xts"          "zoo"

 

Wow, look how these 7 packages grow into a list of 63 once their dependencies and suggested packages are included: the 7 themselves plus 56 others!
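A quick base-R check of that count is sketched below. The pkgList vector is copied from the output above so that the snippet runs without miniCRAN or network access. Note that the vector returned by pkgDep() contains the starting packages as well as their dependencies.

```r
tags <- c("ggplot2", "data.table", "plyr", "knitr",
          "shiny", "xts", "lattice")

# Copied from the pkgDep() output printed above
pkgList <- c(
  "abind", "bit64", "bitops", "Cairo",
  "caTools", "chron", "codetools", "colorspace",
  "data.table", "dichromat", "digest", "evaluate",
  "fastmatch", "foreach", "formatR", "fts",
  "ggplot2", "gtable", "hexbin", "highr",
  "Hmisc", "htmltools", "httpuv", "iterators",
  "itertools", "its", "KernSmooth", "knitr",
  "labeling", "lattice", "mapproj", "maps",
  "maptools", "markdown", "MASS", "mgcv",
  "mime", "multcomp", "munsell", "nlme",
  "plyr", "proto", "quantreg", "RColorBrewer",
  "Rcpp", "RCurl", "reshape", "reshape2",
  "rgl", "RJSONIO", "scales", "shiny",
  "stringr", "testit", "testthat", "timeDate",
  "timeSeries", "tis", "tseries", "XML",
  "xtable", "xts", "zoo"
)

length(pkgList)                    # 63 packages in total
all(tags %in% pkgList)             # TRUE: the 7 starting packages are included
others <- setdiff(pkgList, tags)   # packages pulled in beyond the initial 7
length(others)                     # 56
```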

You can visualise these dependencies as a graph by using the function makeDepGraph():

# Build the dependency graph for the packages found above
p <- makeDepGraph(pkgList, availPkgs=pkgdata)
library(igraph)

# Colour the 7 initial packages orange and their dependencies grey
plotColours <- c("grey80", "orange")
topLevel <- as.numeric(V(p)$name %in% tags)
vColor <- plotColours[1 + topLevel]

par(mai=rep(0.25, 4))

# Fix the random seed so that the graph layout is reproducible
set.seed(50)
plot(p, vertex.size=8, edge.arrow.size=0.5,
     vertex.label.cex=0.7, vertex.label.color="black",
     vertex.color=vColor)
legend(x=0.9, y=-0.9, legend=c("Dependencies", "Initial list"),
       col=plotColours, pch=19, cex=0.9)
# Explain how to read the arrows
text(0.9, -0.75, expression(xts %->% zoo), adj=0, cex=0.9)
text(0.9, -0.8, "xts depends on zoo", adj=0, cex=0.9)
title("Package dependency graph")


[Figure: Package dependency graph]
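Beyond plotting, an igraph object can also be queried, for example to see which packages are depended on most. The sketch below uses a tiny hand-made graph rather than the real object p, so it runs without miniCRAN. Apart from the xts to zoo edge taken from the legend above, the vertices pkgA and pkgB and their edges are invented for illustration. It assumes that edges point from a package to the package it depends on, as in the legend, and that a recent version of igraph is installed.

```r
library(igraph)

# Toy edge list: "from" depends on "to".
# ASSUMPTION: pkgA and pkgB and their edges are made up for illustration.
edges <- data.frame(
  from = c("xts", "pkgA", "pkgA", "pkgB"),
  to   = c("zoo", "zoo",  "pkgB", "zoo"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(edges, directed = TRUE)

# In-degree = number of packages in the graph that depend directly on each one
in_deg <- degree(g, mode = "in")
print(sort(in_deg, decreasing = TRUE))
```

Applied to a real dependency graph, the same degree() call would give a rough ranking of the most widely shared dependencies, provided the edge direction matches the assumption above.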

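So, if you want to install the 7 most popular R packages (according to StackOverflow), R will in fact download and install up to 63 different packages! The exact number depends on whether suggested packages are installed as well, which install.packages() does not do by default.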
