Dependencies of popular R packages

July 8, 2014
(This article was first published on Revolutions, and kindly contributed to R-bloggers)

As R grows in popularity, so do the online forums where people ask questions about it. One of the most popular is StackOverflow, where more than 60 thousand questions have been tagged as related to R.

On the same page, you can also find related tags. Among the top 15 tags associated with R, several are also packages you can find on CRAN:

  • ggplot2
  • data.table
  • plyr
  • knitr
  • shiny
  • xts
  • lattice

It is very easy to install these packages directly from CRAN using the R function install.packages(), but this will also install all of the packages they depend on.

This leads to the question: How can one determine all these dependencies?

It is possible to do this using the function available.packages() and then query the resulting object.
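As a minimal sketch of that approach, the snippet below queries a package matrix with tools::package_dependencies(), which ships with R. The small matrix db is a hypothetical stand-in for the result of available.packages(), and its dependency entries are illustrative only, so that the example runs without network access.

```r
# Hypothetical stand-in for the matrix returned by available.packages().
# With network access you would use: db <- available.packages()
db <- matrix(
  c("xts",     "zoo",           NA,        NA,
    "zoo",     "R (>= 2.10.0)", "lattice", NA,
    "lattice", "R (>= 2.15.1)", NA,        NA),
  ncol = 4, byrow = TRUE,
  dimnames = list(c("xts", "zoo", "lattice"),
                  c("Package", "Depends", "Imports", "LinkingTo"))
)

# Follow Depends, Imports and LinkingTo recursively, starting from xts
deps <- tools::package_dependencies(
  "xts", db = db,
  which = c("Depends", "Imports", "LinkingTo"),
  recursive = TRUE
)
deps
```

The result is a named list with one character vector of dependencies per requested package.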

But it is easier to answer this question using the functions in a new package, called miniCRAN, that I am working on. I have designed miniCRAN to allow you to create a mini version of CRAN behind a corporate firewall. You can use some of the functions in miniCRAN to list packages and their dependencies, in particular:

  • pkgAvail()
  • pkgDep()
  • makeDepGraph()

I illustrate these functions in the following scripts.

Start by loading miniCRAN and retrieving the available packages on CRAN. Use the function pkgAvail() to do this:

library(miniCRAN)
pkgdata <- pkgAvail(repos = c(CRAN="http://cran.revolutionanalytics.com"), 
                    type="source")
head(pkgdata[, c("Depends", "Suggests")])
##             Depends                                  Suggests             
## A3          "R (>= 2.15.0), xtable, pbapply"         "randomForest, e1071"
## abc         "R (>= 2.10), nnet, quantreg, MASS"      NA                   
## abcdeFBA    "Rglpk,rgl,corrplot,lattice,R (>= 2.10)" "LIM,sybil"          
## ABCExtremes "SpatialExtremes, combinat"              NA                   
## ABCoptim    NA                                       NA                   
## ABCp2       "MASS"                                   NA
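Each Depends entry is a single comma-separated string that may carry version constraints and a requirement on R itself. The following base-R sketch shows one way to turn such a string into plain package names. The helper parse_deps() is hypothetical and not part of miniCRAN; the sample strings are copied from the output above.

```r
# Sample Depends fields copied from the output above
depends <- c(
  A3       = "R (>= 2.15.0), xtable, pbapply",
  abcdeFBA = "Rglpk,rgl,corrplot,lattice,R (>= 2.10)",
  ABCoptim = NA
)

# Hypothetical helper: split one field into bare package names
parse_deps <- function(field) {
  if (is.na(field)) return(character(0))
  parts <- strsplit(field, ",", fixed = TRUE)[[1]]
  parts <- trimws(sub("\\(.*\\)", "", parts))  # drop version constraints
  parts[nzchar(parts) & parts != "R"]          # drop the R requirement
}

lapply(depends, parse_deps)
```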

 

Next, use the function pkgDep() to get the dependencies of the 7 packages that are popular tags on StackOverflow:

tags <- c("ggplot2", "data.table", "plyr", "knitr", 
          "shiny", "xts", "lattice")
pkgList <- pkgDep(tags, availPkgs=pkgdata, suggests=TRUE)
pkgList
##  [1] "abind"        "bit64"        "bitops"       "Cairo"       
##  [5] "caTools"      "chron"        "codetools"    "colorspace"  
##  [9] "data.table"   "dichromat"    "digest"       "evaluate"    
## [13] "fastmatch"    "foreach"      "formatR"      "fts"         
## [17] "ggplot2"      "gtable"       "hexbin"       "highr"       
## [21] "Hmisc"        "htmltools"    "httpuv"       "iterators"   
## [25] "itertools"    "its"          "KernSmooth"   "knitr"       
## [29] "labeling"     "lattice"      "mapproj"      "maps"        
## [33] "maptools"     "markdown"     "MASS"         "mgcv"        
## [37] "mime"         "multcomp"     "munsell"      "nlme"        
## [41] "plyr"         "proto"        "quantreg"     "RColorBrewer"
## [45] "Rcpp"         "RCurl"        "reshape"      "reshape2"    
## [49] "rgl"          "RJSONIO"      "scales"       "shiny"       
## [53] "stringr"      "testit"       "testthat"     "timeDate"    
## [57] "timeSeries"   "tis"          "tseries"      "XML"         
## [61] "xtable"       "xts"          "zoo"

 

Wow, these 7 packages expand to a list of 63 packages: the 7 themselves plus 56 others that they depend on or suggest!
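Note that the list returned by pkgDep() includes the requested packages themselves. A small sketch, re-typing the package names printed above so that it runs without miniCRAN, of how setdiff() separates the requested packages from the additional ones:

```r
tags <- c("ggplot2", "data.table", "plyr", "knitr",
          "shiny", "xts", "lattice")

# Package names copied from the pkgDep() output above
pkgList <- c(
  "abind", "bit64", "bitops", "Cairo", "caTools", "chron", "codetools",
  "colorspace", "data.table", "dichromat", "digest", "evaluate",
  "fastmatch", "foreach", "formatR", "fts", "ggplot2", "gtable", "hexbin",
  "highr", "Hmisc", "htmltools", "httpuv", "iterators", "itertools", "its",
  "KernSmooth", "knitr", "labeling", "lattice", "mapproj", "maps",
  "maptools", "markdown", "MASS", "mgcv", "mime", "multcomp", "munsell",
  "nlme", "plyr", "proto", "quantreg", "RColorBrewer", "Rcpp", "RCurl",
  "reshape", "reshape2", "rgl", "RJSONIO", "scales", "shiny", "stringr",
  "testit", "testthat", "timeDate", "timeSeries", "tis", "tseries", "XML",
  "xtable", "xts", "zoo"
)

extra <- setdiff(pkgList, tags)  # packages that were not requested directly

length(pkgList)         # size of the full list
length(extra)           # number of additional packages
all(tags %in% pkgList)  # are the requested packages part of the list?
```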

You can visualise these dependencies as a graph by using the function makeDepGraph():

library(igraph)

# Build the dependency graph for every package in pkgList
p <- makeDepGraph(pkgList, availPkgs=pkgdata)

# Colour the 7 initial packages orange and their dependencies grey
plotColours <- c("grey80", "orange")
topLevel <- as.numeric(V(p)$name %in% tags)
vColor <- plotColours[1 + topLevel]

par(mai=rep(0.25, 4))   # narrow plot margins
set.seed(50)            # make the graph layout reproducible
plot(p, vertex.size=8, edge.arrow.size=0.5,
     vertex.label.cex=0.7, vertex.label.color="black",
     vertex.color=vColor)
legend(x=0.9, y=-0.9, legend=c("Dependencies", "Initial list"),
       col=plotColours, pch=19, cex=0.9)
# Explain the arrow direction next to the legend
text(0.9, -0.75, expression(xts %->% zoo), adj=0, cex=0.9)
text(0.9, -0.8, "xts depends on zoo", adj=0, cex=0.9)
title("Package dependency graph")
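The vertex colouring in the script relies on a small indexing trick: a 0/1 indicator plus 1 selects the first or second element of the colour vector. A base-R sketch with hypothetical vertex names (demoTags and vertexNames are made up for illustration) shows the idea without needing igraph:

```r
plotColours <- c("grey80", "orange")
demoTags    <- c("xts", "lattice")           # hypothetical initial list
vertexNames <- c("xts", "zoo", "lattice")    # hypothetical vertex names

# 1 if the vertex is in the initial list, 0 otherwise
topLevel <- as.numeric(vertexNames %in% demoTags)

# Index 1 picks "grey80" (dependency), index 2 picks "orange" (initial list)
vColor <- plotColours[1 + topLevel]
setNames(vColor, vertexNames)
```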


[Figure: Package dependency graph]

So, if you want to install the 7 most popular R packages (according to StackOverflow), R will in fact download and install up to 63 different packages! (That count includes suggested packages, because pkgDep() was called with suggests=TRUE.)
