Installing Package Dependencies without external http(s) requests

February 24, 2018

(This article was first published on Peter E. DeWitt, and kindly contributed to R-bloggers)

Suppose you have a server that is running behind a firewall and, for security
reasons, cannot make external http(s) requests. Further, you have R running on
this server and you need to install a set of packages. The simple approach of

install.packages("<package name>", repos = "<CRAN mirror URL>")

is not an option since you will have no access to the CRAN repository.

Another option would be to download the source files (.tar.gz files) from CRAN
or BioConductor, transfer those files to the server via FTP, and then install the
packages via

install.packages("<path/to/package>.tar.gz", repos = NULL, type = "source")

This approach will work well, with one big exception: the dependencies of the
package may not be on the server. How do you get the source files for all the
dependencies of your package? What about the dependencies of the dependencies,
and the dependencies of the dependencies of the dependencies? Simply, how do
you install R packages on a machine that is not allowed to make external http(s)
requests?

Here is how I approached this problem. On my local machine, a machine with
internet access, I ran a script (shown and explained in detail below) which
downloads all the dependencies, the dependencies of dependencies, etc., from
both CRAN and BioConductor, and generates a makefile
to install the packages in the correct order, i.e., in an order such that the
dependencies are met.

Once the script has finished, the source files and the makefile can be
transferred to the server that lacks external http(s) request authority. Running
the makefile will install the packages, and it is an easy way to track and report
install errors.

We need to define what constitutes a dependency. In a package's DESCRIPTION
file, the packages listed under the fields Depends, Imports, and LinkingTo are
what we will consider dependencies. Suggests and Enhances are omitted as
they are not needed for the package to work.
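
To make this definition concrete, here is a minimal sketch that pulls those three fields out of a DESCRIPTION file with read.dcf. The DESCRIPTION content and the package name examplepkg are made up for illustration; the script described below does not parse DESCRIPTION files itself.

```r
# A made-up DESCRIPTION file, written to a temporary location for illustration.
desc_file <- tempfile("DESCRIPTION-")
writeLines(c("Package: examplepkg",
             "Version: 0.1.0",
             "Depends: R (>= 3.0.2), methods",
             "Imports: Rcpp (>= 0.11.1), magrittr",
             "LinkingTo: Rcpp",
             "Suggests: testthat"),
           desc_file)

# Read only the fields we treat as dependencies; Suggests is ignored.
desc <- read.dcf(desc_file, fields = c("Depends", "Imports", "LinkingTo"))

# Split the comma-separated fields, then drop version requirements, the
# entry for R itself, and repeated package names.
deps <- unlist(strsplit(desc[1, ], ",", fixed = TRUE), use.names = FALSE)
deps <- trimws(sub("\\(.*\\)", "", deps))
deps <- unique(deps[!is.na(deps) & deps != "R"])
print(deps)
```

Note that methods still appears in the result; base and recommended packages such as this one are filtered out in a separate step.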

Build Dependencies

An R script build-dep-list.R has been written and is expected to be evaluated
from the command line via

Rscript --vanilla build-dep-list.R [pkg1] [pkg2] [...] [pkgn]

where pkg1 is the name of the first known package to download, pkg2 the
second known package to download, ..., and pkgn the nth package to download.
The script will download all the dependencies for pkg1, ..., pkgn, and the
dependencies of the dependencies, and so on. The script will also generate a
makefile to help with the installation of the packages, aiming to get the
order of the installs so that the install of pkg1, ..., pkgn will not
error.

The full script can be found on my GitHub page. The script is broken up into
pieces here with additional detail and explanation.

When I develop scripts that I expect to be evaluated in the terminal, I start
the script with a check of interactive(). If in an interactive session, we'll
set variables to values needed for testing and development; if not in
an interactive session, we'll use the command line arguments to define the values
of the variables. This could also be edited so that the expected evaluation
would be done in an interactive session. For this work we will use the
character vector OUR_PACKAGES to store the names of the packages we
want/need to install.

if (interactive()) {
  OUR_PACKAGES <- c("graph", "gRbase", "gRain", "jsonlite", "plotly", "SHELF",
                    "rjson", "svglite", "magrittr")
} else {
  OUR_PACKAGES <- commandArgs(trailingOnly = TRUE)
}

We also need to define the repositories which we will query for the packages.
We’ll use RStudio’s CRAN mirror and the repository for BioConductor.

# Repositories to look for packages
CRAN <- "https://cran.rstudio.com/"
BIOC <- "https://bioconductor.org/packages/release/bioc/"

Now, let's look into the packages. Packages are classified into three priority
classes: "base", "recommended", and NA. The "base" packages are part of every R
installation, and the "recommended" packages are included in any standard
installation of R. All other packages have Priority == NA.

ipkgs <- utils::installed.packages()
ipkgs[ipkgs[, "Priority"] %in% "base", "Package"]
##        base    compiler    datasets    graphics   grDevices        grid 
##      "base"  "compiler"  "datasets"  "graphics" "grDevices"      "grid" 
##     methods    parallel     splines       stats      stats4       tcltk 
##   "methods"  "parallel"   "splines"     "stats"    "stats4"     "tcltk" 
##       tools       utils 
##     "tools"     "utils"
ipkgs[ipkgs[, "Priority"] %in% "recommended", "Package"]
##         boot        class      cluster    codetools      foreign 
##       "boot"      "class"    "cluster"  "codetools"    "foreign" 
##   KernSmooth      lattice         MASS       Matrix         mgcv 
## "KernSmooth"    "lattice"       "MASS"     "Matrix"       "mgcv" 
##         nnet        rpart      spatial     survival 
##       "nnet"      "rpart"    "spatial"   "survival"

Some packages will have dependencies on the "base" and/or "recommended"
packages. We will need to know these packages and omit them from the packages
we will need to download and install.

# Query the installed packages once and keep the base/recommended names
ipkgs <- utils::installed.packages()
base_pkgs <-
  unname(ipkgs[ipkgs[, "Priority"] %in% c("base", "recommended"), "Package"])

Next step, get a list of the available packages from CRAN and BioConductor. The
return from available.packages is a matrix with all the information we will
need about the packages.

available_pkgs <- available.packages(repos = c(CRAN, BIOC))
str(available_pkgs)
##  chr [1:13659, 1:17] "A3" "abbyyR" "abc" "abc.data" "ABC.RAP" ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:13659] "A3" "abbyyR" "abc" "abc.data" ...
##   ..$ : chr [1:17] "Package" "Version" "Priority" "Depends" ...

available_pkgs[available_pkgs[, "Package"] %in% OUR_PACKAGES,
               c("Package", "Version", "Depends", "Imports", "LinkingTo",
                 "Repository")]
##          Package    Version 
## gRain    "gRain"    "1.3-0" 
## gRbase   "gRbase"   "1.8-3" 
## jsonlite "jsonlite" "1.5"   
## magrittr "magrittr" "1.5"   
## plotly   "plotly"   "4.7.1" 
## rjson    "rjson"    "0.2.15"
## SHELF    "SHELF"    "1.3.0" 
## svglite  "svglite"  "1.2.1" 
## graph    "graph"    "1.56.0"
##          Depends                                          
## gRain    "R (>= 3.0.2), methods, gRbase (>= 1.7-2)"       
## gRbase   "R (>= 3.0.2), methods"                          
## jsonlite "methods"                                        
## magrittr NA                                               
## plotly   "R (>= 3.2.0), ggplot2 (>= 2.2.1)"               
## rjson    "R (>= 3.1.0)"                                   
## SHELF    "R (>= 3.3.1)"                                   
## svglite  "R (>= 3.0.0)"                                   
## graph    "R (>= 2.10), methods, BiocGenerics (>= 0.13.11)"
##          Imports                                                                                                                                                                                                     
## gRain    "igraph, graph, magrittr, functional, Rcpp (>= 0.11.1)"                                                                                                                                                     
## gRbase   "graph, igraph, magrittr, Matrix, RBGL, Rcpp (>= 0.11.1)"                                                                                                                                                   
## jsonlite NA                                                                                                                                                                                                          
## magrittr NA                                                                                                                                                                                                          
## plotly   "tools, scales, httr, jsonlite, magrittr, digest, viridisLite,\nbase64enc, htmltools, htmlwidgets (>= 0.9), tidyr, hexbin,\nRColorBrewer, dplyr, tibble, lazyeval (>= 0.2.0), crosstalk,\npurrr, data.table"
## rjson    NA                                                                                                                                                                                                          
## SHELF    "ggplot2, grid, shiny, stats, graphics, tidyr, MASS, ggExtra"                                                                                                                                               
## svglite  "Rcpp, gdtools (>= 0.1.6)"                                                                                                                                                                                  
## graph    "stats, stats4, utils"                                                                                                                                                                                      
##          LinkingTo                                                       
## gRain    "Rcpp (>= 0.11.1), RcppArmadillo, RcppEigen, gRbase (>=\n1.8-0)"
## gRbase   "Rcpp (>= 0.11.1), RcppArmadillo, RcppEigen"                    
## jsonlite NA                                                              
## magrittr NA                                                              
## plotly   NA                                                              
## rjson    NA                                                              
## SHELF    NA                                                              
## svglite  "Rcpp, gdtools, BH"                                             
## graph    NA                                                              
##          Repository                                                  
## gRain    "https://cran.rstudio.com/src/contrib"                      
## gRbase   "https://cran.rstudio.com/src/contrib"                      
## jsonlite "https://cran.rstudio.com/src/contrib"                      
## magrittr "https://cran.rstudio.com/src/contrib"                      
## plotly   "https://cran.rstudio.com/src/contrib"                      
## rjson    "https://cran.rstudio.com/src/contrib"                      
## SHELF    "https://cran.rstudio.com/src/contrib"                      
## svglite  "https://cran.rstudio.com/src/contrib"                      
## graph    "https://bioconductor.org/packages/release/bioc/src/contrib"

In this example we see that the packages listed in OUR_PACKAGES, except the
graph package, can be downloaded from CRAN. graph and at least one of its
dependencies, BiocGenerics, will need to be downloaded from BioConductor.

The next step in building the list of dependencies and a script for installing
them is done in the following while loop. We start with a character vector
pkgs_to_download which is initially equivalent to OUR_PACKAGES. We will
iterate through this vector, appending the dependencies in order.
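We use the tools::package_dependencies function to look up the direct
dependencies of each package; because the dependencies are appended to the
vector being iterated over, the dependencies of dependencies, and so on, are
picked up in later iterations.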

In the while loop we get a list of the dependencies for a package, stored in
the deps object. We will omit any of the base and recommended packages from
the deps object and then append deps to the pkgs_to_download vector in the
position immediately to the right of the current package being looked up. When
the indexer i is incremented, the next package to be considered will be the
first dependency. This process continues until all the dependencies have been
explored. Lastly, we reverse the order of the elements of pkgs_to_download
so that we have the packages listed in a useful install order, i.e.,
pkgs_to_download[1] should be installed before pkgs_to_download[2], etc.
After reversing the order of the elements of pkgs_to_download we look only at
the unique elements. By default, the first occurrence of an element will be
kept and the repeated elements will be omitted. By reversing the order and then
taking the unique values, the deepest level of dependency will be retained for a
specific package.

pkgs_to_download <- OUR_PACKAGES
i <- 1L
while(i <= length(pkgs_to_download)) {
  deps <-
    unlist(tools::package_dependencies(packages = pkgs_to_download[i],
                                       which = c("Depends", "Imports", "LinkingTo"),
                                       db = available_pkgs,
                                       recursive = FALSE),
           use.names = FALSE) 
  deps <- deps[!(deps %in% base_pkgs)]
  pkgs_to_download <- append(pkgs_to_download, deps, i) 
  i <- i + 1L
}
pkgs_to_download <- unique(rev(pkgs_to_download))

If you are having a difficult time envisioning what the above does, let’s look
at an example for the dplyr package. In this example we'll print out the
list of dependencies at each step through the while loop. Note that packages
such as Rcpp will be assessed multiple times, but the final list will only
have Rcpp listed once.

dplyr_dependencies <- "dplyr"
i <- 1L
while(i <= length(dplyr_dependencies)) {

  cat("\ni =", i, "\nLooking up dependencies for", dplyr_dependencies[i], "\n")
  deps <-
    unlist(tools::package_dependencies(packages = dplyr_dependencies[i],
                                       which = c("Depends", "Imports", "LinkingTo"),
                                       db = available_pkgs,
                                       recursive = FALSE),
           use.names = FALSE) 
  deps <- deps[!(deps %in% base_pkgs)]
  dplyr_dependencies <- append(dplyr_dependencies, deps, i) 
  
  cat(dplyr_dependencies[i], "dependencies:", paste(deps, collapse = ", "),
      "\ndplyr_dependencies =", paste(dplyr_dependencies, collapse = ", "), "\n")

  i <- i + 1L
}
## 
## i = 1 
## Looking up dependencies for dplyr 
## dplyr dependencies: assertthat, bindrcpp, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## dplyr_dependencies = dplyr, assertthat, bindrcpp, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 2 
## Looking up dependencies for assertthat 
## assertthat dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 3 
## Looking up dependencies for bindrcpp 
## bindrcpp dependencies: Rcpp, bindr, plogr 
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 4 
## Looking up dependencies for Rcpp 
## Rcpp dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 5 
## Looking up dependencies for bindr 
## bindr dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 6 
## Looking up dependencies for plogr 
## plogr dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 7 
## Looking up dependencies for glue 
## glue dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 8 
## Looking up dependencies for magrittr 
## magrittr dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 9 
## Looking up dependencies for pkgconfig 
## pkgconfig dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 10 
## Looking up dependencies for rlang 
## rlang dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 11 
## Looking up dependencies for R6 
## R6 dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 12 
## Looking up dependencies for Rcpp 
## Rcpp dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, BH, plogr 
## 
## i = 13 
## Looking up dependencies for tibble 
## tibble dependencies: cli, crayon, pillar, rlang 
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, crayon, pillar, rlang, BH, plogr 
## 
## i = 14 
## Looking up dependencies for cli 
## cli dependencies: assertthat, crayon 
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, rlang, BH, plogr 
## 
## i = 15 
## Looking up dependencies for assertthat 
## assertthat dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, rlang, BH, plogr 
## 
## i = 16 
## Looking up dependencies for crayon 
## crayon dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, rlang, BH, plogr 
## 
## i = 17 
## Looking up dependencies for crayon 
## crayon dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, rlang, BH, plogr 
## 
## i = 18 
## Looking up dependencies for pillar 
## pillar dependencies: cli, crayon, rlang, utf8 
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, crayon, rlang, utf8, rlang, BH, plogr 
## 
## i = 19 
## Looking up dependencies for cli 
## cli dependencies: assertthat, crayon 
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, assertthat, crayon, crayon, rlang, utf8, rlang, BH, plogr 
## 
## i = 20 
## Looking up dependencies for assertthat 
## assertthat dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, assertthat, crayon, crayon, rlang, utf8, rlang, BH, plogr 
## 
## i = 21 
## Looking up dependencies for crayon 
## crayon dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, assertthat, crayon, crayon, rlang, utf8, rlang, BH, plogr 
## 
## i = 22 
## Looking up dependencies for crayon 
## crayon dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, assertthat, crayon, crayon, rlang, utf8, rlang, BH, plogr 
## 
## i = 23 
## Looking up dependencies for rlang 
## rlang dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, assertthat, crayon, crayon, rlang, utf8, rlang, BH, plogr 
## 
## i = 24 
## Looking up dependencies for utf8 
## utf8 dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, assertthat, crayon, crayon, rlang, utf8, rlang, BH, plogr 
## 
## i = 25 
## Looking up dependencies for rlang 
## rlang dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, assertthat, crayon, crayon, rlang, utf8, rlang, BH, plogr 
## 
## i = 26 
## Looking up dependencies for BH 
## BH dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, assertthat, crayon, crayon, rlang, utf8, rlang, BH, plogr 
## 
## i = 27 
## Looking up dependencies for plogr 
## plogr dependencies:  
## dplyr_dependencies = dplyr, assertthat, bindrcpp, Rcpp, bindr, plogr, glue, magrittr, pkgconfig, rlang, R6, Rcpp, tibble, cli, assertthat, crayon, crayon, pillar, cli, assertthat, crayon, crayon, rlang, utf8, rlang, BH, plogr
dplyr_dependencies <- unique(rev(dplyr_dependencies))
dplyr_dependencies
##  [1] "plogr"      "BH"         "rlang"      "utf8"       "crayon"    
##  [6] "assertthat" "cli"        "pillar"     "tibble"     "Rcpp"      
## [11] "R6"         "pkgconfig"  "magrittr"   "glue"       "bindr"     
## [16] "bindrcpp"   "dplyr"
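
The same ordering logic can be tried without network access on a small dependency table. The sketch below is only an illustration: the package names (app, plot, io, core, utf) and the helper install_order are made up, and a plain named list stands in for the lookup done by tools::package_dependencies.

```r
# Hypothetical direct-dependency table: each name is a package and each value
# holds that package's direct dependencies. All names are made up.
direct_deps <- list(
  app  = c("plot", "io"),
  plot = "core",
  io   = c("core", "utf"),
  core = character(0),
  utf  = character(0)
)

# Same approach as the while loop above: insert each package's dependencies
# immediately after it, then reverse and keep the first occurrence of each.
install_order <- function(targets, direct_deps) {
  pkgs <- targets
  i <- 1L
  while (i <= length(pkgs)) {
    deps <- direct_deps[[pkgs[i]]]
    pkgs <- append(pkgs, deps, after = i)
    i <- i + 1L
  }
  unique(rev(pkgs))
}

ord <- install_order("app", direct_deps)
print(ord)
```

Every package appears after all of its dependencies, so installing in this order should satisfy each requirement in turn. One caveat of this sketch: it would never terminate if the table contained a circular dependency, since the vector would keep growing.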

Now that we have pkgs_to_download, a character vector of package names that
we need to download, we can use the download.packages function to do so. The
object dwnld_pkgs is a 2 column matrix with the name and file path to the
source file for each package.

# Download the needed packages into the pkg-source-files directory
unlink("pkg-source-files/*")
dir.create("pkg-source-files/", showWarnings = FALSE)

dwnld_pkgs <-
  download.packages(pkgs = pkgs_to_download,
                    destdir = "pkg-source-files",
                    repos = c(CRAN, BIOC),
                    type = "source")

head(dwnld_pkgs)
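
Since download.packages reports a package it cannot find or fetch with a warning rather than an error, it may be worth confirming that every requested package actually came back. The following is a small sketch of such a check; the helper missing_downloads and the example matrix are hypothetical and are not part of the original script.

```r
# Report requested packages that are absent from a download.packages() result.
# `wanted` is a character vector of package names; `dwnld` is the two-column
# character matrix (package name, file path) described above.
missing_downloads <- function(wanted, dwnld) {
  setdiff(wanted, dwnld[, 1])
}

# Made-up stand-in for a real download.packages() result.
example_dwnld <- cbind(
  c("magrittr", "Rcpp"),
  c("pkg-source-files/magrittr_1.5.tar.gz",
    "pkg-source-files/Rcpp_0.12.15.tar.gz")
)

not_downloaded <- missing_downloads(c("magrittr", "Rcpp", "BH"), example_dwnld)
print(not_downloaded)
```

In the real script the call would be missing_downloads(pkgs_to_download, dwnld_pkgs); an empty result means every package has a source file ready to transfer.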

The last step for the script, run on a machine with external http(s) request
authority, is to build a makefile to install all the needed packages. I
prefer the makefile over a bash script because of the default error handling
that make provides compared to a bash script.

cat("all:\n",
    paste0("\tR CMD INSTALL ", dwnld_pkgs[, 2], "\n"),
    sep = "",
    file = "makefile") 

For this example, the first several lines of the makefile are:

all:
	R CMD INSTALL pkg-source-files/magrittr_1.5.tar.gz
	R CMD INSTALL pkg-source-files/BH_1.66.0-1.tar.gz
	R CMD INSTALL pkg-source-files/withr_2.1.1.tar.gz
	R CMD INSTALL pkg-source-files/Rcpp_0.12.15.tar.gz
	R CMD INSTALL pkg-source-files/gdtools_0.1.6.tar.gz
	R CMD INSTALL pkg-source-files/svglite_1.2.1.tar.gz

Note that magrittr is the last package in the OUR_PACKAGES object and has no
dependencies, thus is the first package installed. The svglite package is the
second to last package in OUR_PACKAGES and it will be installed after the
dependencies BH, withr, Rcpp, and gdtools are installed.
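
By default, make requires each recipe line to begin with a tab character, which is easy to lose when copying a makefile around. The sketch below writes a makefile in the same way as above, but to a temporary file and with made-up file paths standing in for dwnld_pkgs[, 2], and then checks the leading tabs.

```r
# Made-up file paths standing in for dwnld_pkgs[, 2].
src_files <- c("pkg-source-files/BH_1.66.0-1.tar.gz",
               "pkg-source-files/Rcpp_0.12.15.tar.gz")

# Write to a temporary file so the sketch does not overwrite a real makefile.
mk <- tempfile("makefile-")
cat("all:\n",
    paste0("\tR CMD INSTALL ", src_files, "\n"),
    sep = "",
    file = mk)

# Read the file back: every line after the target must start with a tab.
mk_lines <- readLines(mk)
recipe_ok <- startsWith(mk_lines[-1], "\t")
print(mk_lines)
print(all(recipe_ok))
```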

Installing on the Remote Machine

Now that the source files have been downloaded and the makefile generated,
move the pkg-source-files directory and the makefile to the remote machine
and run the makefile. If the makefile fails, there might be some system
dependencies that need to be updated.
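
After the makefile has run, one way to confirm the result from within R on the remote machine is to compare the wanted packages against what is installed. This is a sketch only; the helper not_installed is hypothetical and not part of the original script.

```r
# Return the packages from `wanted` that are not found in the given library
# paths (NULL means all of the library paths currently known to R).
not_installed <- function(wanted, lib.loc = NULL) {
  installed <- rownames(utils::installed.packages(lib.loc = lib.loc))
  setdiff(wanted, installed)
}

# "stats" ships with R; the second name is deliberately fictitious.
still_missing <- not_installed(c("stats", "noSuchPackageForThisExample"))
print(still_missing)
```

Passing the full vector of package names used to build the makefile should return an empty character vector once every install has succeeded.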

Download the script and/or contribute

The build-dependency-list.R file can be found on my GitHub page.
