Welcome to the tenth post in the rarely ranting R recommendations series, or R4 for short. A few days ago we showed how to tell the linker to strip shared libraries. As discussed in that post, there are two options. One can either set up ~/.R/Makevars to pass the strip-debug option to the linker or, alternatively, adjust src/Makevars in the package itself with a bit of Makefile magic.
Of course, there is a third way: just run strip --strip-debug over all the shared libraries after the build. As the path is standardized, and the shell does proper globbing, we can just do
$ strip --strip-debug /usr/local/lib/R/site-library/*/libs/*.so
using a double wildcard to get all packages (in that R package directory) and all their shared libraries. Users on macOS probably want .dylib on the end, users on Windows want another computer as usual (just kidding: use .dll). Either may have to adjust the path, which is left as an exercise for the reader.
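For those who do have to adjust path or extension, here is a minimal sketch of a helper that lists what the double wildcard would match before anything gets stripped. The function name list_pkg_libs and its two arguments are made up for this illustration and are not part of R or of strip. The default path is simply the one used above.

```shell
# List the shared libraries of all packages under one R library directory.
# Hypothetical helper for illustration only:
#   $1  library directory (defaults to the path used in this post)
#   $2  file extension without the dot (defaults to "so")
list_pkg_libs() {
    libdir="${1:-/usr/local/lib/R/site-library}"
    ext="${2:-so}"
    for f in "$libdir"/*/libs/*."$ext"; do
        # An unmatched glob stays literal, so keep only files that exist.
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
        fi
    done
    return 0
}

# Possible use: inspect the list first, then strip each file in turn.
#   list_pkg_libs "$HOME/R/library" so
#   list_pkg_libs "$HOME/R/library" so | while IFS= read -r f; do
#       strip --strip-debug "$f"
#   done
```

Note that --strip-debug is the GNU binutils spelling. Other strip implementations may name the option differently, so a look at the local man page is a good idea before running this anywhere but Linux.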
The impact can be Yuge as illustrated in the following dotplot:
This illustration is in response to a mailing list post. Last week, someone claimed on r-help that tidyverse would not install on Ubuntu 17.04. And this is of course patently false as many of us build and test on Ubuntu and related Linux systems, Travis runs on it, CRAN tests them etc pp. That poor user had somehow messed up their default gcc version. Anyway: I fired up a Docker container, installed r-base-core plus three required -dev packages (for xml2, openssl, and curl) and ran a single install.packages("tidyverse"). In a nutshell, following the launch of Docker for an Ubuntu 17.04 container, it was just
$ apt-get update
$ apt-get install r-base libcurl4-openssl-dev libssl-dev libxml2-dev
$ apt-get install mg                   # a tiny editor
$ mg /etc/R/Rprofile.site              # to add a default CRAN repo
$ R -e 'install.packages("tidyverse")'
which not only worked (as expected) but also installed a whopping fifty-one packages (!!) of which twenty-six contain a shared library. A useful little trick is to run du with proper options to total, summarize, and use human units which reveals that these libraries occupy seventy-eight megabytes:
root@de443801b3fc:/# du -csh /usr/local/lib/R/site-library/*/libs/*so
4.3M    /usr/local/lib/R/site-library/Rcpp/libs/Rcpp.so
2.3M    /usr/local/lib/R/site-library/bindrcpp/libs/bindrcpp.so
144K    /usr/local/lib/R/site-library/colorspace/libs/colorspace.so
204K    /usr/local/lib/R/site-library/curl/libs/curl.so
328K    /usr/local/lib/R/site-library/digest/libs/digest.so
33M     /usr/local/lib/R/site-library/dplyr/libs/dplyr.so
36K     /usr/local/lib/R/site-library/glue/libs/glue.so
3.2M    /usr/local/lib/R/site-library/haven/libs/haven.so
272K    /usr/local/lib/R/site-library/jsonlite/libs/jsonlite.so
52K     /usr/local/lib/R/site-library/lazyeval/libs/lazyeval.so
64K     /usr/local/lib/R/site-library/lubridate/libs/lubridate.so
16K     /usr/local/lib/R/site-library/mime/libs/mime.so
124K    /usr/local/lib/R/site-library/mnormt/libs/mnormt.so
372K    /usr/local/lib/R/site-library/openssl/libs/openssl.so
772K    /usr/local/lib/R/site-library/plyr/libs/plyr.so
92K     /usr/local/lib/R/site-library/purrr/libs/purrr.so
13M     /usr/local/lib/R/site-library/readr/libs/readr.so
4.7M    /usr/local/lib/R/site-library/readxl/libs/readxl.so
1.2M    /usr/local/lib/R/site-library/reshape2/libs/reshape2.so
160K    /usr/local/lib/R/site-library/rlang/libs/rlang.so
928K    /usr/local/lib/R/site-library/scales/libs/scales.so
4.9M    /usr/local/lib/R/site-library/stringi/libs/stringi.so
1.3M    /usr/local/lib/R/site-library/tibble/libs/tibble.so
2.0M    /usr/local/lib/R/site-library/tidyr/libs/tidyr.so
1.2M    /usr/local/lib/R/site-library/tidyselect/libs/tidyselect.so
4.7M    /usr/local/lib/R/site-library/xml2/libs/xml2.so
78M     total
root@de443801b3fc:/#
Looks like dplyr wins this one at thirty-three megabytes just for its shared library.
But with a single stroke of strip we can reduce all this down a lot:
root@de443801b3fc:/# strip --strip-debug /usr/local/lib/R/site-library/*/libs/*so
root@de443801b3fc:/# du -csh /usr/local/lib/R/site-library/*/libs/*so
440K    /usr/local/lib/R/site-library/Rcpp/libs/Rcpp.so
220K    /usr/local/lib/R/site-library/bindrcpp/libs/bindrcpp.so
52K     /usr/local/lib/R/site-library/colorspace/libs/colorspace.so
56K     /usr/local/lib/R/site-library/curl/libs/curl.so
120K    /usr/local/lib/R/site-library/digest/libs/digest.so
2.5M    /usr/local/lib/R/site-library/dplyr/libs/dplyr.so
16K     /usr/local/lib/R/site-library/glue/libs/glue.so
404K    /usr/local/lib/R/site-library/haven/libs/haven.so
76K     /usr/local/lib/R/site-library/jsonlite/libs/jsonlite.so
20K     /usr/local/lib/R/site-library/lazyeval/libs/lazyeval.so
24K     /usr/local/lib/R/site-library/lubridate/libs/lubridate.so
8.0K    /usr/local/lib/R/site-library/mime/libs/mime.so
52K     /usr/local/lib/R/site-library/mnormt/libs/mnormt.so
84K     /usr/local/lib/R/site-library/openssl/libs/openssl.so
76K     /usr/local/lib/R/site-library/plyr/libs/plyr.so
32K     /usr/local/lib/R/site-library/purrr/libs/purrr.so
648K    /usr/local/lib/R/site-library/readr/libs/readr.so
400K    /usr/local/lib/R/site-library/readxl/libs/readxl.so
128K    /usr/local/lib/R/site-library/reshape2/libs/reshape2.so
56K     /usr/local/lib/R/site-library/rlang/libs/rlang.so
100K    /usr/local/lib/R/site-library/scales/libs/scales.so
496K    /usr/local/lib/R/site-library/stringi/libs/stringi.so
124K    /usr/local/lib/R/site-library/tibble/libs/tibble.so
164K    /usr/local/lib/R/site-library/tidyr/libs/tidyr.so
104K    /usr/local/lib/R/site-library/tidyselect/libs/tidyselect.so
344K    /usr/local/lib/R/site-library/xml2/libs/xml2.so
6.6M    total
root@de443801b3fc:/#
Down to six point six megabytes. Not bad for one command. The chart visualizes the respective reductions. Clearly, C++ packages (and their template use) lead to more debugging symbols than plain old C code. But once stripped, the size differences are not that large.
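To put a number on the reduction, a small sketch that turns two sizes into a percentage saved. The helper name pct_saved is made up for this illustration. The inputs are the rounded figures du reported above, so the results are approximate.

```shell
# Percentage saved when going from a "before" size to an "after" size.
# Hypothetical helper for illustration; both arguments must use the same
# unit (here megabytes, as reported by du -h).
pct_saved() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (1 - after / before) * 100 }'
}

pct_saved 78 6.6    # totals from the two du runs: prints 91.5
pct_saved 33 2.5    # dplyr.so alone:              prints 92.4
```

So roughly nine tenths of the space taken up by these shared libraries was debugging information.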
And just to be plain, what we showed previously in post #9 does the same, only already at installation stage. The effects are not cumulative.