#9: Compacting your Shared Libraries

[This article was first published on Thinking inside the box , and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Welcome to the nineth post in the recognisably rancid R randomness series, or R4 for short. Following on the heels of last week’s post, we aim to look into the shared libraries created by R.

We love the R build process. It is robust, cross-platform, reliable and rather predicatable. It. Just. Works.

One minor issue, though, which has come up once or twice in the past is the (in)ability to fully control all compilation options. R will always recall CFLAGS, CXXFLAGS, … etc as used when it was compiled. Which often entails the -g flag for debugging which can seriously inflate the size of the generated object code. And once stored in ${RHOME}/etc/Makeconf we cannot on the fly override these values.

But there is always a way. Sometimes even two.

The first is local and can be used via the (personal) ~/.R/Makevars file (about which I will have to say more in another post). But something I have been using quite a bite lately uses the flags for the shared library linker. Given that we can have different code flavours and compilation choices—between C, Fortran and the different C++ standards—one can end up with a few lines. I currently use this which uses -Wl, to pass an the -S (or --strip-debug) option to the linker (and also reiterates the desire for a shared library, presumably superfluous):

SHLIB_CXXLDFLAGS = -Wl,-S -shared
SHLIB_CXX11LDFLAGS = -Wl,-S -shared
SHLIB_CXX14LDFLAGS = -Wl,-S -shared
SHLIB_FCLDFLAGS = -Wl,-S -shared
SHLIB_LDFLAGS = -Wl,-S -shared

Let’s consider an example: my most recently uploaded package RProtoBuf. Built under a standard 64-bit Linux setup (Ubuntu 17.04, g++ 6.3) and not using the above, we end up with library containing 12 megabytes (!!) of object code:

edd@brad:~/git/rprotobuf(feature/fewer_warnings)$ ls -lh src/RProtoBuf.so
-rwxr-xr-x 1 edd edd 12M Aug 14 20:22 src/RProtoBuf.so
edd@brad:~/git/rprotobuf(feature/fewer_warnings)$ 

However, if we use the flags shown above in .R/Makevars, we end up with much less:

edd@brad:~/git/rprotobuf(feature/fewer_warnings)$ ls -lh src/RProtoBuf.so 
-rwxr-xr-x 1 edd edd 626K Aug 14 20:29 src/RProtoBuf.so
edd@brad:~/git/rprotobuf(feature/fewer_warnings)$ 

So we reduced the size from 12mb to 0.6mb, an 18-fold decrease. And the file tool still shows the file as ‘not stripped’ as it still contains the symbols. Only debugging information was removed.

What reduction in size can one expect, generally speaking? I have seen substantial reductions for C++ code, particularly when using tenmplated code. More old-fashioned C code will be less affected. It seems a little difficult to tell—but this method is my new build default as I continually find rather substantial reductions in size (as I tend to work mostly with C++-based packages).

The second option only occured to me this evening, and complements the first which is after all only applicable locally via the ~/.R/Makevars file. What if we wanted it affect each installation of a package? The following addition to its src/Makevars should do:

strippedLib: $(SHLIB)
        if test -e "/usr/bin/strip"; then /usr/bin/strip --strip-debug $(SHLIB); fi

.phony: strippedLib

We declare a new Makefile target strippedLib. But making it dependent on $(SHLIB), we ensure the standard target of this Makefile is built. And by making the target .phony we ensure it will always be executed. And it simply tests for the strip tool, and invokes it on the library after it has been built. Needless to say we get the same reduction is size. And this scheme may even pass muster with CRAN, but I have not yet tried.

Lastly, and acknowledgement. Everything in this post has benefited from discussion with my former colleague Dan Dillon who went as far as setting up tooling in his r-stripper repository. What we have here may be simpler, but it would not have happened with what Dan had put together earlier.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

To leave a comment for the author, please follow the link and comment on their blog: Thinking inside the box .

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)