Profiling Rcpp code on Unix/Mac is easy, but it is difficult on Windows because R uses a compilation toolchain (MinGW) that produces files that common Windows profiling programs do not understand. Additionally, the R build process often strips the symbols that allow profilers to produce sensible interpretations of their data. The following steps allow one to profile Rcpp code on Windows.
Change compilation settings to add in symbol settings

A default R installation typically has certain compiler settings, placed in the equivalent of
C:\Program Files\R\R-3.3.1\etc\x64\Makeconf
that strip information needed for profiling during the Rcpp compilation process, in particular a line which reads:
DLLFLAGS=-s
To override this and add some additionally needed flags, create a folder and file in your home directory that override and append the necessary compilation flags. To a file located at a location equivalent to
C:\Users\YOURNAME\.R\Makevars
on your machine (note the '.' before R), add the following lines:
CXXFLAGS+=-gdwarf-2
DLLFLAGS=
You can verify this worked correctly by checking that -gdwarf-2 appears in the compilation messages, and that -s is missing in the final linker step.
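Before recompiling, it can help to confirm that the Makevars file sits where R will look for it and contains both settings. The following is a minimal sketch in base R, not part of the original workflow: the helper name check_makevars is hypothetical, and the path built from "~" is an assumption, since R's idea of the home directory on Windows may differ from C:\Users\YOURNAME.

```r
# Hypothetical helper: report whether a Makevars file carries the
# debug-symbol flag and an emptied DLLFLAGS line.
check_makevars <- function(path) {
    if (!file.exists(path)) {
        return(list(exists = FALSE, has_dwarf = FALSE, clears_dllflags = FALSE))
    }
    lines <- trimws(readLines(path, warn = FALSE))
    lines <- lines[!grepl("^#", lines)]  # ignore comment lines
    list(
        exists = TRUE,
        # A CXXFLAGS assignment (= or +=) that includes -gdwarf-2
        has_dwarf = any(grepl("^CXXFLAGS[[:space:]]*\\+?=.*-gdwarf-2", lines)),
        # A DLLFLAGS assignment with nothing after the equals sign
        clears_dllflags = any(grepl("^DLLFLAGS[[:space:]]*=[[:space:]]*$", lines))
    )
}

# Assumption: the user Makevars lives under R's expansion of "~".
makevars_path <- file.path(path.expand("~"), ".R", "Makevars")
print(makevars_path)
str(check_makevars(makevars_path))
```

If exists comes back FALSE, printing path.expand("~") shows which directory R treats as home, which is where the .R folder would need to go. The compilation messages remain the authoritative check.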
Run a profiler which understands MinGW compiled code

The next key step is to run a profiler which can understand the Unix-like symbols on Windows. Two good, free options are Very Sleepy and AMD's CodeAnalyst (which also works on Intel chips). Very Sleepy is very good at basic timings and providing stack traces, while AMD's profiler is able to drill down to the assembly of a process. Both profilers are good, but an example with AMD is shown below.
- Open the program and set up a quick session to start and run a sample R script that uses your code, such as in the example shown below.
- Next, run the profiler and get ready to look at results. For example, here I can see that half the time was spent in my code, versus half in the R core's code (generating random numbers). Digging further down, I can see at the assembly level what the biggest bottlenecks were in my code.
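For readers without a script at hand, here is a minimal sketch of a workload that a profiling session could run. It is not the author's original script: the function name sumSquaredNormals and the iteration counts are illustrative assumptions, chosen so that time is split between the R core's random number generation and a simple loop in user code.

```r
# Hypothetical workload for a profiling session; names are illustrative.
library(Rcpp)

# Compile a small C++ function. verbose = TRUE shows the compiler and
# linker commands, where -gdwarf-2 should appear and -s should not.
cppFunction('
double sumSquaredNormals(int n) {
    // Draw n standard normal deviates through the R core RNG,
    // then accumulate their squares in our own loop.
    Rcpp::NumericVector x = Rcpp::rnorm(n);
    double total = 0.0;
    for (int i = 0; i < n; ++i) {
        total += x[i] * x[i];
    }
    return total;
}
', verbose = TRUE)

set.seed(42)
results <- numeric(100)
for (i in seq_along(results)) {
    # Repeat the call so the profiler has enough samples to work with
    results[i] <- sumSquaredNormals(1e6)
}

# The mean of a squared standard normal is 1, so this should be near 1
print(mean(results) / 1e6)
```

Saved to a file, a script like this can be given to an R executable as the profiler's launch target. The exact session setup depends on the profiler, and the loop count may need raising if the run finishes too quickly to collect useful samples.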