How to learn C++ as an R user?
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Because of delays with my scholarship payment, if this post is useful to you I kindly ask for a minimal donation on Buy Me a Coffee that shall be used to continue my Open Source efforts. If you need an R package or Shiny dashboard for your team, you can email me or inquire on Fiverr. The full explanation is here: A Personal Message from an Open Source Contributor
You can send me questions for the blog using this form.
I got this question from a reader: How to learn C++ as an R user?
I am not a teacher, so I will try to respond based on my own learning process. Back in 2023, I was trying to speed up code I wrote to conduct a General Equilibrium Pseudo Poisson Maximum Likelihood (GEPPML) simulation. After checking each part of the code, which consisted of multiple linear algebra operations, and implementing changes to avoid inefficiencies such as rewriting objects, using for loops instead of mapply(), using a double for loop instead of the outer() function, and similar, it was clear that C++ could be a better option in terms of speed.
Having used RcppArmadillo previously, I asked a reasonable question on Stack Overflow, only to receive negative comments such as “you are dumb” and others along the same lines. I then started reading about a new R package called cpp11, which was created as an alternative to Rcpp. Once again, I went back to Stack Overflow, where another user wrote “go cry somewhere else” in response to my question.
After looking for online resources that could help me with my questions, I started looking for online tutors and I could not find a person with experience in C++ and a reasonable grasp of linear algebra. My solution was to enroll in ECE244 (Programming Fundamentals) and dedicate 1-2 hours daily to transcribe my class notes and think about how to use what I was learning there for my particular problem. It was great to take ECE244 with Professor Salma Emara, and I am glad to have attended her lectures that covered C++ fundamentals.
Some people argue that you must learn C before C++. In my case, I know very little about C and I never took a formal course about C before learning C++. Prof. Emara has two books, one about C and another about C++ that is similar to the ECE244 contents. I would say that C is to C++ what S-Plus is to R: knowing S-Plus can help a bit to understand R syntax, but there are multiple differences.
As I mentioned before, I needed to solve linear algebra problems, but the starting point is to get a working C++ compiler. I’ve heard that can be tricky on Windows, while on Linux and Mac a compiler is usually available or quick to install (e.g., the g++ command in the terminal). I organized the steps that worked for me, from compiling a simple program to using Armadillo, a dedicated linear algebra library for C++, in the free ebook C++ for R Users. What I found useful was to redo in C++ things I had already written in R, and so I organized the R code from Hansen’s Econometrics textbook using Armadillo with R integration in the hansen package.
I think it is crucial to follow a gradual approach, starting with simple functions such as:
#include <iostream>

int main() {
    int a = 1;
    std::cout << a + 1 << std::endl;
    return 0;
}
which can be compiled with g++ plus1.cpp -o test (or other variations such as g++ in.cpp -o out) and then see the result with ./test (or ./out).
The same in R would be:
main <- function() {
  a <- 1
  print(a + 1)
}
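Not including iostream would result in a compiler error because std::cout would be undeclared. In contrast, omitting return 0; from main() is not an error, since C++ makes main() return 0 implicitly, but writing it makes the exit status explicit. This is similar to returning “TRUE” in R even when it is not required.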
A literal translation of the C++ code would be:
main <- function() {
  a <- 1
  print(a + 1)
  return(TRUE)
}
In R we can also do this:
main = function() {
  a = 1L
  print(a + 1)
}
However, the literal 1 in C++ corresponds to 1L in R. In C++, a number written without a decimal point is an integer (int) and 1.0 is a double, while in R the default is a double unless the L suffix is used:
> x = 1
> y = 1L
> z = 1.0
> class(x)
[1] "numeric"
> class(y)
[1] "integer"
> class(z)
[1] "numeric"
In C++ we can do something like:
int a = 1;
double b = 2.0;
double c = a + b;
which would be different in terms of storage if we use int c = a + b; because the double result of a + b would be converted to an integer, discarding any fractional part.
C++ is a language that enforces proper typing: every variable has a declared type that the compiler checks, and type mismatches or a missing ; would result in warnings or compiler errors. The only way to avoid these is to practise and be consistent.
What helped me to learn C++ was to:
- Write down in words what I need to do
- Comment my code
- Not think about efficiency first but about correctness
Once I got correct results I started thinking about things such as:
- Does my code use memory efficiently?
- Am I overwriting objects?
- Am I duplicating data?
I consider that C++ is a beautiful and useful language. It can be challenging at first but I remember when I learned R back in 2015 and everything was challenging.
Once you get simple examples to run, you can (and should) explore pointers, pass by value, pass by reference, and other really interesting C++ features that allow us to be really efficient in terms of memory usage if we need to. These concepts can be challenging because neither R nor Python exposes them directly. The key is:
Once you can confidently write C++ code you should start thinking about R and C++ integration using the cpp11 package or cpp11armadillo. If you have experience creating R packages, perhaps you can start with R and C++ integration with simple examples.
I once struggled to build a program as simple as the example I showed in this post, but then I went crying somewhere else, as the Stack Overflow user said: I asked Prof. Emara lots of questions, searched YouTube videos, and started thinking about how to rewrite my own R code using C++.
Since then, I published four scientific articles with a strong software component where I used C++ to solve a problem:
Vargas Sepúlveda, Mauricio. 2025. “Capybara: Efficient Estimation of Generalized Linear Models with High-Dimensional Fixed Effects.” PLOS ONE Publicly available on 2025-08-25. https://doi.org/10.1371/journal.pone.0331178.
Vargas Sepúlveda, Mauricio. 2025. “Kendallknight: An R Package for Efficient Implementation of Kendall’s Correlation Coefficient Computation.” PLOS ONE 20 (6): e0326090. https://doi.org/10.1371/journal.pone.0326090.
Vargas Sepúlveda, Mauricio and Schneider Malamud, Jonathan. 2025. “cpp11armadillo: An R package to use the Armadillo C++ library.” SoftwareX 30 (May): 102087. https://doi.org/10.1016/j.softx.2025.102087.
Vargas Sepúlveda, Mauricio and Barkai, Lital. 2025. “The REDATAM format and its challenges for data access and information creation in public policy.” Data & Policy 7 (January): e18. https://dx.doi.org/10.1017/dap.2025.4.
I hope this is useful 🙂