
It all started with a simple question from Scott Chamberlain on Twitter.

The goal was to create a matrix with randomly selected binary elements, and a predetermined number of rows and columns, that looks something like this:

     [,1] [,2] [,3] [,4]
[1,]    0    1    1    0
[2,]    0    0    0    1
[3,]    1    0    1    1


Many suggestions followed (including one from me). Several different ways of creating the random binary values were suggested:

• Use the runif function to create random numbers between 0 and 1, and round to the nearest whole number.
• Use ifelse on the output of runif, and assign 0 if it's below 0.5, and 1 otherwise.
• Use the rbinom function to sample from a binomial distribution with a size of 1 and probability 0.5.
• Use the sample function with the replace=TRUE option to simulate selections of 0 and 1.
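
As a rough sketch of what those four approaches might look like in practice (these are not the exact snippets posted on Twitter; the variable names, the seed and the value of n are my own choices for illustration):

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
n <- 12       # number of binary values to draw (arbitrary choice)

# 1. Round uniform draws on [0, 1] to the nearest whole number
x_round <- round(runif(n))

# 2. Threshold uniform draws with ifelse: 0 below 0.5, 1 otherwise
x_ifelse <- ifelse(runif(n) < 0.5, 0, 1)

# 3. Binomial draws with a size of 1 and probability 0.5
x_rbinom <- rbinom(n, size = 1, prob = 0.5)

# 4. Sample from the values 0 and 1 with replacement
x_sample <- sample(0:1, n, replace = TRUE)
```

Each call returns a vector of n zeros and ones. Note that sample returns integers here, while round and ifelse return doubles, which may matter if you care about storage type.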

Different ways of generating the matrix were also suggested:

• Use a for loop to fill each element of the matrix individually.
• Generate random numbers row by row, and fill the matrix using apply.
• Generate all the random numbers at once, and use the “matrix” function to create the matrix directly.
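
Here is a minimal sketch of the three matrix-building strategies, assuming a 3 x 4 matrix like the example above and using sample as the random generator throughout (my choice for illustration; any of the generators listed earlier would do, and the names n_row, n_col, m_loop, m_apply and m_direct are hypothetical):

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible
n_row <- 3
n_col <- 4

# 1. Fill each element individually with a for loop
m_loop <- matrix(0, nrow = n_row, ncol = n_col)
for (i in seq_len(n_row)) {
  for (j in seq_len(n_col)) {
    m_loop[i, j] <- sample(0:1, 1)
  }
}

# 2. Generate the values row by row with apply.
#    apply() over rows returns each row's result as a column,
#    so the result is transposed back to n_row x n_col.
template <- matrix(0, nrow = n_row, ncol = n_col)
m_apply <- t(apply(template, 1, function(row) {
  sample(0:1, length(row), replace = TRUE)
}))

# 3. Generate all values at once and hand them to matrix()
m_direct <- matrix(sample(0:1, n_row * n_col, replace = TRUE),
                   nrow = n_row, ncol = n_col)
```

All three produce a matrix of the same shape. The third does it with a single call to the random number generator, which is why it tends to come out ahead.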

Luis Apiolaza reviews the suggested methods. Each has its merits: in clarity of code, in elegance, and especially in performance. On that front, Dirk Eddelbuettel benchmarked several of the solutions, including translating the code into C++ using Rcpp. One surprising outcome: translating the problem into C++ is only somewhat faster than using one call to sample. As Dirk says, this shows that “well-written R code can be competitive” with machine code.
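
If you want a rough feel for the performance gap yourself, something like the following would do. This is not Dirk's benchmark, just a minimal sketch using base R's system.time; the function names fill_by_loop and fill_at_once and the 200 x 200 size are my own assumptions, and the timings will vary from machine to machine.

```r
n_row <- 200  # arbitrary size, large enough for the loop to be noticeable
n_col <- 200

# Element-by-element fill with a for loop
fill_by_loop <- function(n_row, n_col) {
  m <- matrix(0, nrow = n_row, ncol = n_col)
  for (i in seq_len(n_row)) {
    for (j in seq_len(n_col)) {
      m[i, j] <- sample(0:1, 1)
    }
  }
  m
}

# One call to sample, passed straight to matrix()
fill_at_once <- function(n_row, n_col) {
  matrix(sample(0:1, n_row * n_col, replace = TRUE),
         nrow = n_row, ncol = n_col)
}

t_loop <- system.time(m1 <- fill_by_loop(n_row, n_col))
t_once <- system.time(m2 <- fill_at_once(n_row, n_col))

# Elapsed wall-clock seconds for each approach
print(c(loop = t_loop[["elapsed"]], at_once = t_once[["elapsed"]]))
```

A single system.time call is a crude measure; for anything more serious you would want repeated runs, as in the benchmarks linked here.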

Thinking inside the Box: Faster creation of binomial matrices