R is not C

December 7, 2011

(This article was first published on Nathan VanHoudnos » rstats, and kindly contributed to R-bloggers)

I keep trying to write R code like it was C code. It is a habit I’m trying to break myself of.

For example, the other day I need to construct a model matrix of 1′s and 0′s in the standard, counting in binary, pattern. My solution was:

n <- 8
powers <- 2^(0:(n-1))
NN <- (max(powers)*2)
designMatrix <- matrix( NA, nrow=NN, ncol=n)
for( ii in 0:(NN-1) ) {
     leftOver <- ii
     for ( jj in 1:n ) {
          largest <- rev(powers)[jj]
          if ( leftOver != 0 && largest <= leftOver ) {
               designMatrix[ii+1,jj] <- 1	
               leftOver <- leftOver - largest
          } else {
               designMatrix[ii+1,jj] <- 0

The code works, but it is a low-level re-implementation of something that already exists in base R. R is not C, because base R has pieces that implement statistical ideas for you. Consider:

expand.grid                package:base                R Documentation

Create a Data Frame from All Combinations of Factors


     Create a data frame from all combinations of the supplied vectors
     or factors.  See the description of the return value for precise
     details of the way this is done.

So then instead of writing (and debugging!) a function to make a binary model matrix, I could have simply used a one-liner:

# Note that c(0,1) is encased in list() so that
# rep(..., n) will repeat the object c(0,1) n 
# times instead of its default behavior of 
# concatenating the c(0,1) objects. 
designMatrix_R <- as.matrix( expand.grid( rep( list(c(0,1) ), n) ) )

I like it. It is both shorter and easier to debug. Now I just need to figure out how to find these base R functions before I throw up my hands and re-implement them in C.

To leave a comment for the author, please follow the link and comment on his blog: Nathan VanHoudnos » rstats.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...

If you got this far, why not subscribe for updates from the site? Choose your flavor: e-mail, twitter, RSS, or facebook...

Comments are closed.