Comparing performance in R, foreach/doSNOW, SAS, and NumPy (MKL)

June 17, 2012

(This article was first published on Adventures in Statistical Computing, and kindly contributed to R-bloggers)

This is a follow-up to my previous post.  There is a quicker way to compute the function I created (a basic cumulative sum) in R.

Instead of:

f = function(x) {
   sum = 0
   # add 1, 2, ..., x one term at a time
   for (i in seq(1,x)) sum = sum + i
   sum
}

Use this:

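f2 = function(x){ sum(seq(x)) }  # body assumed (vectorized sum of 1..x); the source listing is cut off after the brace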

If I time it, we see:

system.time( (out = apply(as.array(seq(10000)),1,f2)))
   user  system elapsed
   0.35    0.05    0.39
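For readers who want to see the same idea outside R, here is a minimal Python sketch of the comparison: an explicit loop, the built-in vectorized sum, and the closed form n(n+1)/2. The function names and the problem size `N` are my own choices for illustration, not part of the original benchmark, and the timings it prints are not comparable to the R numbers above.

```python
import timeit

def csum_loop(n):
    """Sum 1..n with an explicit loop, like the first R function."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

def csum_builtin(n):
    """Sum 1..n with the built-in sum, the analogue of a vectorized sum."""
    return sum(range(1, n + 1))

def csum_closed(n):
    """Sum 1..n with the closed form n(n+1)/2 (no iteration at all)."""
    return n * (n + 1) // 2

if __name__ == "__main__":
    N = 2000  # hypothetical size, kept small so the loop version finishes quickly
    for fn in (csum_loop, csum_builtin, csum_closed):
        # time one full pass over 1..N, mirroring the apply() call
        elapsed = timeit.timeit(lambda: [fn(n) for n in range(1, N + 1)], number=1)
        print("%-13s %0.4f s" % (fn.__name__, elapsed))
```

All three functions return the same values; only the amount of interpreted work per call differs, which is the effect the R timing illustrates.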

Nice!  Spread that across 3 CPUs and we can bring it down a bit:

system.time( (out2 =  foreach(i=seq(0,9),.combine='c') %dopar% {
   user  system elapsed
   0.02    0.00    0.26
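The pattern in the foreach call (ten tasks indexed 0 to 9, results concatenated with c) can be sketched in Python as follows. This is an illustration only: the loop body is cut off in the listing above, so the assumption that each task handles a block of 1000 consecutive values is mine, as are the helper names `chunk_sums`, `combine`, and `parallel_sums`.

```python
from concurrent.futures import ProcessPoolExecutor

def csum(n):
    """Sum of 1..n."""
    return sum(range(1, n + 1))

def chunk_sums(i, chunk_size=1000):
    """Compute csum for the i-th block of values (assumed block size: 1000)."""
    start = i * chunk_size + 1
    return [csum(n) for n in range(start, start + chunk_size)]

def combine(chunks):
    """Concatenate per-task results in task order, like .combine='c'."""
    out = []
    for chunk in chunks:
        out.extend(chunk)
    return out

def parallel_sums(n_chunks=10, workers=3):
    """Spread the blocks over a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, so the blocks stay sorted
        return combine(pool.map(chunk_sums, range(n_chunks)))

if __name__ == "__main__":
    out2 = parallel_sums()
    print(len(out2), out2[:3], out2[-1])
```

As in the R run, starting workers and shipping results back has a fixed cost, so the gain on a problem this small is modest.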

Not too shabby.  How fast can we do this in SAS:

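proc fcmp outlib=work.fns.fns;
function csum(x);
  sum = 0;
  do i = 1 to x;
     sum = sum + i;
  end;
  return (sum);
endsub;
run;

/* make the compiled function visible to the DATA step */
options cmplib=work.fns;

data _null_;
   do i = 1 to 10000;
      x = csum(i);
   end;
run;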
NOTE: DATA statement used (Total process time):
      real time           0.24 seconds
      cpu time            0.25 seconds

SAS on a single CPU is just as fast as R on 3.  It’s not worth attempting to multi-thread this in SAS.  The overhead would be too much as SAS/CONNECT is made for bigger problems.

So what about NumPy in Python?  If we use the version compiled with MKL, we ought to be able to do the reduction very quickly, since MKL should use the SSE registers on the processor.  Further, we’ll use the “fromfunction” method, which lets us pass a function (a lambda or a named function) to the array creation routine.

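import numpy as np
import time
def f(x,y):
   # x holds the row indices 0..9999; shift them to 1..10000
   x = x +1
   # return statement assumed; the source listing is cut off after the line above
   return np.cumsum(x)
s = time.time()
y = np.fromfunction(f,(10000,1))
el = time.time() - s
print("%0.6f" % el)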
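One detail worth making explicit is that `np.fromfunction` does not call the function once per element. It calls it a single time with whole index arrays, so the work inside is vectorized. The sketch below shows this on a small shape. The `calls` list is a hypothetical helper for illustration, and the closed form n(n+1)/2 is my stand-in for the function body, which is cut off in the listing above.

```python
import numpy as np

calls = []  # records the shape of x on every invocation of f

def f(x, y):
    # x and y are full index arrays of the requested shape, not scalars
    calls.append(x.shape)
    n = x + 1                # indices 0..4 become 1..5
    return n * (n + 1) / 2   # closed-form sum of 1..n, applied elementwise

out = np.fromfunction(f, (5, 1))

print(calls)        # a single entry: f ran once, on a (5, 1) array
print(out.ravel())  # the values 1, 3, 6, 10, 15
```

Because the function sees the entire index grid at once, any NumPy operation inside it runs over the whole array in compiled code, which is where an MKL build can help.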


