Simple example: How to use the foreach and doSNOW packages for parallel computation.

[This article was first published on My Life as a Mock Quant in English, and kindly contributed to R-bloggers.]

update

************************************************************************************************
I have only verified that this example runs correctly on Windows XP (32-bit).
************************************************************************************************



In R, the team at Revolution R provides the foreach and doSNOW packages, which allow us to compute things in parallel. Let's start by installing them.
install.packages("foreach")
install.packages("doSNOW")


With the foreach package, you can write code that runs not only in parallel but also sequentially. The sequential form uses %do%, as follows.
library(foreach)
# By default, the results are returned as a list
foreach(i = 1:3) %do% {sqrt(i)}
# With .combine = "c", the results are returned as a vector
foreach(i = 1:3, .combine = "c") %do% {sqrt(i)}
# If each result is a vector, .combine = "cbind" returns them as the columns of a matrix
foreach(i = 1:3, .combine = "cbind") %do% {letters[1:4]}
# You can also use a function you defined yourself as the .combine option.
# MyFunc returns the same result as specifying .combine = "c"
MyFunc <- function(x, y) c(x, y)
foreach(i = 1:3, .combine = "MyFunc") %do% {
  sqrt(i)
}
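
As a further illustration of the .combine option (not part of the original examples), here is a minimal sketch that folds the results with "+" and stacks them with "rbind". The variable names total and tab are my own choices for illustration.

```r
library(foreach)

# "+" folds the per-iteration results into a single number
total <- foreach(i = 1:3, .combine = "+") %do% {
  i^2
}
total  # 1 + 4 + 9 = 14

# "rbind" stacks each length-2 result as one row of a matrix
tab <- foreach(i = 1:3, .combine = "rbind") %do% {
  c(i, i^2)
}
dim(tab)  # 3 rows, 2 columns
```

Any function that takes two results and returns their combination should work in the same way as MyFunc above.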


Next, we create a cluster with the doSNOW package so that foreach can run in parallel.
Because I have a dual-core machine, I specify two as the number of workers in the cluster.
> library(doSNOW)
> getDoParWorkers()
[1] 1
> getDoParName()
NULL
> registerDoSNOW(makeCluster(2, type = "SOCK"))
> getDoParWorkers()
[1] 2
> getDoParName()
[1] "doSNOW"
> getDoParVersion()
[1] "1.0.3"
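
The transcript above creates the cluster inline, so there is no handle left to shut it down later. A minimal sketch of one way to handle this, assuming you are happy to keep the cluster in a variable (the name cl is my own choice), is to call stopCluster() from the snow package when you are done and then switch foreach back to its sequential backend with registerDoSEQ():

```r
library(doSNOW)  # also attaches foreach and snow

# Keep a handle to the cluster so it can be stopped later
cl <- makeCluster(2, type = "SOCK")  # two workers, as in the article
registerDoSNOW(cl)

res <- foreach(i = 1:4, .combine = "c") %dopar% {
  sqrt(i)
}

stopCluster(cl)  # shut down the worker processes
registerDoSEQ()  # fall back to the sequential backend
```

Without the call to stopCluster(), the worker R processes may keep running until the main R session ends.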


Now we are ready to compute things in parallel. The foreach package makes this easy: you only have to change "%do%" into "%dopar%". I compared the performance of parallel computation with that of sequential computation as follows.
> N <- 10^4
> system.time(foreach(i = 1:N,.combine = "cbind") %do% {
+   sum(rnorm(N))
+ })
   user  system elapsed
  57.52    0.48   59.60
> system.time(foreach(i = 1:N,.combine = "cbind") %dopar% {
+   sum(rnorm(N))
+ })
   user  system elapsed
  18.61    0.58   37.74

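As you can see, the parallel computation is clearly faster: it finished in 37.74 seconds of elapsed time (the third column), compared with 59.60 seconds for the sequential computation, which is about 1.6 times as fast.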
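
The speedup with two workers falls short of a full factor of two. One possible reason, which I have not measured here, is that each of the 10^4 tasks is very small, so the cost of sending tasks to the workers may be significant. The following sketch, which is not from the original benchmark, groups the iterations into one chunk per worker so that far fewer tasks are dispatched. The names cl, chunks, idx and out are my own, and N is reduced so that the sketch runs quickly.

```r
library(doSNOW)

cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

N <- 1000  # smaller than the article's 10^4 so the sketch runs quickly
workers <- getDoParWorkers()

# Split the indices 1..N into one chunk per worker
chunks <- split(seq_len(N), rep(seq_len(workers), length.out = N))

# Each task now handles a whole chunk instead of a single iteration
res <- foreach(idx = chunks, .combine = "c") %dopar% {
  out <- numeric(length(idx))
  for (k in seq_along(idx)) {
    out[k] <- sum(rnorm(N))
  }
  out
}

stopCluster(cl)
length(res)  # N values, one per original iteration
```

Whether this actually helps depends on the machine and the workload, so it is worth timing both versions with system.time() before relying on it.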


References (including PDFs)
- http://cran.r-project.org/web/packages/foreach/foreach.pdf
- http://cran.r-project.org/web/packages/foreach/vignettes/foreach.pdf
- http://cran.r-project.org/web/packages/foreach/vignettes/nested.pdf
