Simple example: How to use the foreach and doSNOW packages for parallel computation.

February 6, 2011

(This article was first published on My Life as a Mock Quant in English, and kindly contributed to R-bloggers)

update

************************************************************************************************
I have verified that this example runs correctly on Windows XP (32-bit) only.
************************************************************************************************



In R, the developers at Revolution R provide the foreach and doSNOW packages, which let us run computations in parallel. To begin, install both packages from CRAN.
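A minimal sketch of the installation step, assuming both packages are fetched from CRAN. The variable names pkgs and missing_pkgs and the mirror URL are illustrative choices, not part of the original post.

```r
# Packages used in this article
pkgs <- c("foreach", "doSNOW")

# Find which of them are not installed yet
# (requireNamespace() returns FALSE when a package is unavailable)
is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!is_installed]

# Install only what is missing; the mirror is an assumption, any CRAN mirror works
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
```

Installing doSNOW should also pull in the snow package it depends on.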


With the foreach package, you can write loops that run either sequentially or in parallel. The sequential form uses %do%, as in the following examples.
library(foreach)
# By default, the results are returned as a list
foreach(i = 1:3) %do% {sqrt(i)}
# With .combine = "c", the results are returned as a vector
foreach(i = 1:3, .combine = "c") %do% {sqrt(i)}
# If each result is a vector, .combine = "cbind" binds them into a matrix
foreach(i = 1:3, .combine = "cbind") %do% {letters[1:4]}
# You can also pass your own function as the .combine option.
# MyFunc returns the same result as .combine = "c"
MyFunc <- function(x, y) c(x, y)
foreach(i = 1:3, .combine = "MyFunc") %do% {
  sqrt(i)
}
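To see how the .combine option changes the shape of the result, the values can be stored and inspected. This is a small sketch in which the variable names (as_list, as_vector, as_matrix, total) are illustrative. The "+" case is an addition to the examples above and sums the results instead of collecting them.

```r
library(foreach)

# Default: a list with one element per iteration
as_list <- foreach(i = 1:3) %do% sqrt(i)

# .combine = "c": a numeric vector of length 3
as_vector <- foreach(i = 1:3, .combine = "c") %do% sqrt(i)

# .combine = "cbind": each iteration becomes one column of a 4 x 3 matrix
as_matrix <- foreach(i = 1:3, .combine = "cbind") %do% letters[1:4]

# .combine = "+": the results are added together (1 + 2 + 3)
total <- foreach(i = 1:3, .combine = "+") %do% i

# Inspect the structure of each result
str(as_list)
str(as_vector)
dim(as_matrix)
total
```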


Next, we create a cluster with the doSNOW package so that foreach can run in parallel.
Because I have a dual-core machine, I specify two as the number of workers.
> library(doSNOW)
> getDoParWorkers()
[1] 1
> getDoParName()
NULL
> registerDoSNOW(makeCluster(2, type = "SOCK"))
> getDoParWorkers()
[1] 2
> getDoParName()
[1] "doSNOW"
> getDoParVersion()
[1] "1.0.3"
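The console session above leaves the worker processes running. A minimal sketch of the full lifecycle, assuming the same two-worker socket cluster, keeps the cluster object in a variable so that it can be shut down afterwards. The names cl and n_workers are illustrative, and the calls to stopCluster() and registerDoSEQ() are additions that the original post does not show.

```r
library(doSNOW)  # also attaches foreach and snow

# Start two worker processes and keep a handle to the cluster
cl <- makeCluster(2, type = "SOCK")

# Tell foreach to send %dopar% loops to this cluster
registerDoSNOW(cl)
n_workers <- getDoParWorkers()
n_workers

# ... run %dopar% loops here ...

# Shut the workers down, then fall back to the sequential backend
# so that later %dopar% calls do not try to use the stopped cluster
stopCluster(cl)
registerDoSEQ()
```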


Now we are ready to compute in parallel, and the foreach package makes this easy: you only have to change "%do%" to "%dopar%". I compared the performance of the parallel version with the sequential version as follows.
> N <- 10^4
> system.time(foreach(i = 1:N,.combine = "cbind") %do% {
+ sum(rnorm(N))
+ })
   user  system elapsed
  57.52    0.48   59.60
> system.time(foreach(i = 1:N,.combine = "cbind") %dopar% {
+ sum(rnorm(N))
+ })
   user  system elapsed
  18.61    0.58   37.74

As the elapsed times show, the parallel run took 37.74 seconds against 59.60 seconds for the sequential run, so it finished about 1.6 times faster on two workers.
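The comparison can be reproduced with a short script. This is a sketch under a few assumptions: N is reduced to 1000 so that it finishes quickly, .combine = "c" is used instead of "cbind", and the names seq_res, par_res and elapsed are illustrative. The timings depend on the machine, so no expected output is shown.

```r
library(doSNOW)

# Two workers, as on the dual-core machine used in this article
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

N <- 1000  # smaller than the 10^4 used above, to keep the run short

# Sequential run: %do%
seq_time <- system.time(
  seq_res <- foreach(i = 1:N, .combine = "c") %do% sum(rnorm(N))
)

# Parallel run: the same loop with %dopar%
par_time <- system.time(
  par_res <- foreach(i = 1:N, .combine = "c") %dopar% sum(rnorm(N))
)

# Release the workers and restore the sequential backend
stopCluster(cl)
registerDoSEQ()

# Compare wall-clock (elapsed) times in seconds
elapsed <- c(sequential = seq_time[["elapsed"]],
             parallel   = par_time[["elapsed"]])
print(elapsed)
```

Elapsed time is the figure to compare here, since it measures how long you actually wait. With tasks as small as these, the cost of sending work to the workers can offset part of the gain, so the speedup may be well below the number of workers.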


References (including PDFs)
- http://cran.r-project.org/web/packages/foreach/foreach.pdf
- http://cran.r-project.org/web/packages/foreach/vignettes/foreach.pdf
- http://cran.r-project.org/web/packages/foreach/vignettes/nested.pdf
