# Simple example: How to use the foreach and doSNOW packages for parallel computation

February 6, 2011

(This article was first published on My Life as a Mock Quant in English, and kindly contributed to R-bloggers)

update

************************************************************************************************
I have only checked that this example runs correctly on Windows XP (32-bit).
************************************************************************************************

For the R language, the team behind Revolution R provides the foreach and doSNOW packages, which let us run computations in parallel. We start by installing these packages.

```r
install.packages("foreach")
install.packages("doSNOW")
```

With the foreach package, you can write loops that run either sequentially or in parallel. Some sequential examples follow.

```r
library(foreach)

# By default, the result is returned as a list
foreach(i = 1:3) %do% {
  sqrt(i)
}

# With the .combine = "c" option, the result is returned as a vector
foreach(i = 1:3, .combine = "c") %do% {
  sqrt(i)
}

# If each iteration returns a vector, .combine = "cbind" gives a matrix
foreach(i = 1:3, .combine = "cbind") %do% {
  letters[1:4]
}

# A function you define yourself can also be used as the .combine option.
# MyFunc returns the same result as specifying .combine = "c".
MyFunc <- function(x, y) c(x, y)
foreach(i = 1:3, .combine = "MyFunc") %do% {
  sqrt(i)
}
```
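As a further sketch of the `.combine` option shown above (these two cases are not in the original post), the combining function can also be an operator given by name, or `rbind` to stack results as rows. The variable names `total` and `tab` are illustrative only.

```r
library(foreach)

# .combine = "+" adds the per-iteration results into a single total
total <- foreach(i = 1:4, .combine = "+") %do% {
  i^2
}
print(total)  # 1 + 4 + 9 + 16

# .combine = "rbind" stacks each returned vector as one row of a matrix
tab <- foreach(i = 1:3, .combine = "rbind") %do% {
  c(n = i, root = sqrt(i))
}
print(dim(tab))  # 3 rows, 2 columns
```

Choosing a combining function up front saves a separate post-processing step on the list that `foreach` would otherwise return.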

Next, we create a cluster with the doSNOW package so that we can compute in parallel.
Because I have a dual-core machine, I specify two as the number of workers in the cluster.

```
> library(doSNOW)
> getDoParWorkers()
[1] 1
> getDoParName()
NULL
> registerDoSNOW(makeCluster(2, type = "SOCK"))
> getDoParWorkers()
[1] 2
> getDoParName()
[1] "doSNOW"
> getDoParVersion()
[1] "1.0.3"
```
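The session above hard-codes two workers. As a hedged alternative, the count could be taken from the machine itself. This minimal sketch assumes the `parallel` package that ships with current versions of R (it is not used in the original post), and `n_cores` and `n_workers` are names introduced here for illustration.

```r
# detectCores() reports the number of logical cores, or NA if it is unknown
n_cores <- parallel::detectCores()

# Fall back to a single worker when the core count cannot be determined
n_workers <- if (is.na(n_cores)) 1L else n_cores
print(n_workers)

# The value could then be passed on, e.g. makeCluster(n_workers, type = "SOCK")
```

On a shared machine you may prefer fewer workers than cores, so treat the detected value as an upper bound and not a recommendation.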

Now we are ready to compute in parallel. With the foreach package this is easy: you only have to change “%do%” to “%dopar%”. I compared the performance of parallel and sequential computation as follows.

```
> N <- 10^4
> system.time(foreach(i = 1:N, .combine = "cbind") %do% {
+   sum(rnorm(N))
+ })
   user  system elapsed
  57.52    0.48   59.60
> system.time(foreach(i = 1:N, .combine = "cbind") %dopar% {
+   sum(rnorm(N))
+ })
   user  system elapsed
  18.61    0.58   37.74
```

As you can see, the parallel computation finished in 37.74 seconds of elapsed time against 59.60 seconds for the sequential one, which makes it about 1.6 times as fast!
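One step the session above leaves out is shutting the workers down afterwards. A minimal sketch, assuming the same two-worker socket cluster: keeping the cluster object in a variable (`cl` is a name introduced here, not in the original) lets you release the workers with snow's `stopCluster()` when you are finished.

```r
library(doSNOW)  # also attaches foreach and snow

# Keep a handle to the cluster so that it can be stopped later
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# Run a small parallel loop on the registered workers
res <- foreach(i = 1:4, .combine = "c") %dopar% {
  sqrt(i)
}
print(res)

# Shut down the worker processes and fall back to the sequential backend
stopCluster(cl)
registerDoSEQ()
```

Calling `registerDoSEQ()` afterwards is optional. It simply makes any later `%dopar%` loop run sequentially instead of trying to reach workers that no longer exist.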

References (PDF)
http://cran.r-project.org/web/packages/foreach/foreach.pdf
http://cran.r-project.org/web/packages/foreach/vignettes/foreach.pdf
http://cran.r-project.org/web/packages/foreach/vignettes/nested.pdf
