# Quick and dirty parallel processing in R

R has some powerful tools for parallel processing, which I discovered while searching for ways to fully utilize my 8-core computer at work. What surprised me is how easy it is: about 6 lines of code, if that. Given that I wasn't allowed to install heavy-duty parallel-processing systems like MPICH on the computer, I found that the library **snow** fit the bill nicely through its use of sockets. I also discovered the libraries **foreach** and **iterators**, which were released to the community by the development team at Revolution R. Using these 3 libraries, I could easily parallelize a transformation of my dataset where the transformations happened independently within each unique ID. The following code did the trick:

```
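library(foreach)
library(doSNOW)  # also attaches snow and iterators

cl <- makeCluster(6, type = "SOCK")  # using 6 nodes
registerDoSNOW(cl)

uID <- unique(dat$ID)
# one task per unique ID; each task transforms that ID's rows
foreach(i = icount(length(uID))) %dopar% {
  transformData(dat[dat$ID == uID[i], ])
}

stopCluster(cl)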
```
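
To try the pattern end to end, here is a minimal sketch with toy data. The data frame, the 2-worker cluster, and the body of `transformData()` are made up for illustration (my real transformation was different), and `.combine = rbind` is one way to stack the per-ID results back into a single data frame rather than getting a list.

```r
library(foreach)
library(doSNOW)  # attaches snow and iterators as dependencies

# Toy data, invented for this sketch: 4 IDs with 5 rows each
dat <- data.frame(ID = rep(1:4, each = 5), x = 1:20)

# Hypothetical stand-in for transformData(): center x within one ID
transformData <- function(d) {
  d$x_centered <- d$x - mean(d$x)
  d
}

cl <- makeCluster(2, type = "SOCK")  # 2 local worker processes
registerDoSNOW(cl)

uID <- unique(dat$ID)
# .combine = rbind stacks the per-ID data frames into one result
result <- foreach(i = icount(length(uID)), .combine = rbind) %dopar% {
  transformData(dat[dat$ID == uID[i], ])
}

stopCluster(cl)
```

Since each ID is processed independently, the only thing that changes between the serial and parallel versions is `%do%` versus `%dopar%`, which makes it easy to debug serially first.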

Note that this is for a *single multiprocessor computer*. Doing this on a cluster may be more complicated, but this serves my purposes quite nicely. There are other choices for this, including the **multicore** library and others described in the CRAN Task View on high-performance and parallel computing.
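
As one example of those other choices: newer versions of R (2.14.0 and later) ship a **parallel** package that builds on snow and multicore, so the same split-by-ID idea can be sketched without installing anything extra. This is not what I used originally; the toy data and the `transformData()` body below are again hypothetical.

```r
library(parallel)  # included with R 2.14.0 and later

# Toy data, invented for this sketch: 4 IDs with 5 rows each
dat2 <- data.frame(ID = rep(1:4, each = 5), x = 1:20)

# Hypothetical per-ID transformation: center x within one ID
transformData <- function(d) {
  d$x_centered <- d$x - mean(d$x)
  d
}

cl <- makeCluster(2)             # 2 local workers over sockets
pieces <- split(dat2, dat2$ID)   # list with one data frame per ID
res_list <- parLapply(cl, pieces, transformData)
stopCluster(cl)

# Stack the per-ID pieces back into a single data frame
result2 <- do.call(rbind, res_list)
```

Here `split()` does the job of the `dat[dat$ID == uID[i], ]` subsetting, and `parLapply()` hands one piece at a time to the workers.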

**Update**: I found that this strategy did not work for R 2.11 Windows versions, since `snow` is not properly spawning processes. However, there is a library `doSMP` provided by *Revolution Analytics* which gets around this problem. So replacing `doSNOW` with `doSMP` should do the trick.

