The Many-Faced Future

[This article was first published on jottR, and kindly contributed to R-bloggers.]

The future package defines the Future API, which is a unified, generic, friendly API for parallel processing. The Future API follows the principle of write code once and run anywhere – the developer chooses what to parallelize and the user how and where.
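As a minimal sketch of that division of labor (the helper names slow_square() and compute() are made up for this illustration), the developer wraps the work in future() and collects it with value(), while the user switches backends with plan() without touching the function:

```r
library("future")

## Hypothetical stand-in for some expensive computation
slow_square <- function(x) x^2

## Developer: decides *what* is evaluated as a future
compute <- function(x) {
  f <- future(slow_square(x))
  value(f)
}

## User: decides *how and where* it runs
plan(sequential)                  ## in the current R session
r1 <- compute(4)

plan(multisession, workers = 2)   ## in background R sessions (2 is arbitrary)
r2 <- compute(4)

plan(sequential)                  ## shut down the background workers
```

Both calls should return the same value; only the place where slow_square() is evaluated differs.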

The nature of a future is such that it lends itself to use with several of the map-reduce frameworks already available in R. In this post, I’ll give an example of how to apply a function over a set of elements concurrently using plain sequential R, the parallel package, the future package alone, as well as future in combination with the foreach, the plyr, and the purrr packages.


You can choose your own future and what you want to do with it.

Example: Multiple Julia sets

The Julia package provides the JuliaImage() function for generating a Julia set for a given set of start parameters (centre, L, C), where centre specifies the center point in the complex plane, L specifies the width and height of the square region around this location, and C is a complex coefficient controlling the “shape” of the generated Julia set. For example, to generate one of the above Julia set images (1000-by-1000 pixels), you can use:
library("Julia")
set <- JuliaImage(1000, centre = 0 + 0i, L = 3.5, C = -0.4 + 0.6i)
plot_julia(set)
with
plot_julia <- function(img, col = topo.colors(16)) {
  par(mar = c(0, 0, 0, 0))
  image(img, col = col, axes = FALSE)
}
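To make the roles of (centre, L, C) more concrete, here is a rough base-R sketch of an escape-time calculation over the same kind of square region. The function julia_sketch() and its max_iter argument are hypothetical and only for illustration; this is not how the Julia package implements JuliaImage(), and the values it returns may be scaled differently.

```r
## Hypothetical stand-in, NOT the implementation of Julia::JuliaImage()
julia_sketch <- function(n, centre = 0 + 0i, L = 3.5, C = -0.4 + 0.6i,
                         max_iter = 50L) {
  ## n-by-n grid of complex start values covering the L-by-L square at 'centre'
  half <- L / 2
  re <- seq(Re(centre) - half, Re(centre) + half, length.out = n)
  im <- seq(Im(centre) - half, Im(centre) + half, length.out = n)
  z <- outer(re, im, function(x, y) complex(real = x, imaginary = y))

  ## Iterate z <- z^2 + C and count the steps each point stays within |z| <= 2
  count <- matrix(0L, nrow = n, ncol = n)
  for (iter in seq_len(max_iter)) {
    alive <- Mod(z) <= 2
    z[alive] <- z[alive]^2 + C
    count[alive] <- count[alive] + 1L
  }
  count
}

img <- julia_sketch(200)
```

The resulting integer matrix can be passed to image(), or to plot_julia() as defined above, to get a coarse picture of the set.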
For the purpose of illustrating how to calculate different Julia sets in parallel, I will use the same (centre, L) = (0 + 0i, 3.5) region as above with the following ten complex coefficients (from Julia set):
Cs <- list(
  a = -0.618   + 0i,
  b = -0.4     + 0.6i,
  c =  0.285   + 0i,
  d =  0.285   + 0.01i,
  e =  0.45    + 0.1428i,
  f = -0.70176 - 0.3842i,
  g =  0.835   - 0.2321i,
  h = -0.8     + 0.156i,
  i = -0.7269  + 0.1889i,
  j =  0       - 0.8i
)
Now we're ready to see how we can use futures in combination with different map-reduce implementations in R for generating these ten sets in parallel. Note that all approaches will generate the exact same ten Julia sets. So, feel free to pick your favorite approach.

Sequential

To process the above ten regions sequentially, we can use the lapply() function:
library("Julia")
sets <- lapply(Cs, function(C) {
  JuliaImage(1000, centre = 0 + 0i, L = 3.5, C = C)
})

Parallel

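To process the same ten regions in parallel, we can use the parLapply() function of the parallel package:
library("parallel")
ncores <- future::availableCores() ## a friendly version of detectCores()
cl <- makeCluster(ncores)

clusterEvalQ(cl, library("Julia"))  ## attach Julia on each worker
sets <- parLapply(cl, Cs, function(C) {
  JuliaImage(1000, centre = 0 + 0i, L = 3.5, C = C)
})
stopCluster(cl)  ## shut down the workers when done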

Futures (in parallel)

library("future.apply")  ## provides future_lapply(); also attaches future
plan(multisession)  ## defaults to availableCores() workers

library("Julia")
sets <- future_lapply(Cs, function(C) {
  JuliaImage(1000, centre = 0 + 0i, L = 3.5, C = C)
})
We could also have used the more explicit setup plan(cluster, workers = parallel::makeCluster(availableCores())), which is effectively the same as plan(multisession).
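A small sketch of that explicit setup, assuming two local workers (an arbitrary number chosen for the example). Asking a future for its process ID is just a way to confirm that it was evaluated in one of the cluster's R sessions rather than in the main one:

```r
library("future")

## Create the cluster ourselves and hand it to the future framework
cl <- parallel::makeCluster(2L)
plan(cluster, workers = cl)

## The future is resolved in one of the worker processes
f <- future(Sys.getpid())
worker_pid <- value(f)

## Clean up: switch back to sequential processing and stop the workers
plan(sequential)
parallel::stopCluster(cl)
```

Since we created the cluster ourselves here, we also stop it ourselves when done.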

Futures with foreach

library("doFuture")
registerDoFuture()  ## tells foreach futures should be used
plan(multisession)  ## specifies what type of futures

library("Julia")
sets <- foreach(C = Cs) %dopar% {
  JuliaImage(1000, centre = 0 + 0i, L = 3.5, C = C)
}
Note that I didn't pass .packages = "Julia" to foreach() because the doFuture backend will do that automatically for us - that's one of the treats of using futures. Had we used doParallel::registerDoParallel(cl) or similar, we would have had to worry about that.
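To see this in isolation, here is a small sketch that uses file_ext() from the tools package (picked only because it ships with R; the file names are made up). The package is attached in the main session only, and the example relies on the futures picking up that dependency for the workers:

```r
library("doFuture")
registerDoFuture()               ## tells foreach futures should be used
plan(multisession, workers = 2)  ## 2 workers is an arbitrary choice

library("tools")                 ## attached in the main R session only

## No .packages = "tools" here; the dependency is identified automatically
exts <- foreach(f = c("a.png", "b.csv")) %dopar% {
  file_ext(f)
}

plan(sequential)                 ## shut down the background workers
```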

Futures with plyr

The plyr package will utilize foreach internally if we pass .parallel = TRUE. Because of this, we can use plyr::llply() to parallelize via futures as follows:
library("plyr")
library("doFuture")
registerDoFuture()  ## tells foreach futures should be used
plan(multisession)  ## specifies what type of futures

library("Julia")
sets <- llply(Cs, function(C) {
  JuliaImage(1000, centre = 0 + 0i, L = 3.5, C = C)
}, .parallel = TRUE)
For the same reason as above, we don't have to worry about global variables or making sure needed packages are attached here either; that's all handled by the future package.

Futures with purrr (= furrr)

As a final example, here is how you can use futures to parallelize your purrr::map() calls:
library("purrr")
library("future")
plan(multisession)

library("Julia")
sets <- Cs %>%
        map(~ future(JuliaImage(1000, centre = 0 + 0i, L = 3.5, C = .x))) %>%
        value()  ## was values() in older versions of future
Comment: This latter approach will not perform load balancing ("scheduling") across backend workers; that's a feature that ideally would be taken care of by purrr itself. However, I have some ideas for future versions of future (pun...) that may achieve this without having to modify the purrr package.
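The furrr package mentioned in the heading offers future_map() as a future-based counterpart to purrr::map(), which distributes the elements across the workers for you. A minimal sketch, assuming furrr is installed and using a toy function in place of JuliaImage():

```r
library("future")
library("furrr")
plan(multisession, workers = 2)  ## 2 workers is an arbitrary choice

## Same formula-style lambda as with purrr::map()
squares <- future_map(1:4, ~ .x^2)

plan(sequential)                 ## shut down the background workers
```

With the Julia example, the call would presumably look like future_map(Cs, ~ JuliaImage(1000, centre = 0 + 0i, L = 3.5, C = .x)).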

Got compute?

If you have access to one or more machines with R installed (e.g. a local or remote cluster, or a Google Compute Engine cluster), and you've got direct SSH access to those machines, you can have those machines calculate the above Julia sets for you; just change the future plan, e.g.
plan(cluster, workers = c("machine1", "machine2", "machine3.remote.org"))
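If you don't have remote machines at hand, the same form of plan can be tried out locally. The sketch below assumes that listing "localhost" twice gives two workers on the current machine, so no SSH is involved; with real hostnames the code would otherwise look the same:

```r
library("future")

## Two workers on the local machine, specified by hostname
plan(cluster, workers = c("localhost", "localhost"))
n_workers <- nbrOfWorkers()

## Ask a worker which machine it is running on
f <- future(Sys.info()[["nodename"]])
host <- value(f)

plan(sequential)  ## shut down the workers
```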
If you have access to a high-performance compute (HPC) cluster with an HPC scheduler (e.g. Slurm, TORQUE / PBS, LSF, and SGE), you can harness its power by switching to:
library("future.batchtools")
plan(batchtools_sge)
For more details, see the vignettes of the future.batchtools and batchtools packages.

Happy futuring!

