drake R package is not only a reproducible research solution, but also a serious high-performance computing engine. The Get Started page introduces
drake, and this technical note draws from the guides on high-performance computing and timing.
You can help!
Some of these features are brand new, and others are newly refactored. The GitHub version has all the advertised functionality, but it needs more testing and development before I can submit it to CRAN in good conscience. New issues such as r-lib/processx#113 and HenrikBengtsson/future#226 seem to affect
drake, and more may emerge. If you use
drake for your own work, please consider supporting the project by field-testing the claims below and posting feedback here.
Let drake schedule your targets.
A typical workflow is a sequence of interdependent data transformations. Consider the example from the Get Started page.
When you call
make() on this project,
drake takes care of
raw_data, and then
data in sequence. Once
hist can launch in parallel, and then
"report.md" begins once everything else is done. It is
drake’s responsibility to deduce this order of execution, hunt for ways to parallelize your work, and free you up to focus on the substance of your research.
Activate parallel processing.
Simply set the
jobs argument to an integer greater than 1. The following
make() recruits multiple processes on your local machine.
make(plan, jobs = 2)
For parallel deployment to a computing cluster (SLURM, TORQUE, SGE, etc.)
drake calls on packages
future.batchtools. First, create a
batchtools template file to declare your resource requirements and environment modules. There are built-in example files in
drake, but you will likely need to tweak your own by hand.
drake_batchtools_tmpl_file("slurm") # Writes batchtools.slurm.tmpl.
future.batchtools to talk to the cluster.
library(future.batchtools) future::plan(batchtools_slurm, template = "batchtools.slurm.tmpl")
parallelism argument equal to
make(plan, parallelism = "future", jobs = 8)
Choose a scheduling algorithm.
parallelism argument of
make() controls not only where to deploy the workers, but also how to schedule them. The following table categorizes the 7 options.
|Deploy: local||Deploy: remote|
|Schedule: persistent||“mclapply”, “parLapply”||“future_lapply”|
|Schedule: transient||“future”, “Makefile”|
|Schedule: staged||“mclapply_staged”, “parLapply_staged”|
drake’s first custom parallel algorithm was staged scheduling. It was easier to implement than the other two, but the workers run in lockstep. In other words, all the workers pick up their targets at the same time, and each worker has to finish its target before any worker can move on. The following animation illustrates the concept.
But despite weak parallel efficiency, staged scheduling remains useful because of its low overhead. Without the bottleneck of a formal master process, staged scheduling blasts through armies of tiny conditionally independent targets (example here). Consider it if the bulk of your work is finely diced and perfectly parallel, maybe if your dependency graph is tall and thin.
Persistent scheduling is brand new to
make(jobs = 2) deploys three processes: two workers and one master. Whenever a worker is idle, the master assigns it the next target whose dependencies are fully ready. The workers keep running until no more targets remain. See the animation below.
If the time limits of your cluster are too strict for persistent workers, consider transient scheduling, another new arrival. Here,
make(jobs = 2) starts a brand new worker for each individual target. See the following video.
How many jobs should you choose?
predict_runtime() function can help. Let’s revisit the
- Plan for non-staged scheduling,
- Assume each non-file target (black circle) takes 2 hours to build, and
- Rest assured that everything else is super quick.
When we declare the runtime assumptions with the
known_times argument and cycle over a reasonable range of
predict_runtime() paints a clear picture.
jobs = 4 is a solid choice. Any fewer would slow us down, and the next 2-hour speedup would take double the
jobs and the hardware to back it up. Your choice of
make() ultimately depends on the runtime you can tolerate and the computing resources at your disposal.
When I attended
drake relied almost exclusively on staged scheduling. Kirill Müller spent hours on site and hours afterwards helping me approach the problem and educating me on priority queues, message queues, and the knapsack problem. His generous help paved the way for
drake’s latest enhancements.
This post is a product of my own personal experiences and opinions and does not necessarily represent the official views of my employer. I created and embedded the Powtoon videos only as explicitly permitted in the Terms and Conditions of Use, and I make no copyright claim to any of the constituent graphics.