Scheduling R scripts and processes on Windows and Unix/Linux
Two new R packages were published on CRAN last week by BNOSAC (www.bnosac.be):
- One package for scheduling R scripts and processes on Windows (taskscheduleR) and
- Another package for scheduling R scripts and processes on Unix / Linux (cronR)
These two packages allow you to schedule R processes directly from R. They do this by passing commands to cron, the basic Linux/Unix job scheduling utility, or to the Windows Task Scheduler. The packages were developed for beginning R users who are unaware of the fact that R scripts can also be run non-interactively and can be automated.
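To make the cron side a little more concrete, here is a minimal sketch of what a crontab entry looks like: five time fields (minute, hour, day of month, month, day of week) followed by the command to run. The helper `cron_line()` and the script path are hypothetical and only serve as illustration. They are not part of cronR, which builds and installs such entries for you.

```r
# Hypothetical helper (not part of cronR): build one crontab line from the
# five standard cron time fields plus the command to run.
cron_line <- function(command, minute = "*", hour = "*", day_of_month = "*",
                      month = "*", day_of_week = "*") {
  # Collapse vectors such as c(0, 3, 5) into the comma-separated form cron expects
  collapse <- function(x) paste(x, collapse = ",")
  paste(collapse(minute), collapse(hour), collapse(day_of_month),
        collapse(month), collapse(day_of_week), command)
}

cmd <- "Rscript '/path/to/script.R'"  # placeholder command, assumed path

# Every minute: all five fields are wildcards
every_minute <- cron_line(cmd)

# Every day at 14:20 on Sunday (0), Wednesday (3) and Friday (5)
daily_1420 <- cron_line(cmd, minute = 20, hour = 14, day_of_week = c(0, 3, 5))

print(every_minute)
print(daily_1420)
```

In practice you would not write these lines by hand. The point is only that a schedule ends up as a single line of text that cron reads.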
We already blogged about the taskscheduleR R package in this blog post and also here. This time we go into more detail on the cronR R package.
The cronR package allows you to:
- Get the list of scheduled jobs
- Remove scheduled jobs
- Add a job
  - A job is basically a script with R code which is run through Rscript
  - You can schedule tasks ‘ONCE’, ‘EVERY MINUTE’, ‘EVERY HOUR’, ‘EVERY DAY’, ‘EVERY WEEK’, ‘EVERY MONTH’ or on any more complex schedule
  - The task log contains the stdout & stderr of the Rscript run at that point in time. This log can be found in the same folder as the R script
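As a rough illustration of the last two points, the sketch below runs a throwaway script through Rscript and appends both stdout and stderr to a log file next to it. It uses only base R and assumes a Unix-like shell (the `>>` and `2>&1` redirections). The script contents and file names are made up for the example, and the command that cronR generates may differ in detail.

```r
# Write a tiny throwaway script that produces output on stdout and stderr
script <- tempfile(fileext = ".R")
writeLines(c('cat("hello from job\\n")',
             'message("a message on stderr")'), script)

# Log file in the same folder as the script, same base name
log_file <- sub("\\.R$", ".log", script)

# Path to the Rscript executable of the running R installation
rscript <- file.path(R.home("bin"), "Rscript")

# Shell command: run the script, append stdout and stderr to the log
cmd <- paste(shQuote(rscript), shQuote(script),
             ">>", shQuote(log_file), "2>&1")

status <- system(cmd)          # 0 means the script ran without error
log_lines <- readLines(log_file)
print(log_lines)
```

Because the log is appended to rather than overwritten, repeated runs of the same job accumulate in one file, which is convenient when checking when a job started to fail.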
The package is especially suited to people working on an RStudio server in the cloud or within their corporate environment, where it makes scheduling processes easy. To make this even easier for beginning R users, an RStudio addin was developed, which is shown in the example below. The RStudio addin lets you select an R script and schedule it at specific points in time. It does this by copying the script to your launch/log folder and setting up a cron job for that script.
The example below shows how to set up a cron job using the RStudio addin so that the scripts are launched every minute or every day at a specific hour. The R code is launched through Rscript and the log will contain the errors and the warnings in case your script failed so that you can review where the code failed.
Note that you can also pass arguments to the R script, so that you can launch the same script for productXYZ and for productABC.
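On the receiving end, a script can pick up such arguments with base R's `commandArgs(trailingOnly = TRUE)`. The sketch below writes a small example script and runs it once through Rscript with two arguments. The script contents and the argument values are assumptions made up for illustration.

```r
# A throwaway script that reads its command-line arguments
script <- tempfile(fileext = ".R")
writeLines(c(
  'args <- commandArgs(trailingOnly = TRUE)',
  'cat(sprintf("product=%s date=%s\\n", args[1], args[2]))'
), script)

# Path to the Rscript executable of the running R installation
rscript <- file.path(R.home("bin"), "Rscript")

# Run the script with two arguments and capture what it prints on stdout
out <- system2(rscript, c(shQuote(script), "productXYZ", "20160101"),
               stdout = TRUE)
print(out)
```

Scheduling the same script twice with different arguments then gives two independent jobs that share one code base.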
Of course scheduling scripts can also be done from R directly. Some examples are shown below. More information at https://github.com/bnosac/cronR
    library(cronR)
    f <- system.file(package = "cronR", "extdata", "helloworld.R")
    cmd <- cron_rscript(f, rscript_args = c("productx", "20160101"))

    ## Every minute
    cron_add(cmd, frequency = 'minutely', id = 'job1', description = 'Customers')

    ## Every hour at 20 past the hour on Monday and Tuesday
    cron_add(cmd, frequency = 'hourly', id = 'job2', at = '00:20',
             description = 'Weather', days_of_week = c(1, 2))

    ## Every day at 14h20 on Sunday, Wednesday and Friday
    cron_add(cmd, frequency = 'daily', id = 'job3', at = '14:20',
             days_of_week = c(0, 3, 5))

    ## Every starting day of the month at 10h30
    cron_add(cmd, frequency = 'monthly', id = 'job4', at = '10:30',
             days_of_month = 'first', days_of_week = '*')

    ## Get all the jobs
    cron_ls()

    ## Remove all scheduled jobs
    cron_clear(ask = FALSE)
We hope this will save you some precious time. If you need more help automating R processes, feel free to get in touch. We offer a training course devoted to managing R processes, which can be given in your organisation. More information is available in our training curriculum.