R for system administration and scripting

[This article was first published on Thinking inside the box, and kindly contributed to R-bloggers.]

On several occasions, R has suggested itself as a language for systems scripting. By this I mean random little administrative tasks such as (re-)moving or maybe renaming files or directories and the like.

One such case just happened a few minutes ago. The aforementioned Garmin Forerunner 405 can cooperate quite nicely with Linux using the gant reader for the ANT wireless communication protocol between the USB hardware dongle and the Garmin 405. (Sources for gant are both this file and this git archive.) I had meant to blog about this tool and the resulting files one of these days anyway, but today I just want to mention that the default filenames created by the program were somewhat horrid, such as 20.09.2009 101112.TCX to denote the 20th of September of this year at 10:11h and 12 seconds. As we all know, filenames with spaces are bad for the environment as well as plain annoying. So I had made the simple change in the C sources to switch to a saner format such as 20090920-101112.TCX (and I see that the git archive now contains a similar fix). But that still left me with some 80+ files with the dreaded names.

There are of course many ways to skin this cat and to rename the files in bulk. However, I found the following four lines to be fairly succinct

#!/usr/bin/r
files <- dir(".", pattern=".*\\.TCX$")
res <- lapply(files, function(f) {
    pt <- strptime(f, "%d.%m.%Y %H%M%S.TCX")  # parsed time
    ft <- strftime(pt, "%Y%m%d-%H%M%S.TCX")   # formatted time
    file.rename(f, ft)
})
as they show, among other things,
  • the access to one of the three (soon four) regexp engines, here as a simple pattern argument to dir()
  • the functional programming nature of the beast: files is a vector of filenames, and lapply() unrolls the vector one-by-one calling the anonymous function and passing the current element off as f
  • computing on times is particularly easy as we get strptime and strftime as any self- and POSIX-respecting language should
  • similarly, we get access to file system-level operations natively avoiding all quoting issues that make files with spaces such fun in the first place.
  • the littler scripting frontend providing /usr/bin/r rules.
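The parse-then-format step from the list above can be tried on a single name without touching any files. This is only a small sketch: the helper name new_name is made up for illustration, and the tz = "UTC" arguments are an addition of mine (not in the script above) so that the result does not depend on the local timezone.

```r
# Hypothetical helper (not part of the original script): map one old-style
# name to its new-style name without touching the file system.
new_name <- function(f) {
    # tz = "UTC" is an assumption added here to sidestep local DST quirks
    pt <- strptime(f, "%d.%m.%Y %H%M%S.TCX", tz = "UTC")  # parsed time
    strftime(pt, "%Y%m%d-%H%M%S.TCX", tz = "UTC")          # formatted time
}

print(new_name("20.09.2009 101112.TCX"))
```

For the example name from the text this should print "20090920-101112.TCX".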
So about five lines and two minutes later, some eighty-ish files were renamed and sanity was restored. Hm, and it took me five times as long to blog this.

Lastly, I do not mean to imply that Python or Perl or Ruby or (insert favourite tool here) cannot do it equally well. I simply meant to say that programmatically creating new filenames is definitely easier in R than it would have been in shell. And as an added bonus, we even get fully parsed time objects that I could have tested for. But then tests and documentation never get written on a Saturday.
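Had it not been a Saturday, the test on the parsed time objects could have looked roughly like the following. Again just a sketch under my own assumptions: plan_renames is a hypothetical name, the tz = "UTC" arguments are an addition, and the function only computes the old/new pairs, leaving the actual renaming as a commented-out last step. Names that do not match the old format parse to NA and are simply skipped.

```r
# Hypothetical helper: build a table of old and new names, keeping only
# those names whose timestamp actually parses. Nothing is renamed here.
plan_renames <- function(files) {
    pt <- strptime(files, "%d.%m.%Y %H%M%S.TCX", tz = "UTC")
    ok <- !is.na(pt)                      # FALSE where parsing failed
    data.frame(old = files[ok],
               new = strftime(pt[ok], "%Y%m%d-%H%M%S.TCX", tz = "UTC"),
               stringsAsFactors = FALSE)
}

# One old-style name and one that is already in the new format
plan <- plan_renames(c("20.09.2009 101112.TCX", "20090920-101112.TCX"))
print(plan)

# file.rename() is vectorised, so the actual renaming would be:
# file.rename(plan$old, plan$new)
```

The already-renamed file falls out of the plan because its name fails to parse, so running such a script twice in the same directory would do no harm.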
