Input/output benchmarks

January 22, 2017

(This article was first published on R – Why?, and kindly contributed to R-bloggers)

To quantify the impact of the CPU on an analysis, I created the package benchmarkme. The idea is simple. If everyone runs the same R script, we can easily compare machines.

One of the benchmarks in the package compares read/write speeds: we write a large CSV file (using write.csv) and read it back in using read.csv.
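To make the idea concrete, here is a rough base-R sketch of this kind of timing. It is not the package's actual code: the function name time_csv_io and the matrix dimensions are made up for illustration and chosen to keep the run short.

```r
## A rough base-R sketch of a read/write timing; not benchmarkme's actual code.
## time_csv_io is a hypothetical helper and the default sizes are arbitrary.
time_csv_io = function(n_rows = 1e4, n_cols = 10, dir = tempdir()) {
  m = matrix(rnorm(n_rows * n_cols), nrow = n_rows, ncol = n_cols)
  fname = tempfile(tmpdir = dir, fileext = ".csv")
  on.exit(unlink(fname))  # remove the CSV file when the function exits
  write_time = system.time(write.csv(m, fname, row.names = FALSE))
  read_time = system.time(read.csv(fname))
  ## Elapsed (wall-clock) seconds for each step
  c(write = write_time[["elapsed"]], read = read_time[["elapsed"]])
}

timings = time_csv_io()
print(timings)
```

The timings from a single run like this are noisy, which is presumably why the package repeats each test several times.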

The package is on CRAN and can be installed in the usual way

install.packages("benchmarkme")

Running

library(benchmarkme)
## If your computer is relatively slow, remove 200 from below
res = benchmark_io(runs = 3, size = c(5, 50, 200))
## Upload your data set
upload_results(res)

creates three matrices of size 5MB, 50MB and 200MB, writes the associated CSV file to the directory

Sys.getenv("TMPDIR")

and then reads the data set back into R. The object res contains the timings, which can be compared with those of other users via

plot(res)

[Figure: rplot01]

The above graph plots the current benchmarking results for writing a 5MB file (my machine is relatively fast).

Shiny

You can also compare your results using the Shiny interface. Simply create a results bundle

create_bundle(res, filename = "results.rds")

and upload to the webpage.

Network drives

Often the dataset we wish to access is on a network drive. Unfortunately, network drives can be slow. The benchmark_io function has a tmpdir argument that lets us change the directory and so estimate the impact of the network drive:

res_net = benchmark_io(runs = 3, size = c(5, 50, 200),
                       tmpdir = "path_to_dir")
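One way to put the local and network timings side by side is sketched below. It assumes that each result is a data frame with test and elapsed columns; check this with str(res) first, as the structure may differ between package versions. The helper name compare_io is made up for illustration.

```r
## Sketch only: assumes each result is a data frame with `test` and `elapsed`
## columns. compare_io is a hypothetical helper, not part of benchmarkme.
compare_io = function(local, network) {
  ## Average the elapsed time over the runs of each test
  local_mean = aggregate(elapsed ~ test, data = local, FUN = mean)
  network_mean = aggregate(elapsed ~ test, data = network, FUN = mean)
  out = merge(local_mean, network_mean, by = "test",
              suffixes = c("_local", "_network"))
  ## A ratio above 1 means the network drive was slower for that test
  out$slowdown = out$elapsed_network / out$elapsed_local
  out
}

## Usage, with the objects created above:
## compare_io(res, res_net)
```

A ratio is used rather than a difference so that the small and large files can be read on the same scale.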

 
