To quantify the impact of the CPU on an analysis, I created the package benchmarkme. The idea is simple. If everyone runs the same R script, we can easily compare machines.
One of the benchmarks in the package compares read/write speeds: we write a large CSV file (using write.csv) and read it back in (using read.csv).
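To make the idea concrete, here is a minimal base R sketch of timing one write/read round trip. It is not how benchmarkme implements its benchmark: the helper name time_csv_round_trip and the matrix dimensions are assumptions chosen purely for illustration.

```r
# Minimal sketch: time one CSV write/read round trip using base R only.
# time_csv_round_trip() is a hypothetical helper, not part of benchmarkme.
time_csv_round_trip = function(n_rows = 1e4, n_cols = 10, dir = tempdir()) {
  # Random numeric data; these dimensions are arbitrary, not the package's sizes
  m = as.data.frame(matrix(rnorm(n_rows * n_cols), ncol = n_cols))
  fname = tempfile(tmpdir = dir, fileext = ".csv")
  on.exit(unlink(fname))  # Delete the CSV file when the function exits

  # system.time() reports user, system and elapsed times; keep the elapsed time
  write_time = system.time(write.csv(m, fname, row.names = FALSE))[["elapsed"]]
  size_mb = file.size(fname) / 1024^2
  read_time = system.time(read.csv(fname))[["elapsed"]]

  c(size_mb = size_mb, write = write_time, read = read_time)
}

timings = time_csv_round_trip()
print(timings)
```

A single run like this is noisy, so repeating it several times and looking at the spread gives a fairer picture.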
The package is on CRAN and can be installed in the usual way.
library(benchmarkme)
## If your computer is relatively slow, remove 200 from below
res = benchmark_io(runs = 3, size = c(5, 50, 200))
## Upload your data set
upload_results(res)
This creates three matrices of size 5MB, 50MB and 200MB, writes the associated CSV files to a temporary directory, and then reads the data sets back into R. The object res contains the timings, which can be compared with the results of other users.
The above graph plots the current benchmarking results for writing a 5MB file (my machine is relatively fast).
You can also compare your results using the Shiny interface. Simply create a results bundle
create_bundle(res, filename = "results.rds")
and upload to the webpage.
Often the dataset we wish to access is on a network drive. Unfortunately, network drives can be slow. The benchmark_io function has a tmpdir argument that lets us change the directory and so estimate the impact of the network drive:
res_net = benchmark_io(runs = 3, size = c(5, 50, 200), tmpdir = "path_to_dir")
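The same comparison can be sketched without the package by timing a CSV round trip in two directories. This is only an illustration under stated assumptions: time_io_in_dir is a hypothetical helper, the data dimensions are arbitrary, and net_dir is a placeholder that falls back to the local temporary directory so the code runs anywhere.

```r
# Hypothetical helper (not part of benchmarkme): time a CSV round trip in `dir`
time_io_in_dir = function(dir, n_rows = 1e4, n_cols = 10) {
  m = as.data.frame(matrix(rnorm(n_rows * n_cols), ncol = n_cols))
  fname = tempfile(tmpdir = dir, fileext = ".csv")
  on.exit(unlink(fname))  # Clean up the CSV file afterwards
  c(write = system.time(write.csv(m, fname, row.names = FALSE))[["elapsed"]],
    read = system.time(read.csv(fname))[["elapsed"]])
}

local_dir = tempdir()
## Placeholder: replace with the path to your network drive.
## It is set to local_dir here only so that the sketch runs on any machine.
net_dir = local_dir

## One row per directory, one column per operation (elapsed seconds)
comparison = rbind(local = time_io_in_dir(local_dir),
                   network = time_io_in_dir(net_dir))
print(comparison)
```

With a real network path in net_dir, a noticeably larger second row would suggest that the drive, rather than the CPU, is the bottleneck.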