Predicting the memory usage of an R object containing numbers

August 15, 2012
(This article was first published on NumberTheory » R stuff, and kindly contributed to R-bloggers)

To estimate whether a vector of numbers will fit into memory, you can predict its memory usage from its length. An integer vector uses 4 bytes per element, and a numeric vector uses 8 bytes per element (double-precision float). The following function prints the estimated memory usage of a vector, given the number of elements and the type of vector:

predict_data_size <- function(numeric_size, number_type = "numeric") {
  # Bytes per element: 4 for integer, 8 for numeric (double precision)
  if (number_type == "integer") {
    byte_per_number <- 4
  } else if (number_type == "numeric") {
    byte_per_number <- 8
  } else {
    stop(sprintf("Unknown number_type: %s", number_type))
  }
  estimate_size_in_bytes <- numeric_size * byte_per_number
  # Tag the byte count as an object_size so print() formats it in readable units
  class(estimate_size_in_bytes) <- "object_size"
  print(estimate_size_in_bytes, units = "auto")
}

For example:

> predict_data_size(1518*1518, "numeric")
17.6 Mb
> predict_data_size(1518*1518, "integer")
8.8 Mb

To print the size of the vector in a nice format, I change the class of estimate_size_in_bytes to "object_size". This way, when I call print on the object, R dispatches to print.object_size (see utils:::print.object_size for the source), which performs the formatting.
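The same class trick also works when you want the estimate back as a value rather than printed. The following is a minimal sketch, not part of the original function: estimate_data_size and its arguments are hypothetical names chosen for illustration. It relies on format() dispatching to the object_size method in utils, just as print() does.

```r
# Hypothetical helper: returns the estimate instead of printing it.
estimate_data_size <- function(n_elements, bytes_per_element = 8) {
  # structure() attaches the class in one step, equivalent to class(x) <- ...
  structure(n_elements * bytes_per_element, class = "object_size")
}

est <- estimate_data_size(1518 * 1518)

# Human-readable string, formatted by the object_size method of format()
size_label <- format(est, units = "auto")

# Raw byte count, handy for further arithmetic or comparisons
size_in_bytes <- unclass(est)

cat("Estimated size:", size_label, "\n")
```

Returning the object keeps the number available for comparisons, for instance against the amount of free memory, while format() still gives the same readable label as the printed example.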

You can also use this function to estimate the size of matrices and multi-dimensional arrays: only the total number of elements matters. Note that the R object (vector, matrix, array) takes a little more space than the estimate, because of a small fixed header and any metadata it carries (e.g. dimnames), but for any decently sized object this is probably small compared to the size of the numbers themselves.
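One way to check the estimate is to compare it with what object.size reports for a real object. The snippet below is a rough sketch under a few assumptions: the sizes (one million elements, a 1000 x 1000 matrix) and the variable names are arbitrary choices for illustration, and the exact overhead depends on the R build and platform.

```r
n <- 1000 * 1000

# Plain vectors: predicted bytes versus bytes reported by object.size()
x_num <- numeric(n)
x_int <- integer(n)
overhead_num <- as.numeric(object.size(x_num)) - n * 8
overhead_int <- as.numeric(object.size(x_int)) - n * 4

# A matrix with dimnames: the row and column names add to the total
m <- matrix(0, nrow = 1000, ncol = 1000,
            dimnames = list(paste0("r", 1:1000), paste0("c", 1:1000)))
overhead_m <- as.numeric(object.size(m)) - n * 8
overhead_fraction <- overhead_m / (n * 8)

cat("Overhead, numeric vector (bytes):", overhead_num, "\n")
cat("Overhead, integer vector (bytes):", overhead_int, "\n")
cat("Overhead, matrix with dimnames (fraction of data):", overhead_fraction, "\n")
```

For objects of this size, the overhead should be a small fraction of the predicted figure, which is why the simple bytes-per-element estimate is usually good enough.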
