# Predicting the memory usage of an R object containing numbers

August 15, 2012
(This article was first published on NumberTheory » R stuff, and kindly contributed to R-bloggers)

To estimate whether a vector of numbers will fit into memory, you can predict its memory usage from its length. An `integer` vector uses 4 bytes per element, and a `numeric` vector uses 8 bytes per element (double-precision floating point). The following function prints the estimated memory usage of a vector, given its length and type:

```r
predict_data_size <- function(numeric_size, number_type = "numeric") {
  # Bytes per element: 4 for integer, 8 for numeric (double precision)
  if (number_type == "integer") {
    byte_per_number <- 4
  } else if (number_type == "numeric") {
    byte_per_number <- 8
  } else {
    stop(sprintf("Unknown number_type: %s", number_type))
  }
  estimate_size_in_bytes <- numeric_size * byte_per_number
  # Tag the value as "object_size" so print() formats it in readable units
  class(estimate_size_in_bytes) <- "object_size"
  print(estimate_size_in_bytes, units = "auto")
}
```

For example:

```
> predict_data_size(1518*1518, "numeric")
17.6 Mb
> predict_data_size(1518*1518, "integer")
8.8 Mb
```
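
One way to sanity-check the 4 and 8 bytes-per-element figures is to allocate vectors of that length and measure them with `object.size`. The following is a minimal sketch, not part of the original function; the variable names are made up for illustration, and it assumes enough free memory to allocate both vectors (roughly 26 Mb in total).

```r
# Sketch: compare the 4/8 bytes-per-element prediction with object.size().
n <- 1518 * 1518

predicted_numeric <- n * 8  # bytes predicted for a numeric (double) vector
predicted_integer <- n * 4  # bytes predicted for an integer vector

# Allocate real vectors of that length and measure them
actual_numeric <- as.numeric(object.size(numeric(n)))
actual_integer <- as.numeric(object.size(integer(n)))

# Extra bytes reported by object.size() beyond the prediction
overhead_numeric <- actual_numeric - predicted_numeric
overhead_integer <- actual_integer - predicted_integer
print(c(numeric = overhead_numeric, integer = overhead_integer))
```

The differences should be a few dozen bytes at most, which is negligible next to the megabytes taken by the numbers.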

To print the size of the vector in a readable format, I change the class of `estimate_size_in_bytes` to `"object_size"`. When `print` is then called on the object, R dispatches to `print.object_size` (see `utils:::print.object_size` for the source), which performs the formatting.
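
The same class trick also works when you want the formatted size as a string rather than printed output. The sketch below assumes a reasonably recent R version in which `utils` provides a `format` method for `"object_size"` objects; `size_in_bytes` and `size_label` are illustrative names only.

```r
# Sketch: the same class trick, but returning a string instead of printing.
size_in_bytes <- 1518 * 1518 * 8
class(size_in_bytes) <- "object_size"

# format() dispatches to format.object_size, just as print() dispatches
# to print.object_size
size_label <- format(size_in_bytes, units = "auto")
size_label
```

The resulting string should match what `print` shows for the same number of bytes, and can be used in a log message or a `stop()` call.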

You can also use this function to estimate the size of matrices and multi-dimensional arrays: only the total number of elements matters. Note that an R object (vector, matrix, array) takes a little more space when it carries metadata (e.g. dimnames), but for any reasonably large object this is probably small compared to the space taken by the numbers themselves.
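
To get a feel for how much the metadata adds, you can measure a matrix before and after attaching dimnames. This is only an illustrative sketch: the 1518 x 1518 size is taken from the example above, and the row and column names are invented.

```r
# Sketch: effect of dimnames on the measured size of a numeric matrix.
m <- matrix(0, nrow = 1518, ncol = 1518)

predicted <- length(m) * 8                 # total elements * 8 bytes
actual_plain <- as.numeric(object.size(m)) # matrix without dimnames

# Attach (hypothetical) row and column names and measure again
dimnames(m) <- list(paste0("r", seq_len(nrow(m))),
                    paste0("c", seq_len(ncol(m))))
actual_named <- as.numeric(object.size(m))

# Relative overhead of the dimnames compared to the predicted size
relative_overhead <- (actual_named - predicted) / predicted
print(relative_overhead)
```

For a matrix of this size the overhead should be a small fraction of the total; for very small objects the metadata can dominate instead.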
