Size of each object in R’s workspace

[This article was first published on isomorphismes, and kindly contributed to R-bloggers.]

I’ve googled “how do I find out how big my R workspace is” too many times … here’s the explicit code to run, so hopefully the next googler sees this post:

for (thing in ls()) { message(thing); print(object.size(get(thing)), units='auto') }

Fin. You can stop there.
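If the workspace is cluttered, it helps to see the biggest objects first. Here’s a sketch of the same loop, just ordering the names by `object.size` before printing (assumes a non-empty workspace):

```r
# List workspace objects largest-first (a sketch, not from the original loop).
sizes <- sapply(ls(), function(nm) object.size(get(nm)))   # bytes, named by object
for (nm in names(sort(sizes, decreasing = TRUE))) {
  message(nm)
  print(object.size(get(nm)), units = 'auto')
}
```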

 

Or, for a bit of context, here’s some example code that generates objects of varying sizes, where you might not be sure in advance how big they are:

require(boot)
require(datasets)
data(sunspot.year)
# boot() hands the statistic the data plus a vector of resampled indices,
# so it must take (data, indices); passing max directly would take the max
# of the data *and* the index vector.
stat <- function(x, i) max(x[i])
system.time(boot.1 <- boot(sunspot.year, stat, R=1e3, parallel='multicore', ncpus=4))
system.time(boot.2 <- boot(sunspot.year, stat, R=1e4))
# tsboot() passes the resampled series itself, so plain max works;
# the fixed-block bootstrap needs a block length l.
system.time(boot.3 <- tsboot(sunspot.year, max, R=1e5, l=20, sim='fixed', parallel='multicore', ncpus=4))
system.time(boot.4 <- boot(sunspot.year, stat, R=1e5, parallel='multicore', ncpus=8))
system.time(boot.5 <- boot(sunspot.year, stat, R=1e6, parallel='multicore', ncpus=8))
print(boot.1)
plot(boot.1)
par(col=rgb(0,0,0,.1), pch=20)
plot(boot.2)
for (thing in ls()) {
    message(thing)
    print(object.size(get(thing)), units='auto')
}

This code is doing a few things:

  1. resampling the sunspot dataset to estimate the most sunspots we “should” see in a year (with a very stylised meaning of “should”).

    This is worth looking into because some people say global warming is caused by sunspots rather than, e.g., carbon emissions amplifying the greenhouse effect.

    History only happened once but by bootstrapping we try to overcome this.

  2. noodling around with multiple cores (my laptop has 8; run lscpu, which ships with util-linux, to check yours). Nothing interesting happens in this case; still, multicore is an option.
  3. timing how long fake reprocessings of history take with various amounts of resampling and various numbers of cores
  4. showing how big those bootstrap objects are. Remember, R runs entirely in memory, so big datasets or derived objects of any kind can cramp your home system or bork your EC2.
  5. printing the size of the objects, as promised. On my system (running a slightly different version of the code, with objects named b.1 through b.6) the output was:
    > for (obj in ls()) { message(obj); print(object.size(get(obj)), units='auto') }
    b.1
    89.1 Kb
    b.2
    792.2 Kb
    b.3
    7.6 Mb
    b.4
    7.6 Mb
    b.5
    7.6 Mb
    b.6
    792.2 Kb
    obj
    64 bytes
    sunspot.year
    2.5 Kb
    
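When one of those objects turns out to be the memory hog, you can park it on disk and drop it from the workspace. This is a sketch rather than anything from the run above, and the filename is made up:

```r
# Serialize a big object to disk, free the RAM, reload it later if needed.
big <- rnorm(1e6)               # stand-in for e.g. a large boot object
saveRDS(big, 'big-object.rds')  # hypothetical filename
rm(big)
gc()                            # reports and reclaims the freed memory
# ... later:
big <- readRDS('big-object.rds')
```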


PS To find out how much memory you have (on Linux; macOS doesn’t ship free), do:

$ free -mt
             total       used       free     shared    buffers     cached
Mem:         15929      12901       3028          0        214       9585
-/+ buffers/cache:       3102      12827
Swap:        10123          0      10123
Total:       26053      12901      13152
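And if you’d rather check from inside R, here’s a Linux-only sketch that reads /proc/meminfo directly (not a packaged function, just string munging):

```r
# Pull memory figures (in kB) out of /proc/meminfo -- Linux only.
meminfo <- readLines('/proc/meminfo')
kb <- function(field) {
  line <- grep(paste0('^', field, ':'), meminfo, value = TRUE)
  as.numeric(gsub('[^0-9]', '', line))
}
kb('MemTotal')      # roughly the 'total' column of free, in kB
kb('MemAvailable')
```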
