This will be a very short post about a line of code I’ve found unbelievably useful as I analyze data for work. I’m working with datasets containing millions of rows. (The most recent one had about 13 million records.) Because R loads datasets into memory, you can run out of RAM pretty quickly when working with data that large. As I start getting access to more services for databasing and cloud computing, I’m hoping to move some of that data out of my own computer’s memory and onto something with more capacity. But for now, I found this quick fix.
I increased the paging file (virtual memory) on my computer as high as it would let me, but R doesn’t automatically raise its own memory limit to match. A single line of code will do that for you.
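Here is a minimal sketch of what that call can look like, assuming Windows and a version of R older than 4.2.0, where `memory.limit()` is still functional. The core is just `memory.limit(size = ...)`; the wrapper name `raise_memory_limit` and the 56000 figure are hypothetical, added only so the snippet does nothing on setups where the call isn't supported.

```r
# Hypothetical helper (an illustration, not a built-in function): raise R's
# memory cap on Windows builds of R older than 4.2.0.
raise_memory_limit <- function(size_mb) {
  on_windows <- .Platform$OS.type == "windows"
  old_enough <- getRversion() < "4.2.0"
  if (on_windows && old_enough) {
    # memory.limit(size = ...) takes a value in MB and returns the limit in force
    return(memory.limit(size = size_mb))
  }
  # On other platforms and newer R versions the call is unsupported, so skip it
  message("memory.limit() is not supported here; limit unchanged")
  invisible(NA_real_)
}

# 56000 MB is an illustrative value only
raise_memory_limit(56000)
```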
Set that value to whatever your virtual memory is set to. (Note that this value is in MB.) Huge thanks to this Stack Overflow post, which taught me how to do this.
Monday, I’ll talk about some functions that allow you to read (and write) large files more quickly.