(This article was first published on **Getting Genetics Done**, and kindly contributed to R-bloggers)

It took me a while to figure this out, so I thought I'd share. I have a dataframe with millions of observations in it, and I want to estimate a density distribution, which is a memory-intensive process. Running my kde2d function on the full dataframe throws an error: R tries to allocate a vector that is gigabytes in size. A reasonable alternative is to run the function on a smaller subset.
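As a rough sketch of the subsetting idea, the snippet below draws a random sample of rows and runs the density estimate on that sample only. It assumes the density function is `kde2d()` from the MASS package; the dataframe `df`, its columns `x` and `y`, the subset size of 10,000, and the grid size are all made up for illustration and are not from the original analysis.

```r
library(MASS)  # provides kde2d(); assumed to be the density function in question

set.seed(42)  # make the random subset reproducible

# Hypothetical stand-in for the real data: one million rows, two numeric columns.
df <- data.frame(x = rnorm(1e6), y = rnorm(1e6))

# Draw 10,000 distinct row indices at random (sample() defaults to no replacement).
n_sub <- 10000
idx <- sample(nrow(df), n_sub)
df_sub <- df[idx, ]

# Estimate the 2D density on the subset only; n sets the number of grid points per axis.
dens <- kde2d(df_sub$x, df_sub$y, n = 100)

# dens is a list with grid coordinates x and y and the density matrix z.
image(dens)
```

The subset size is a trade-off: a larger sample tracks the full data more closely but uses more memory, so it may be worth trying a few sizes and checking that the estimated density stops changing noticeably.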
