RSelenium and Java Heap Space

[This article was first published on R on datawookie, and kindly contributed to R-bloggers].

I’m in the process of deploying a scraper on a DigitalOcean instance. The scraper uses RSelenium with the PhantomJS browser. I ran into a problem, though: it worked flawlessly on my local machine, but on the remote instance it broke with the following error:

Selenium message:Java heap space

Error:   Summary: UnknownError
   Detail: An unknown server-side error occurred while processing the command.
   class: java.lang.OutOfMemoryError
   Further Details: run errorDetails method
Execution halted

Clearly a Java memory issue.

Since the Selenium server is being launched from within R, I did not have direct access to the Java command-line options. However, setting the _JAVA_OPTIONS environment variable, which Oracle and OpenJDK JVMs read at startup, to raise the maximum heap size to 1 GB resolved the problem.

$ export _JAVA_OPTIONS="-Xmx1g"
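
The export only helps if it happens in the same shell session that later starts R, because the Java process inherits its environment from R, which inherits it from the shell. Here is a minimal sketch for checking that the setting is visible to child processes and, if java is on the PATH, what maximum heap the JVM actually picks up. The -XX:+PrintFlagsFinal check assumes a HotSpot-based JVM (Oracle JDK or OpenJDK); other JVMs may not support it.

```shell
#!/bin/sh
# Ask for a 1 GB maximum heap, as in the post.
export _JAVA_OPTIONS="-Xmx1g"

# Exported variables are inherited by child processes, so R and the
# Selenium server it starts should see the same value as this subshell.
sh -c 'echo "child sees: $_JAVA_OPTIONS"'

# If java is available, ask the JVM which maximum heap size it will use.
# This flag is HotSpot-specific, so fall back to a message if it fails.
if command -v java >/dev/null 2>&1; then
    java -XX:+PrintFlagsFinal -version 2>/dev/null | grep -w MaxHeapSize \
        || echo "could not read MaxHeapSize from this JVM"
fi
```

To avoid changing the environment for the whole session, the variable can also be set for a single command by prefixing it, for example `_JAVA_OPTIONS="-Xmx1g" Rscript scraper.R`, where scraper.R is a hypothetical name for the script that launches the scraper.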

The scraper is now chugging along happily and I’m moving on with my day.
