htmlunitjars Updated to 2.34.0

February 28, 2019

[This article was first published on R –, and kindly contributed to R-bloggers].

The in-dev htmlunit package, for JavaScript-“enabled” web scraping without the need for Selenium, Splash or headless Chrome, relies on the HtmlUnit library. Said library just released version 2.34.0 with a wide array of changes that should make it possible to scrape more gnarly JavaScript-“enabled” sites. The Chrome emulation is now also on par with the Chrome 72 series (my Chrome beta is at 73.0.3683.56, so it’s very close to current).

In reality, the update was to the htmlunitjars package, where the main project JAR and its dependent JARs all received a refresh.

The README and tests were all re-run on both packages and Travis is happy.

If you’ve got a working rJava installation (aye, it’s 2019 and that’s still “a thing”) then you can just do:

install.packages(c("htmlunitjars", "htmlunit"), repos = "")

to get them installed and start playing with the DSL or work directly with the Java classes.
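For the "work directly with the Java classes" route, here is a minimal sketch of driving HtmlUnit through rJava. The helper name `fetch_rendered_html`, its `wait_ms` argument and the example URL are hypothetical and chosen only for illustration; the Java classes and methods (`WebClient`, `BrowserVersion.CHROME`, `getPage()`, `waitForBackgroundJavaScript()`, `asXml()`) belong to HtmlUnit itself. It assumes a working rJava setup and that loading htmlunitjars puts the bundled JARs on the class path.

```r
# A sketch, not the package's own DSL: fetch a page with HtmlUnit's Chrome
# emulation and return the DOM as a string after scripts have had time to run.
fetch_rendered_html <- function(url, wait_ms = 2000L) {
  # Attach here so that defining the function does not itself start a JVM.
  library(rJava)
  library(htmlunitjars)  # assumed to add the HtmlUnit JARs to the class path

  browsers <- J("com.gargoylesoftware.htmlunit.BrowserVersion")
  wc <- new(J("com.gargoylesoftware.htmlunit.WebClient"), browsers$CHROME)
  on.exit(wc$close(), add = TRUE)  # release the client's resources on exit

  # Many real-world sites ship scripts that throw or resources that 404;
  # keep going rather than aborting the whole fetch.
  opts <- wc$getOptions()
  opts$setThrowExceptionOnScriptError(FALSE)
  opts$setThrowExceptionOnFailingStatusCode(FALSE)

  pg <- wc$getPage(url)

  # Give background JavaScript up to `wait_ms` milliseconds to finish.
  wc$waitForBackgroundJavaScript(.jlong(wait_ms))

  pg$asXml()
}

# Hypothetical usage (needs network access and a JVM):
# html <- fetch_rendered_html("https://example.com/")
# doc  <- xml2::read_html(html)
```

The returned string can be handed to xml2 or rvest for the usual node selection, which keeps the Java side limited to fetching and rendering.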


As usual, use your preferred social coding site to log feature requests or problems.
