Relenium, Selenium for R. A new tool for webscraping.

January 4, 2014

(This article was first published on RUG Barcelona » Rbloggers, and kindly contributed to R-bloggers)

 

Two members of RugBcn have developed an R package that eases the path to web scraping. Among the existing packages, we highlight the well-known RCurl and XML. Both are sufficient for most situations, but they fall short when some JavaScript stands between the user and the information, for instance when the only way to reach the desired page is by clicking buttons, selecting from menus, and so on.
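To make the limitation concrete, here is a minimal sketch using only the XML package. The page is a made-up inline string (not a real site), with one paragraph served directly in the HTML and one `div` that a script would fill in once a browser ran it. The element ids and texts are assumptions chosen for illustration.

```r
library(XML)

# Hypothetical page: the <div> is empty in the served HTML and would only
# be filled in by the script when a real browser executes it.
page <- '<html><body>
<p id="static">Served directly in the HTML</p>
<div id="dynamic"></div>
<script>document.getElementById("dynamic").innerHTML = "Added by JavaScript";</script>
</body></html>'

# Parse the raw HTML, as one would after downloading it with RCurl.
doc <- htmlParse(page, asText = TRUE)

# Content present in the served HTML is easy to extract with XPath.
static_text <- xpathSApply(doc, "//p[@id='static']", xmlValue)

# The parser never runs the script, so the <div> comes back without
# the text the script would have inserted.
dynamic_text <- xpathSApply(doc, "//div[@id='dynamic']", xmlValue)

print(static_text)
print(dynamic_text)
```

A static parser only ever sees the markup as served, which is why pages that build their content through scripts or user interaction need a tool that drives a real browser.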

Relenium brings Selenium, a Java module (also implemented in many other languages) traditionally used for web testing, into R via the rJava package. It is intuitive to use, since it reproduces the actions a human would perform on a web page. The project's webpage can be found here. There is an example explaining in detail how to use it. The package is still in development, so any comments or suggestions are welcome.

We hope you enjoy it.

Lluis Ramon and Aleix Ruiz de Villa,

RugBcn (Barcelona R Users Group)

