Because of delays with my scholarship payment, if this post is useful to you I kindly ask a minimal donation on Buy Me a Coffee. It shall be used to continue my Open Source efforts. The full explanation is here: A Personal Message from an Open Source Contributor.
Motivation
I got this question: I followed your Selenium post and it does not work on Windows. How can I fix that?
The post in question is here. After testing on a Windows machine, I realised that the issue was related to the fact that newer Google Chrome versions (>119) do not provide ChromeDriver, the software that Selenium uses to control the browser, and do not work with the most recent ChromeDriver version you can download from Google.
Here is how to use Mozilla Firefox instead.
Required software
- Mozilla Firefox and GeckoDriver: web browser and remote control program
- RSelenium: R-Selenium integration
- rvest: HTML processing
- dplyr: to load the pipe operator (can be used later for data cleaning)
- purrr: iteration (i.e., repeated operations)
I installed Mozilla Firefox from the official website and followed the installer.
For GeckoDriver, I downloaded it from here for Windows 64-bit and saved “geckodriver.exe” to a new folder “C:”. Then, I had to add the folder to the PATH like this:
- Press Win + S
- Type “Environment variables”
- Open “Edit the system environment variables”.
- Click “Environment variables”.
- In “System variables”, find and select “Path”, then click “Edit”.
- Click “New” and add “C:” without quotes
- Click OK to save.
Then restart RStudio and close PowerShell if it is open. If GeckoDriver is not installed, the only symptom is this error message in R: “Unable to create new service geckodriverservice.”
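A quick way to confirm the PATH change from inside R is base R's Sys.which(), which looks up an executable on the PATH seen by the current session. This is only a minimal sketch, assuming the executable kept its default name "geckodriver"; the messages are my own wording.

```r
# Look up geckodriver on the PATH seen by this R session
# (returns "" when the executable is not found)
gecko_path <- Sys.which("geckodriver")

if (nzchar(gecko_path)) {
  message("GeckoDriver found at: ", gecko_path)
} else {
  message("GeckoDriver not found on the PATH; check the folder you added and restart RStudio.")
}
```

If the second message appears right after editing the PATH, the R session was probably started before the change, which is why the restart matters.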
I installed RSelenium from the R console:
if (!require(RSelenium)) install.packages("RSelenium") # or remotes::install_github("ropensci/RSelenium")
For the rest of the packages:
if (!require(rvest)) install.packages("rvest")
if (!require(dplyr)) install.packages("dplyr")
if (!require(purrr)) install.packages("purrr")
Running Selenium Server
I tried to start Selenium as described in the official guide and in the post linked above, and it did not work.
I also had to download Selenium Server, so I used this link and ran the following from a new PowerShell window:
cd Downloads
java -jar selenium-server-standalone-3.9.1.jar
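Before connecting from R, it can help to check that something is actually listening on the server port. The sketch below uses only base R; port_is_open() is a hypothetical helper name of mine, and port 4444 is assumed because it is the port passed to remoteDriver() in this post.

```r
# Hypothetical helper: TRUE if a TCP connection to host:port can be opened
port_is_open <- function(host = "localhost", port = 4444L, timeout = 2) {
  con <- tryCatch(
    suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = timeout)
    ),
    error = function(e) NULL  # connection refused or timed out
  )
  if (is.null(con)) return(FALSE)
  close(con)
  TRUE
}

# Assumed port: 4444, the same one used with remoteDriver() in this post
if (port_is_open(port = 4444L)) {
  message("Something is listening on port 4444.")
} else {
  message("Nothing is listening on port 4444; is the PowerShell window with Selenium Server still open?")
}
```

This only shows that the port accepts connections, not that the process behind it is Selenium Server.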
From RStudio (the same works from an R terminal), I could then control the browser:
library(RSelenium)
library(rvest)
library(dplyr)
library(purrr)

# connect to the Selenium Server started in PowerShell
rmDr <- remoteDriver(port = 4444L, browserName = "firefox")
rmDr$open(silent = TRUE)

# open the page in the Firefox window
url <- "https://pacha.dev/blog"
rmDr$navigate(url)
This should open a new Firefox window showing my blog. The rest of the steps are the same as in the previous post.
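As a rough illustration of how the rendered page moves from RSelenium to rvest, here is a minimal sketch. It is not the code from the previous post: extract_links() is a hypothetical helper of mine, and the commented lines assume the rmDr session from this post is still open.

```r
library(rvest)
library(dplyr)

# Hypothetical helper: return the href of every link in an HTML string
extract_links <- function(page_html) {
  page_html %>%
    read_html() %>%
    html_elements("a") %>%
    html_attr("href")
}

# With the browser session open, the rendered page can be passed in like this
# (getPageSource() returns a list whose first element is the HTML):
# page_html <- rmDr$getPageSource()[[1]]
# links <- extract_links(page_html)
# rmDr$close()

# Stand-alone check with a small HTML string, no browser needed
extract_links('<p><a href="https://pacha.dev/blog">blog</a></p>')
```

Keeping the parsing step in a function that takes plain HTML means it can be tried without a running browser, and purrr can later map it over several pages.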
I hope this is useful 🙂