Capturing Screenshots Programmatically With R
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
As part of our work documenting R-Universe, we’re adding screenshots of the interface to the documentation website. Taking screenshots manually could quickly become very cumbersome, especially as we expect they’ll need updating in future: we might want to change the universes we feature, the interface might improve yet again and therefore look slightly different. Therefore, we decided to opt for a programmatic approach. In this post we shall present our learnings from using the R packages chromote and magick to produce screenshots.
Main packages needed
The chromote R package
To take a screenshot programmatically, we need to somehow get control of a browser from a script. The chromote R package is an actively maintained wrapper for Chrome Remote Interface, authored by Winston Chang and Barrett Schloerke. With chromote, you can open a browser in the background, navigate to the page of your choice, interact with it, and capture screenshots. chromote powers the experimental live web-scraping in the rvest package and is also used in shinytest2.
The magick R package
Generally, Jeroen Ooms’ magick R package isn’t required to take screenshots, but we wanted to format the images by adding shadows. Therefore this tool for image manipulation was necessary.
A first screenshot
We start by initiating a chromote session of a standard screen size. We found the default screen size too narrow for our needs.
library("chromote") screen_width <- 1920 screen_height <- 1080 b <- ChromoteSession$new(height = screen_height, width = screen_width)
We created a function that wraps b$screenshot()
while still exposing
the parameters we need to tweak for some of the pages.
The chromote package, like Shiny, uses R6.
Our function also calls the handy magick::image_shadow()
for adding a shadow to the image.
screenshot <- function(b, img_path, selector = "html", cliprect = c(top = 0, left = 0, width = 1920, height = 1080), expand = NULL) { b$screenshot( img_path, selector = selector, delay = 2, cliprect = cliprect, expand = expand ) magick::image_read(img_path) |> magick::image_shadow() |> magick::image_write(img_path, quality = 100) }
We can now capture the search interface of R-Universe.
In our script, we’ve added a generous Sys.sleep()
call to ensure pages are properly loaded.
b$Page$navigate("https://r-universe.dev/search/") Sys.sleep(1) screenshot(b, "search.png")
This example captures the screenshot below.
Screenshots after filling a text input area
We also wanted to demonstrate searching for something. We followed the example in a gist to enter text into the search navbar1:
- we needed to find the search input area, whose idea is “search-input”, by a call to
b$DOM$querySelector()
; - then we focussed on that area through
b$DOM$focus()
; - finally we added text with
b$Input$insertText()
.
It was easier to reload the page than to try and delete the current text before doing a new search.
screenshot_search <- function(query, screen_width) { b$Page$navigate("https://r-universe.dev/search/") Sys.sleep(2) search_box <- b$DOM$querySelector(b$DOM$getDocument()$root$nodeId, "#search-input") b$DOM$focus(search_box$nodeId) b$Input$insertText(text = query) Sys.sleep(2) screenshot( b, sprintf("search-%s.png", snakecase::to_lower_camel_case(query)) ) } purrr::walk( c('"missing-data"', "author:jeroen json", "exports:tojson"), screenshot_search, screen_width = screen_width )
Screenshots after clicking on something
The search interface of R-universe suggests some advanced research fields if one clicks on the arrow-down button near the search button. Through a GitHub search for code we found an example of clicking in the source code of rvest that we were able to adapt.
The code finds the arrow-down button through its class rather than id so it might be a bit brittle over time. Once the class is found, we retrieve its “box model” which is a bunch of coordinates, and then calculate the centre coordinates. After that, a “click” which is actually a three-step action: moving the mouse, pressing the mouse, releasing the mouse.
# Searching, advanced fields ---- b$Page$navigate("https://r-universe.dev/search/") Sys.sleep(1) search_info <- b$DOM$querySelector( b$DOM$getDocument()$root$nodeId, "button.btn.btn-outline-secondary.dropdown-toggle.dropdown-toggle-split" ) quads <- b$DOM$getBoxModel(search_info$nodeId) content_quad <- as.numeric(quads$model$content) center_x <- mean(content_quad[c(1, 3, 5, 7)]) center_y <- mean(content_quad[c(2, 4, 6, 8)]) b$Input$dispatchMouseEvent( type = "mouseMoved", x = center_x, y = center_y, ) b$Input$dispatchMouseEvent( type = "mousePressed", x = center_x, y = center_y, button = "left", clickCount = 1 ) b$Input$dispatchMouseEvent( type = "mouseReleased", x = center_x, y = center_y, button = "left", clickCount = 1 ) Sys.sleep(2) screenshot(b, "search-advanced.png")
Other gotchas about using chromote
The package pages on R-universe are quite exhaustive.
We needed to capture screenshots of different sections.
When using an URL with a fragment, for instance https://r-spatial.r-universe.dev/sf#citation
,
we got the exact same screenshot as for https://r-spatial.r-universe.dev/sf
.
We therefore had to use the cliprect
argument of the screenshot method.
Figuring out we needed it was a first gotcha.
A second gotcha is that cliprect
is “a (unnamed) vector/list with left, top, width and height”2:
and it’s the order, not the names, that matters.
Conclusion
In this post we explained how we used the chromote and magick R packages to produce screenshots for the documentation website of R-universe. Find the current version of our script. Do you use sometimes chromote or a similar programmatic browser interface?
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.