Workshop in Cape Town: Web Scraping with R

Posted on May 22, 2018 by Digital Age Economist on Digital Age Economist in R bloggers

[This article was first published on Digital Age Economist on Digital Age Economist, and kindly contributed to R-bloggers].

Join Andrew Collier and Hanjo Odendaal for a workshop on using R for Web Scraping.

Who should attend?

This workshop is aimed at beginner and intermediate R users who want to learn more about using R for data acquisition and management, with a specific focus on web scraping.

What will you learn?

You will learn:

data manipulation with dplyr, tidyr and purrr;
tools for accessing the DOM;
scraping static sites with rvest;
scraping dynamic sites with RSelenium; and
setting up an automated scraper in the cloud.

See the programme below for further details.

Where: Rise, Floor 5, Woodstock Exchange, 66 Albert Road, Woodstock, Cape Town
When: 14-15 June 2018
Who: Andrew Collier and Hanjo Odendaal

There are just 20 seats available. A 10% discount is available for groups of 4 or more people from a single organisation attending both days.

Email [email protected] if you have any questions about the workshop.

Programme

Day 1

Motivating Example
R and the tidyverse
- Vectors, Lists and Data Frames
- Loading data from a file
- Manipulating Data Frames with dplyr
- Pivoting with tidyr
- Functional programming with purrr
Introduction to scraping
- Ethics
- DOM
- Developer Tools
- CSS and XPath
- robots.txt and site map
Scraping a static site with rvest
- What happens under the hood
- What the hell is curl?
- Assisted Assignment: Movie information from IMDB

Day 2

Case Study: Investigating drug tests using rvest
Interacting with APIs
- Using XHR to find an API
- Building wrappers around APIs
Scraping a dynamic site with RSelenium
- Why RSelenium is needed
- Navigating around web pages
- Combining RSelenium with rvest
- Useful JavaScript tools
- Case Study
Deploying a Scraper in the Cloud
- Launching and connecting to an EC2 instance
- Headless browsers
- Automation with cron
