Workshop in Cape Town: Web Scraping with R
[This article was first published on Digital Age Economist, and kindly contributed to R-bloggers.]
Join Andrew Collier and Hanjo Odendaal for a workshop on using R for Web Scraping.
Who should attend?
This workshop is aimed at beginner and intermediate R users who want to learn more about using R for data acquisition and management, with a specific focus on web scraping.
What will you learn?
You will learn:
- data manipulation with `dplyr`, `tidyr` and `purrr`;
- tools for accessing the DOM;
- scraping static sites with `rvest`;
- scraping dynamic sites with `RSelenium`; and
- setting up an automated scraper in the cloud.
 
See programme below for further details.
| Where | Rise, Floor 5, Woodstock Exchange, 66 Albert Road, Woodstock, Cape Town |
|---|---|
| When | 14-15 June 2018 |
| Who | Andrew Collier and Hanjo Odendaal |
    
There are just 20 seats available. A 10% discount is available for groups of 4 or more people from a single organisation attending both days.
Email [email protected] if you have any questions about the workshop.
Programme
Day 1
- Motivating Example
- R and the tidyverse
  - Vectors, Lists and Data Frames
  - Loading data from a file
  - Manipulating Data Frames with `dplyr`
  - Pivoting with `tidyr`
  - Functional programming with `purrr`
- Introduction to scraping
  - Ethics
  - DOM
  - Developer Tools
  - CSS and XPath
  - `robots.txt` and site map
- Scraping a static site with `rvest`
  - What happens under the hood
  - What the hell is `curl`?
  - Assisted Assignment: Movie information from IMDB
 
 
Day 2
- Case Study: Investigating drug tests using `rvest`
- Interacting with APIs
  - Using XHR to find an API
  - Building wrappers around APIs
- Scraping a dynamic site with `RSelenium`
  - Why `RSelenium` is needed
  - Navigation around web-pages
  - Combining `RSelenium` with `rvest`
  - Useful JavaScript tools
  - Case Study
- Why
- Deploying a Scraper in the Cloud
  - Launching and connecting to an EC2 instance
  - Headless browsers
  - Automation with cron
 
 