Daily Stock Gainers Automated Web Scraping in R with GitHub Actions

[This article was first published on r-bloggers on Programming with R, and kindly contributed to R-bloggers.]

In this R tutorial, we'll learn how to schedule an R script as a cron job using GitHub Actions. Thanks to GitHub Actions, you don't need a dedicated server for this kind of automation and scheduled task. The same approach can be extended to automated tweets, other automated social media posts, or daily data extraction of any sort.

In this example, we'll use a script that scrapes the daily top gainers of the Nifty50 (an Indian stock exchange index) and stores them as a CSV file, which can then be used for data analytics on those stocks.
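As a rough illustration of what such a script can look like, here is a minimal sketch. It is not the code from the author's repository: the HTML table is inlined with `rvest::minimal_html()` so that the example runs offline, and the table id, column names, values, and output file name are all hypothetical. A real script would call `read_html()` on the page that lists the top gainers instead.

```r
# Minimal sketch of a scrape-and-save script.
# Assumptions: the inline HTML stands in for the real page; the table id,
# column names, values and output file name are made up for illustration.
library(rvest)
library(janitor)
library(readr)

# Stand-in for read_html("<URL of the page listing the top gainers>")
page <- minimal_html('
  <table id="gainers">
    <tr><th>Symbol</th><th>LTP</th><th>Pct Change</th></tr>
    <tr><td>AAA</td><td>101.5</td><td>2.4</td></tr>
    <tr><td>BBB</td><td>250.0</td><td>1.9</td></tr>
  </table>')

# Pull the table out of the page and normalise the column names
gainers <- page %>%
  html_element("table#gainers") %>%
  html_table() %>%
  clean_names()

# Record the extraction date and write one CSV per day into data/
gainers$scraped_on <- Sys.Date()
dir.create("data", showWarnings = FALSE)
out_file <- file.path("data", paste0("top_gainers_", Sys.Date(), ".csv"))
write_csv(gainers, out_file)
```

Writing into a `data/` folder fits the workflow's `git add data/*` step, which commits whatever files the script produced on that run.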


GitHub Actions with R

GitHub Actions workflows are usually triggered by repository events such as a pull request or the creation of an issue, but the workflow's YAML file can instead trigger a script on a schedule using cron syntax.

Here’s the main.yml file used for the GitHub Action. Its cron expression, `0 13 * * *`, runs the job once a day at 13:00 UTC.

name: nifty50scrape

# Controls when the action will run.
on:
  schedule:
    - cron:  '0 13 * * *'


jobs: 
  autoscrape:
    # The type of runner that the job will run on
    runs-on: macos-latest

    # Load repo and install R
    steps:
    - uses: actions/checkout@master
    - uses: r-lib/actions/setup-r@master

    # Install the R packages used by the script
    - name: Install packages
      run: |
        R -e 'install.packages("tidyverse")'
        R -e 'install.packages("janitor")'
        R -e 'install.packages("rvest")'
    # Run R script
    - name: Scrape
      run: Rscript nifty50_scraping.R
      
    # Add new files in data folder, commit along with other modified files, push
    - name: Commit files
      run: |
        git config --local user.name actions-user
        git config --local user.email "[email protected]"
        git add data/*
        git commit -am "GH ACTION Headlines $(date)"
        git push origin main
      env:
        REPO_KEY: ${{secrets.GITHUB_TOKEN}}
        username: github-actions

See this repo for more details on the code used for scraping: https://github.com/amrrs/scrape-automation

For more details on GitHub Actions for R scripts, refer to this rOpenSci book: https://ropenscilabs.github.io/actions_sandbox/
