Running GitHub Actions Sequentially

[This article was first published on stevenmortimer.com, and kindly contributed to R-bloggers].

TL;DR: If you need sequential execution in GitHub Actions consider these solutions:

  • Sequential steps: Steps within a job are always executed sequentially!

  • Sequential jobs: Set max-parallel: 1 under jobs.<job_id>.strategy to run the jobs of a matrix one at a time, or use the needs keyword to make one job wait for another.

  • Sequential workflows: Use a repository_dispatch API call at the end of the workflow to trigger the next workflow (code available in the An Example section below).


The issue

Ever since GitHub Actions became generally available in November 2019, it seems like many R packages developed on GitHub have switched from Travis CI or another continuous integration service to GitHub Actions (GHA). It seems like a great service to have integrated as closely as possible to your codebase, but the product is still under active development. There is a dedicated tag in the GitHub Support Community to ask questions and browse answers. However, several questions posted there suggest that folks, myself included, are still grappling with GitHub’s design decision to execute all workflows and jobs in a repository in parallel by default.

As background, it helps to know the basic lingo of GitHub Actions. A repository can contain one or more ‘workflows’ defined by YAML files located in the .github/workflows folder at the top level of the repository. Each ‘workflow’ can contain one or more ‘jobs’ that execute a series of ‘steps’. By default, all steps in a single job execute sequentially. If you’re trying to limit the number of parallel ‘jobs’, then you can set max-parallel: 1 under the jobs.<job_id>.strategy element of the workflow YAML, which makes the jobs generated by that job’s matrix run one at a time; to make separate jobs wait for one another, use the needs keyword. However, the issue with multiple jobs in a single workflow is that, at the time of writing, if one job fails out of 10 jobs in your workflow, then you’ll have to re-run all 10 jobs for the workflow status to be a success. For this reason, I have decided to split jobs across different workflows. This way, I can re-run one individually, if needed. I will refer to ‘jobs’ and ‘workflows’ interchangeably throughout this article because you can have workflows that execute one job and one job only.
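
As a rough sketch of the two in-workflow options, the fragment below limits a matrix job to one run at a time with max-parallel: 1 and then uses needs to make a second job wait for the first. The workflow name, the job names (check, report), and the matrix values are hypothetical and only there for illustration.

```yaml
name: Sequential jobs sketch

on: push

jobs:
  # Hypothetical matrix job: three variants, but only one runs at a time.
  check:
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 1
      matrix:
        variant: [a, b, c]
    steps:
      - uses: actions/checkout@v2
      - run: echo "Checking variant ${{ matrix.variant }}"

  # Hypothetical follow-up job: starts only after every "check" job succeeds.
  report:
    needs: check
    runs-on: ubuntu-latest
    steps:
      - run: echo "All checks finished"
```

Note that max-parallel only caps how many matrix jobs run at once; it does not promise a particular order among them.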

Why do I need sequential workflows/jobs?

The reason I need this functionality is that my workflows interact with a third-party service (Salesforce), and one workflow might affect the results of another if both access the service simultaneously. I also want to prevent other workflows from executing if I can’t get the first one to succeed, since the issue could occur across all workflows. Catching issues early, before other workflows execute, reduces the total number of API calls to Salesforce, which are capped in a 24-hour period.


An Example

You can set up sequential workflows using a repository_dispatch action in 4 easy steps:

  1. Step 1 – Create a Personal Access Token (PAT)
  2. Step 2 – Add the PAT as an actions secret in the repository
  3. Step 3 – Add the repository_dispatch event to Workflow 1
  4. Step 4 – Add the repository_dispatch event as trigger in Workflow 2 YAML

For context, a required element in every workflow is the name of the GitHub event that triggers the workflow. For example, on: pull_request means “execute this workflow every time a pull request is opened or updated”. If you want to run workflows sequentially, then you just need to issue a specific event type that lets the next workflow know when to begin. You could try to write your own solution that uses the GitHub Actions APIs to list the workflows, jobs, or check-suites and find out which ones have failed or not, but the easy alternative is to use a repository_dispatch event.

“You can use this endpoint to trigger a webhook event called repository_dispatch when you want activity that happens outside of GitHub to trigger a GitHub Actions workflow or GitHub App webhook.”

https://developer.github.com/v3/repos/#create-a-repository-dispatch-event

Workflows aren’t aware of other workflows, so this event webhook is perfect to trigger, or “daisy-chain”, separate workflows. You could execute the event via a curl command from the shell in a new job step, but I recommend using the Repository Dispatch action that Peter Evans has released in the GitHub Marketplace, which makes it dirt simple to execute a repository dispatch event from a workflow.
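
For comparison, here is a minimal sketch of the curl alternative, written as a workflow step. It assumes a Linux runner with bash and curl available, and it reuses the secret name (REPO_GHA_PAT) and event type (trigger-workflow-2) from the steps in this example. The GH_PAT environment variable name is my own choice for illustration.

```yaml
- name: Trigger next workflow via the REST API
  if: success()
  env:
    # Hypothetical variable name; it holds the PAT stored as a repository secret.
    GH_PAT: ${{ secrets.REPO_GHA_PAT }}
  run: |
    # POST to the "create a repository dispatch event" endpoint.
    # GITHUB_REPOSITORY, GITHUB_REF and GITHUB_SHA are default runner variables.
    curl --fail --silent --show-error \
      -X POST \
      -H "Accept: application/vnd.github+json" \
      -H "Authorization: Bearer ${GH_PAT}" \
      "https://api.github.com/repos/${GITHUB_REPOSITORY}/dispatches" \
      -d "{\"event_type\": \"trigger-workflow-2\", \"client_payload\": {\"ref\": \"${GITHUB_REF}\", \"sha\": \"${GITHUB_SHA}\"}}"
```

The hand-escaped JSON is the main reason the marketplace action is the more comfortable choice.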

Step 1 – Create a Personal Access Token (PAT)

Follow GitHub’s instructions here. When it comes time to select the scopes, or permissions, to grant the token, check "repo" if you’re on a private repository or "public_repo" if you’re on a public one.

Step 2 – Add the PAT as an actions secret in the repository

Follow GitHub’s instructions here. I recommend naming the secret REPO_GHA_PAT.

Step 3 – Add the repository_dispatch event to Workflow 1

This step is where you update Workflow 1’s YAML file. For this example, think of “Workflow 1” as the workflow (and the jobs it contains) that you’d like to execute first, and “Workflow 2” as the workflow you’d like to execute after it. Add the following as the last step of the job in Workflow 1’s YAML file:

- name: Trigger next workflow
  if: success()
  uses: peter-evans/repository-dispatch@v1
  with:
    token: ${{ secrets.REPO_GHA_PAT }}
    repository: ${{ github.repository }}
    event-type: trigger-workflow-2
    client-payload: '{"ref": "${{ github.ref }}", "sha": "${{ github.sha }}"}'

In the example above, you’ll notice the line if: success(). This means the step that triggers Workflow 2 runs only if all the prior steps in the job were successful. You should also notice the line that passes data from Workflow 1 to Workflow 2:

client-payload: '{"ref": "${{ github.ref }}", "sha": "${{ github.sha }}"}'

In this case, it tells Workflow 2 the branch and commit hash to check out so that we know Workflows 1 and 2 are using exactly the same code. Remember, it’s possible to have several runs of Workflow 1 in progress because you’ve pushed code or triggered them in some other way, and you want to make sure each one triggers Workflow 2 against the same code it ran on.

Step 4 – Add the repository_dispatch event as trigger in Workflow 2 YAML

This step is where you update Workflow 2’s YAML file. First, add your event name as the type of repository dispatch that should trigger Workflow 2. This name must exactly match the event-type you specified in Step 3 (the last step of Workflow 1). In our case, we named the event trigger-workflow-2. You can call it anything you wish; the important part is using the same name in the types key of the Workflow 2 YAML file. List it within square brackets, as shown below:

name: Workflow 2

on:
  repository_dispatch:
    types: [trigger-workflow-2]

Second, use the client payload data from the event to check out the same code. You can do this by modifying the checkout step, usually one of the first steps in your job.

    steps:
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.event.client_payload.sha }}
      
      # ... (other steps)
      
      - uses: r-lib/actions/setup-r@master
      
      - uses: r-lib/actions/setup-pandoc@master
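
Putting the two pieces of this step together, a complete Workflow 2 file might look roughly like the sketch below. The job name and the "Show payload" step are hypothetical additions for illustration; the trigger and the checkout step are the ones described above.

```yaml
name: Workflow 2

on:
  repository_dispatch:
    types: [trigger-workflow-2]

jobs:
  # Hypothetical job name.
  run-after-workflow-1:
    runs-on: ubuntu-latest
    steps:
      # Check out the exact commit that Workflow 1 ran against.
      - uses: actions/checkout@v2
        with:
          ref: ${{ github.event.client_payload.sha }}

      # Print the payload values so the run log shows what was received.
      - name: Show payload
        run: |
          echo "ref: ${{ github.event.client_payload.ref }}"
          echo "sha: ${{ github.event.client_payload.sha }}"
```

One thing to keep in mind is that a repository_dispatch event only starts a workflow whose file exists on the repository's default branch.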

That’s it! If you have more than 2 workflows, then simply add a unique trigger as the last step in each workflow that calls the next. For the first workflow, I typically trigger based on a push to a certain branch like this:

name: Workflow 1

on:
  push:
    branches: [main]
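
For a chain of three or more workflows, the middle links need to pass the payload along. The sketch below is one way the last step of Workflow 2 could trigger a hypothetical Workflow 3; the event type trigger-workflow-3 is an assumed name. In a run started by repository_dispatch, github.ref and github.sha refer to the default branch rather than to the commit Workflow 1 ran on, so the step forwards the received payload instead.

```yaml
# Last step of Workflow 2, acting as the middle link in a three-workflow chain.
- name: Trigger Workflow 3
  if: success()
  uses: peter-evans/repository-dispatch@v1
  with:
    token: ${{ secrets.REPO_GHA_PAT }}
    repository: ${{ github.repository }}
    # Assumed event name; Workflow 3 must list it under repository_dispatch types.
    event-type: trigger-workflow-3
    # Forward the ref and sha received from Workflow 1 unchanged.
    client-payload: '{"ref": "${{ github.event.client_payload.ref }}", "sha": "${{ github.event.client_payload.sha }}"}'
```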


If you would like to see a complete example in action, then feel free to browse the .github/workflows folder for the {salesforcer} package.


Considerations

If you need your workflows to execute sequentially but want them all to run even if some fail, then just change the if: statement mentioned above in Step 3 to if: always(). The only reason I did not do this is that even if the following workflow is successful, it will get triggered again when we re-run the failed workflow, which I didn’t want. To avoid that re-trigger, you may need some more advanced tricks/hacks to only execute if the next workflow has the same ref and sha and the latest run does not have a ‘completed’ status.
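
As a small sketch of that variation, the trigger step from Step 3 would change only in its if: line:

```yaml
- name: Trigger next workflow
  # Run this step even if an earlier step in the job failed.
  if: always()
  uses: peter-evans/repository-dispatch@v1
  with:
    token: ${{ secrets.REPO_GHA_PAT }}
    repository: ${{ github.repository }}
    event-type: trigger-workflow-2
    client-payload: '{"ref": "${{ github.ref }}", "sha": "${{ github.sha }}"}'
```

Because always() also fires when a run is cancelled, the condition !cancelled() may be a better fit if cancelled runs should not continue the chain.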

Another consideration is the cost of running GitHub Actions in private repositories. It’s true that many projects will not need to enforce sequential workflows because the tests, examples, checks or dependencies do not affect other workflows. However, GitHub Actions is only free for public repositories. You may have private repositories and want to limit the amount of processing time so you can stay within the free tier (less than 2,000 minutes per month). Sequential execution can prevent downstream workflows from executing if upstream workflows fail.

