Running GitHub Actions Sequentially

[This article was first published on stevenmortimer.com, and kindly contributed to R-bloggers].

TL;DR: If you need sequential execution in GitHub Actions, consider these solutions:

  • Sequential steps: Steps within a job are always executed sequentially!

  • Sequential jobs: Set max-parallel: 1 within the jobs.<job_id>.strategy element of the workflow (this caps how many jobs from that job's matrix run at once).

  • Sequential workflows: Use a repository_dispatch API call at the end of the workflow to trigger the next workflow (code available in the An Example section below).


The issue

Ever since the generally available release of GitHub Actions in November 2019,
many R packages developed on GitHub seem to have switched from Travis CI or
another continuous integration service to GitHub Actions (GHA). It seems
like a great service to have integrated as closely as possible with your codebase,
but the product is still under active development. There is a dedicated tag in
the GitHub Support Community for asking questions and browsing answers. However,
a few different questions there (here, here, and here) suggest that folks, myself
included, are still grappling with GitHub’s design decision to execute all
workflows and jobs in a repository in parallel.

As background, it is important to note the basic lingo of GitHub Actions. A
repository can contain one or more ‘workflows’ defined by YAML files located in
the .github/workflows folder at the top level of the repository. Each
‘workflow’ can contain one or more ‘jobs’ that execute a series of ‘steps’.
By default, all steps in a single job execute sequentially. If you’re trying to
limit the number of parallel ‘jobs’, then you can set a limit of 1 by setting
max-parallel: 1 within the jobs.<job_id>.strategy element of the workflow YAML;
this caps how many of the jobs generated from that job’s matrix run at once.
However, the issue with multiple jobs in a single workflow is that if one job
fails out of 10 jobs in your workflow, then (at the time of writing) you’ll have
to re-run all 10 jobs for the workflow status to be a success. For this reason,
I have decided to split jobs across different workflows. This way, I can re-run
one individually, if needed. I will refer to ‘jobs’ and ‘workflows’
interchangeably throughout this article because you can have workflows
that execute one job and one job only.
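
The max-parallel setting described above can be sketched as follows. This is a
minimal illustration, not a workflow from the original project: the workflow
name, the job name check, and the matrix values are hypothetical, and the single
step only echoes which matrix entry is running.

```yaml
name: Sequential matrix jobs   # hypothetical workflow name

on: push

jobs:
  check:                       # hypothetical job id
    runs-on: ubuntu-latest
    strategy:
      # Run the jobs generated by the matrix one at a time
      max-parallel: 1
      matrix:
        suite: [accounts, contacts, leads]   # hypothetical values
    steps:
      - name: Run one suite
        run: echo "Running suite ${{ matrix.suite }}"
```

With this setting, the three matrix jobs are queued one after another rather
than started together, but they still belong to one workflow run and share its
overall status.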

Why do I need sequential workflows/jobs?

The reason I need this functionality is that my workflows interact with a 3rd-party
service (Salesforce) and one workflow might affect the results of another
workflow if both access the service simultaneously. I also want to prevent other
workflows from executing if I can’t get the first one to succeed, since the issue
could occur across all workflows. Catching issues early, before other workflows
execute, reduces the total number of API calls to Salesforce, which are capped
in a 24-hour period.


An Example

You can set up sequential workflows using a repository_dispatch event in 4
easy steps:

  1. Step 1 – Create a Personal Access Token (PAT)
  2. Step 2 – Add the PAT as an actions secret in the repository
  3. Step 3 – Add the repository_dispatch event to Workflow 1
  4. Step 4 – Add the repository_dispatch event as trigger in Workflow 2 YAML

For context, a required element in every workflow is the name of the
GitHub event that triggers the workflow. For example, on: pull_request means
“execute this workflow every time a pull request is opened or updated”. If you
want to run workflows sequentially, then you just need to issue a specific event
type that lets the next workflow know when to begin. You could try to write your
own solution that uses the GitHub Actions APIs to list the workflows, jobs, or
check-suites and find out which ones have failed, but the easy alternative is to
use a repository_dispatch event.

“You can use this endpoint to trigger a webhook event called repository_dispatch when you want activity that happens outside of GitHub to trigger a GitHub Actions workflow or GitHub App webhook.”

https://developer.github.com/v3/repos/#create-a-repository-dispatch-event

Workflows aren’t aware of other workflows, so this event webhook is perfect to trigger,
or “daisy-chain”, separate workflows. You could execute the event via a curl command
from the shell in a new job step, but I recommend using the Repository Dispatch
action that Peter Evans has released in the GitHub Marketplace,
which makes it dirt simple to execute a repository dispatch event from a workflow.
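
For comparison, the curl alternative mentioned above might look like the step
below. This is a sketch under assumptions rather than the author's code: it
posts to the REST endpoint for creating a repository dispatch event, reads the
token from a secret named REPO_GHA_PAT (the name recommended in Step 2), and
uses the event type trigger-workflow-2 from the example that follows. The
environment variable name GHA_PAT is made up for illustration.

```yaml
- name: Trigger next workflow with curl
  if: success()
  env:
    # Hypothetical variable name; the secret is the PAT created in Steps 1 and 2
    GHA_PAT: ${{ secrets.REPO_GHA_PAT }}
  run: |
    # GITHUB_REPOSITORY, GITHUB_REF and GITHUB_SHA are set by the runner
    curl --fail --silent --show-error -X POST \
      -H "Accept: application/vnd.github+json" \
      -H "Authorization: Bearer ${GHA_PAT}" \
      "https://api.github.com/repos/${GITHUB_REPOSITORY}/dispatches" \
      -d "{\"event_type\": \"trigger-workflow-2\", \"client_payload\": {\"ref\": \"${GITHUB_REF}\", \"sha\": \"${GITHUB_SHA}\"}}"
```

The --fail flag makes the step fail when the API returns an error status, so a
rejected token does not silently break the chain.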

Step 1 – Create a Personal Access Token (PAT)

Follow GitHub’s instructions here. When it comes time to select the scopes, or
permissions, you’d like to grant the token, check "repo" if you’re on a private
repository or "public_repo" if you’re on a public one.

Step 2 – Add the PAT as an actions secret in the repository

Follow GitHub’s instructions here. I recommend naming the secret REPO_GHA_PAT.

Step 3 – Add the repository_dispatch event to Workflow 1

This step is where you update Workflow 1’s YAML file. For this example, think of
“Workflow 1”, and the job(s) contained within it, as what you’d like to execute
first, and “Workflow 2” as the workflow you’d like to execute after “Workflow 1”.
Add the following as the last step in the Workflow 1 YAML file:

- name: Trigger next workflow
  if: success()
  uses: peter-evans/repository-dispatch@v1  # version tag assumed; pin the release you have verified
  with:
    token: ${{ secrets.REPO_GHA_PAT }}
    repository: ${{ github.repository }}
    event-type: trigger-workflow-2
    client-payload: '{"ref": "${{ github.ref }}", "sha": "${{ github.sha }}"}'

In the example above, you’ll notice the line if: success(). This means that we
run the step that triggers Workflow 2 only if all the prior steps in the job
were successful. You should also notice the line that passes data from
Workflow 1 to Workflow 2:

client-payload: '{"ref": "${{ github.ref }}", "sha": "${{ github.sha }}"}'

In this case it is telling Workflow 2 the branch and commit hash to check out and
use, so that we know Workflows 1 and 2 are using the exact same code. Remember,
it’s possible to have several runs of Workflow 1 in progress because
you’ve pushed code or triggered them in some way, and you want to make sure each
run triggers Workflow 2 against the same code it ran against.

Step 4 – Add the repository_dispatch event as trigger in Workflow 2 YAML

This step is where you update Workflow 2’s YAML file. First, add your event name as
the type of repository dispatch that should trigger Workflow 2. This name
must exactly match the event-type you specified in Step 3 (the last step of
Workflow 1). In our case we named the event that triggers Workflow 2
event-type: trigger-workflow-2. It could be called anything you wish. The
important part is using the same name in the types key of the Workflow 2 YAML
file. It should be included within square brackets and without quotes as shown below:

name: Workflow 2

on:
  repository_dispatch:
    types: [trigger-workflow-2]

Second, use the client payload data from the event to check out the same code. You
can do this by modifying the checkout step, usually one of the first steps in your job.

    steps:
      - uses: actions/checkout@v2  # version tag assumed; pin the release you have verified
        with:
          ref: ${{ github.event.client_payload.sha }}
      
      # ... (other steps)
      
      - uses: r-lib/actions/[email protected]
      
      - uses: r-lib/actions/[email protected]

That’s it! If you have more than 2 workflows, then simply add a unique trigger as
the last step in each workflow that calls the next. For the first workflow, I typically
trigger based on a push to a certain branch like this:

name: Workflow 1

on:
  push:
    branches: main


If you would like to see a complete example in action, then feel free to browse the .github/workflows folder
for the {salesforcer} package.


Considerations

If you need your jobs to execute sequentially but you want them to all still
run, even if some fail, then just change the if: statement mentioned above in Step 3
to if: always(). The only reason I did not do this is that re-running a failed
workflow would trigger the following workflow again, even if that workflow had
already succeeded, which I didn’t want. Avoiding that may require some more
advanced tricks/hacks, such as only triggering the next workflow if its latest
run for the same ref and sha does not have a ‘completed’ status.
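
As a sketch of that variation, the trigger step from Step 3 with the condition
changed would look roughly like this. Everything except the if: line is the
same as in Step 3, and the action version tag is an assumption.

```yaml
- name: Trigger next workflow
  # Dispatch the next workflow whether or not earlier steps succeeded
  if: always()
  uses: peter-evans/repository-dispatch@v3   # version tag is an assumption
  with:
    token: ${{ secrets.REPO_GHA_PAT }}
    repository: ${{ github.repository }}
    event-type: trigger-workflow-2
    client-payload: '{"ref": "${{ github.ref }}", "sha": "${{ github.sha }}"}'
```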

Another consideration is the cost to execute GitHub Actions in private repositories.
It’s true that many projects will not need to enforce sequential workflows
because the tests, examples, checks or dependencies do not affect other
workflows. However, GitHub Actions is only free for public repositories. You
may have private repositories and want to limit the amount of processing time so
you can stay within the free tier (less than 2,000 minutes per month).
Sequential execution can save minutes by preventing downstream workflows from
executing when upstream workflows fail.

