Automatically rerun your GitHub workflow after failure

Sometimes your GitHub Actions workflow is moody and fails for random reasons. Let's see how you can automatically rerun it to save you time.
Gao, Founder
July 05, 2024 (4 min read)

Introduction

We love GitHub Actions. It is powerful and offers a generous free tier.

As our project grows, we have added more and more integration tests to run in our GitHub Actions workflow. Like other cool projects, we run our tests for every pull request, push, and release.

However, due to the nature of integration tests, sometimes our workflow fails for random reasons. It could be a network issue, a browser hiccup, or other reasons that are hard to reproduce.

When the workflow fails, we have to manually rerun it. It is not a big deal in the beginning, but it's still annoying for high-efficiency developers like us.

How it works

At the time of writing, GitHub Actions does not provide a built-in way to automatically rerun a failed workflow. So let's break it down and see how we can achieve this.

When a workflow run has finished and failed, you can rerun it manually by clicking the "Re-run jobs" button. The button opens a dropdown menu with a handy "Re-run failed jobs" option that skips the jobs that already succeeded.

So the idea is simple:

  1. We need to detect when a workflow has failed.
  2. We need to trigger the rerun action AFTER the workflow concludes.

Obviously, we can't do this in the same workflow run, because a workflow can't do anything once it has stopped running. So we need a second workflow that waits for the original run to conclude and then triggers the rerun. This can be illustrated as follows:

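[Flowchart with two lanes. Original workflow: "Trigger original workflow", then "Original workflow failed?"; if yes, "Trigger the rerun workflow". Rerun workflow: "Original workflow running?"; if yes, "Hold on" and check again; if no, trigger the original workflow again.]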
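The "hold on" loop in the diagram can be sketched in shell as a simple polling loop. This is only an illustration of the idea: the function name wait_then_rerun and the polling interval are hypothetical, and the rerun workflow in this post relies on gh run watch instead of polling by hand.

```shell
# Hypothetical helper illustrating the flowchart; not part of the final workflow.
# $1: ID of the original workflow run, $2: seconds between checks (default 10)
wait_then_rerun() {
  local run_id="$1" interval="${2:-10}"

  # "Original workflow running?" -> yes: hold on and check again.
  while [ "$(gh run view "$run_id" --json status --jq '.status')" != "completed" ]; do
    sleep "$interval"
  done

  # -> no: the run has concluded, so re-run only its failed jobs.
  gh run rerun "$run_id" --failed
}

# Usage (requires an authenticated gh and a valid run ID):
#   wait_then_rerun 1234567890
```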

Update your workflows

Create a new workflow file in your repository, e.g., .github/workflows/rerun.yml:

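# .github/workflows/rerun.yml

name: Rerun workflow

on:
  workflow_dispatch:
    inputs:
      run_id:
        description: ID of the workflow run to rerun
        required: true

jobs:
  rerun:
    runs-on: ubuntu-latest
    steps:
      - name: rerun ${{ inputs.run_id }}
        env:
          GH_REPO: ${{ github.repository }}
          GH_TOKEN: ${{ github.token }}
          GH_DEBUG: api # Log the API requests made by gh, handy for troubleshooting
        run: |
          # Block until the original run has concluded
          gh run watch ${{ inputs.run_id }} > /dev/null 2>&1
          # Re-run only the jobs that failed
          gh run rerun ${{ inputs.run_id }} --failed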

Thanks to the GitHub CLI, we can wait for a workflow run to finish and then rerun it with two simple commands:

  • gh run watch watches a run until it completes (see the GitHub CLI manual).
  • gh run rerun reruns a run. The --failed flag reruns only the failed jobs (see the GitHub CLI manual).
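To get a feel for the two commands before wiring them into a workflow, you can try them from a local terminal. The sketch below is an assumption-laden example, not part of the original setup: the helper names latest_failed_run_id and rerun_failed_jobs are hypothetical, and it assumes gh is authenticated and run inside a clone of your repository.

```shell
# Hypothetical helper: print the ID of the most recent failed run, if any.
latest_failed_run_id() {
  gh run list --status failure --limit 1 \
    --json databaseId --jq '.[0].databaseId // empty'
}

# Hypothetical helper: wait for a run to finish, then re-run its failed jobs.
rerun_failed_jobs() {
  local run_id="$1"
  gh run watch "$run_id" > /dev/null 2>&1
  gh run rerun "$run_id" --failed
}

# Usage:
#   run_id="$(latest_failed_run_id)"
#   [ -n "$run_id" ] && rerun_failed_jobs "$run_id"
```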

All you need to do is to provide the run_id of the failed run, which is available in the github.run_id context. For example, you can trigger the rerun workflow by appending the following job to your original workflow:

# .github/workflows/my-awesome-workflow.yml

jobs:
  # ...

  rerun-on-failure:
    needs: my-awesome-job # Replace with your job name
    # Run only when a needed job has failed, and only on attempts 1 and 2
    if: failure() && fromJSON(github.run_attempt) < 3
    runs-on: ubuntu-latest
    steps:
      - name: Trigger the rerun workflow
        env:
          GH_REPO: ${{ github.repository }}
          GH_TOKEN: ${{ github.token }}
          GH_DEBUG: api
        run: gh workflow run rerun.yml -F run_id=${{ github.run_id }}

The if: failure() condition runs this job only when a job it depends on has failed. The fromJSON(github.run_attempt) < 3 condition limits it to the first 2 attempts, so a run is retried at most twice (3 attempts in total). You can adjust the number according to your needs.
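To make the attempt arithmetic concrete, here is a small shell sketch that mirrors the condition outside of GitHub Actions. The function should_rerun and its arguments are hypothetical and exist only for illustration; in the real workflow the expression is evaluated by GitHub Actions itself, where github.run_attempt starts at 1 and increases with every rerun.

```shell
# Hypothetical helper mirroring: failure() && fromJSON(github.run_attempt) < 3
# $1: run conclusion, $2: attempt number (starts at 1), $3: limit (default 3)
should_rerun() {
  local conclusion="$1" attempt="$2" limit="${3:-3}"
  [ "$conclusion" = "failure" ] && [ "$attempt" -lt "$limit" ]
}

for attempt in 1 2 3; do
  if should_rerun failure "$attempt"; then
    echo "attempt $attempt failed: trigger a rerun"
  else
    echo "attempt $attempt failed: give up"
  fi
done
```

Attempts 1 and 2 trigger a rerun, while a failure on attempt 3 ends the loop, which is what keeps a consistently failing workflow from rerunning forever.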

This step uses the gh workflow run command to trigger the rerun workflow. See the documentation for more details.

Bonus: Run the rerun workflow from the branch that failed

If you want to run the rerun workflow from the branch that failed, instead of the default branch, you can use the following command:

gh workflow run rerun.yml -r ${{ github.head_ref || github.ref_name }} -F run_id=${{ github.run_id }}

The github.head_ref context is used when the run was triggered by a pull request; otherwise, github.ref_name is used.

Note that the above command will not work on non-default branches until the rerun workflow exists on your default branch.
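The fallback between the two contexts behaves like a shell default-value expansion. The sketch below illustrates it with a hypothetical helper named pick_ref; the branch names are made up for the example.

```shell
# Hypothetical helper mirroring: github.head_ref || github.ref_name
# $1: head ref (empty unless the event is a pull request), $2: ref name
pick_ref() {
  local head_ref="$1" ref_name="$2"
  # Use the head ref when it is non-empty, otherwise fall back to the ref name.
  echo "${head_ref:-$ref_name}"
}

pick_ref "feature/login" "12/merge"   # pull request: prints feature/login
pick_ref "" "main"                    # push to main: prints main
```

Inside a workflow step, the same fallback could be written as "${GITHUB_HEAD_REF:-$GITHUB_REF_NAME}", using the default environment variables that GitHub Actions sets. That also keeps the branch name out of the interpolated script text.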

Credits

This approach originally comes from a GitHub discussion comment.