checkout action performing merge commit from incorrect base SHA

Created on 30 Aug 2019 · 15 Comments · Source: actions/checkout

We're observing actions/checkout creating merge commits based on the repo's latest master SHA, rather than github.event.pull_request.base.sha from the event that initiated the action.

This causes different merge commits across jobs within a single workflow. This is despite GITHUB_SHA, github.sha, github.event.pull_request.head.sha, and github.event.pull_request.base.sha being identical for both jobs.

This happens because a new commit was added to master between the two job runs, but the behavior is still surprising, given that both jobs receive identical environment variables and GitHub event data.

Identical environment between jobs

GITHUB_REF=refs/pull/14/merge
GITHUB_SHA=aa0ad9298d3b4e43eb7f56bdb33af0609193dba7

Abridged ${{ github }} context:

GITHUB_CONTEXT: {
  "ref": "refs/pull/14/merge",
  "sha": "aa0ad9298d3b4e43eb7f56bdb33af0609193dba7",
  "head_ref": "siggy/workflow-testing",
  "base_ref": "master",
  "event": {
    "pull_request": {
      "base": {
        "ref": "master",
        "sha": "891c8c550cf9f3890c612e4ef5ba77fbc93ec642",
      },
      "head": {
        "ref": "siggy/workflow-testing",
        "sha": "74460e62efc34fe80862f684dad06f41f55dacc1",
      },
    }
  },
}

Job 1

git checkout --progress --force refs/remotes/pull/14/merge
...
HEAD is now at aa0ad929 Merge 74460e62efc34fe80862f684dad06f41f55dacc1 into 891c8c550cf9f3890c612e4ef5ba77fbc93ec642

Full output:
https://github.com/siggy/linkerd2/runs/207435287#step:2:1007

Job 2

git checkout --progress --force refs/remotes/pull/14/merge
...
HEAD is now at 7efd2d0b Merge 74460e62efc34fe80862f684dad06f41f55dacc1 into 324483a653c7c09a350bc2a782080d6ea0ae533d

Note that Job 2 has created merge commit 7efd2d0b based off of the most recent master commit 324483a653c7c09a350bc2a782080d6ea0ae533d, despite these SHAs not appearing anywhere in the environment variables or event data.

Full output:
https://github.com/siggy/linkerd2/runs/207437496#step:2:1014
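
One way to confirm which base a checked-out merge commit was built on is to inspect its first parent and compare it with the base SHA from the event. The following is a minimal sketch run against a throwaway local repository, not the real PR. The function names `merge_base_parent` and `merge_uses_expected_base`, the branch name `pr-branch`, and the detached merge standing in for refs/pull/14/merge are all assumptions made for illustration.

```shell
#!/bin/sh
set -eu

# Throwaway identity so commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Print the first parent of HEAD in repo $1. For a "Merge <head> into <base>"
# commit this is the base commit the merge was actually built on.
merge_base_parent() {
  git -C "$1" rev-parse 'HEAD^1'
}

# Succeed only if the merge commit at HEAD in repo $1 was built on the
# expected base SHA $2 (e.g. github.event.pull_request.base.sha).
merge_uses_expected_base() {
  [ "$(merge_base_parent "$1")" = "$2" ]
}

# --- Local simulation of the race (hypothetical repo, not the real PR) ---
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" symbolic-ref HEAD refs/heads/master
git -C "$repo" commit -q --allow-empty -m "base at event time"
event_base=$(git -C "$repo" rev-parse HEAD)

git -C "$repo" checkout -q -b pr-branch
git -C "$repo" commit -q --allow-empty -m "PR change"

# master advances after the event was created.
git -C "$repo" checkout -q master
git -C "$repo" commit -q --allow-empty -m "later commit on master"
latest_master=$(git -C "$repo" rev-parse HEAD)

# Stand-in for the merge ref: merge the PR head into the *current* master.
git -C "$repo" checkout -q --detach master
git -C "$repo" merge -q --no-ff -m "Merge pr-branch into master" pr-branch

actual_base=$(merge_base_parent "$repo")
if merge_uses_expected_base "$repo" "$event_base"; then
  result="merge uses event base"
else
  result="merge uses a newer base"
fi
echo "$result: $actual_base"
rm -rf "$repo"
```

In a real job, the same comparison could be made between `git rev-parse 'HEAD^1'` and the base SHA from the event payload right after the checkout step.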

State of master

Note that the above event data references the 2nd most recent commit to master, as that was the state of master when the workflow was triggered:

$ git log --pretty=oneline | head -n2
324483a653c7c09a350bc2a782080d6ea0ae533d master branch testing
891c8c550cf9f3890c612e4ef5ba77fbc93ec642 Merge remote-tracking branch 'upstream/master'

with/ref config

Note that setting ref: ${{ github.sha }} or ref: ${{ github.ref }} had no effect:

sha

https://github.com/siggy/linkerd2/pull/14/checks?check_run_id=207481400#step:2:3

- name: Checkout code
  uses: actions/checkout@v1
  with:
    ref: ${{ github.sha }}

Run actions/checkout@v1
  with:
    ref: 17c77866218c23d4b2a47221ccd9aff78a5d7172
    clean: true
...
git checkout --progress --force refs/remotes/pull/14/merge
...
HEAD is now at d6ae7796 Merge 80e75c911dbd20e9b1226d7854818843b037dc1a into ae31e8838e171e60e1cd2fa9ad54070fcb741025

ref

https://github.com/siggy/linkerd2/pull/14/checks?check_run_id=207488293#step:2:3

- name: Checkout code
  uses: actions/checkout@v1
  with:
    ref: ${{ github.ref }}

Run actions/checkout@v1
  with:
    ref: refs/pull/14/merge
    clean: true
...
git checkout --progress --force refs/remotes/pull/14/merge
...
HEAD is now at eb786159 Merge 8a33bbfb6ad62902926b1449c2b9703433da6450 into ae31e8838e171e60e1cd2fa9ad54070fcb741025

Previously reported in the Community Forum

https://github.community/t5/GitHub-API-Development-and/Github-Actions-Inconsistent-repo-checkouts-across-jobs/td-p/30258

...but upon further inspection of workflow environment variables opted to create an issue in this repo.

All 15 comments

@siggy This is kind of worrying behavior. Does using command line git avoid this issue?

I think the second job should fail if your PR branch gets updated, since we can't find the same SHA to build anymore.

@chingc @TingluoHuang Thanks for looking into this. I have set up a simpler, more contrived example:

https://github.com/siggy/linkerd2/pull/15/checks?check_run_id=220501197
https://github.com/siggy/linkerd2/blob/a2583cac37a6c958dfc9bf5ae2075e6af7c1cf0b/.github/workflows/workflow.yml

I am not reliably reproducing the issue in the above example, but I do see commands from actions/checkout that I think may be causing this:

git -c http.extraheader="AUTHORIZATION: basic ***" fetch --tags --prune --progress --no-recurse-submodules origin +refs/heads/*:refs/remotes/origin/* +refs/pull/15/merge:refs/remotes/pull/15/merge
git checkout --progress --force refs/remotes/pull/15/merge

Is it correct that refs/remotes/pull/15/merge is a merge commit from the latest master, not from master when the PR event triggered the job?

We're also running into this issue. Under the hood, actions/checkout is checking out this phantom merge commit. When I look at other environment variables like GITHUB_SHA, they actually point to this fake commit, and the commit claims to be from me! (But it's not signed/verified like my other commits are.)

This is particularly problematic, because we have some automatic commits like Prettier Formatting, and generating some documentation when certain code files change, and these automatic commits are effectively doing a merge-master-into-branch every time they trigger!

Screenshot of attempting to work around it

Each of those "Merge SHA into SHA" commits is generated by GitHub and ends up being the context under which the action runs.

This is also a problem because we have an external CI that we trigger from GitHub Actions (based on labels), and we try to prevent duplicate builds for a given commit. Since this phantom commit is the GITHUB_SHA we work with, it never finds an existing build for that commit, so we waste CI cycles building the app several times.

We're observing actions/checkout creating merge commits based on the repo's latest master SHA, rather than github.event.pull_request.base.sha from the event that initiated the action.

and:

Is it correct that refs/remotes/pull/15/merge is a merge commit from the latest master, not from master when the PR event triggered the job?

Yes, this is the intended default behavior. The goal is to validate that the pull request will build and test against what it would be merged into. Continuous integration builds need to take into account what they'll be merged into, not the state of the repository when they were created. This prevents you from merging a commit that breaks master but "worked on my machine".

If your goal is not to validate the CI but to do some fixups (ie, automatic updates from linting) then I agree that you would not want to check out the merge commit but to actually

If you really want to validate the pull request as it was actually sent, and not what would be produced by a merge into master (ie, in isolation of the master branch), you can specify the ref to checkout:

steps:
- uses: actions/checkout@v1
  with:
    ref: ${{ github.head_ref }} 

However, I _strongly_ encourage you to make this a separate workflow: one workflow that lints and updates the PR if (and only if) it made some changes, and then a second workflow that does a build and test on the merge produced into master for verification.

Oh wow! That’s super interesting. And very much not what I was guessing or expecting.

What happens if there are merge conflicts?
What happens if master has changes to a dependency that we rely on? It’s common in the react* to use snapshot tests, and these currently would fail our PR every time we push a commit when the PR is not up to date with master. It seems like this expects a continuous rebasing against master? Even then… ouch?

steps:
- uses: actions/checkout@v1
  with:
    ref: ${{ github.head_ref }} 

This is exactly what I've had to resort to when running some ci checks that report status based on branch. Otherwise the reported branch is incorrect.

@ethomson Appreciate the detailed reply. Testing a PR merged into master is in fact what we want, not github.head_ref.

Our core issue is that our workflow executes actions/checkout across multiple jobs, and if master changes between those jobs, actions/checkout yields different copies of the repo.

This is problematic because, for example, a job in our workflow builds docker images versioned as docker-image:git-sha-foo, and then a subsequent job tries to deploy docker-image:git-sha-bar.

The result is that every time we merge to master, most currently-running CI workflows fail. Here's an example; note actions/checkout yielding different repo SHAs within a single CI workflow:
https://github.com/linkerd/linkerd2/runs/238047516#step:2:1142
https://github.com/linkerd/linkerd2/runs/238049216#step:2:1142

Is there any plan to open source this action? We'd love to just submit a PR to better illustrate the issue. Failing that, we may write our own checkout action, but we're concerned about potential rate limit issues.

Any guidance is much appreciated, thanks.

Hi @fbartho -

What happens if there are merge conflicts?

The first check that happens on GitHub when you open a pull request is whether it's mergeable or not.

What happens if master has changes to a dependency that we rely on? It’s common in the react* to use snapshot tests, and these currently would fail our PR every time we push a commit when the PR is not up to date with master. It seems like this expects a continuous rebasing against master? Even then… ouch?

I'd like to understand more about the scenario you're describing, but generally this is an advantage. You usually _want_ to know if there was a change to a dependency in master that you depend on in your pull request. Consider the case where the dependency was changed in master in an incompatible way and that PR used it. If the CI build _didn't_ do the merge into master, then that would just be a successful build, but as soon as you merged the pull request, that would break.

By building the merge commit, you're able to have high confidence that the integration of your pull request will be successful.

Our core issue is that our workflow executes actions/checkout across multiple jobs, and if master changes between those jobs, actions/checkout yields different copies of the repo.

Thanks, @siggy, for the clarification. We'll give this some thought.

@ethomson FWIW, we have worked around this issue by only using actions/checkout in the first job of our workflow, and then saving that copy of the repo as an artifact for all subsequent jobs: https://github.com/linkerd/linkerd2/pull/3602

@siggy actions/checkout@v2 fixes the race condition. For PRs, the individual SHA is now fetched.
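
The difference between checking out a moving ref and checking out a fixed SHA can be illustrated locally. This is a sketch under stated assumptions, not the actual implementation of actions/checkout@v2: the local `origin` repository plays the server, refs/pull/1/merge is a hand-built stand-in for the real merge ref, `uploadpack.allowAnySHA1InWant` is enabled so the local "server" will hand out a commit that no ref points at any more, and the function names are hypothetical.

```shell
#!/bin/sh
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# Fetch $2 (a ref name or a full commit SHA) from remote $3 into repo $1
# and print the commit that was fetched.
fetch_and_resolve() {
  git -C "$1" fetch -q "$3" "$2"
  git -C "$1" rev-parse FETCH_HEAD
}

# Rebuild the stand-in merge ref in repo $1 on top of the current master.
update_merge_ref() {
  git -C "$1" checkout -q --detach master
  git -C "$1" merge -q --no-ff -m "Merge pr-branch into master" pr-branch
  git -C "$1" update-ref refs/pull/1/merge HEAD
}

work=$(mktemp -d)
origin="$work/origin"
git init -q "$origin"
git -C "$origin" symbolic-ref HEAD refs/heads/master
# Assumption: allow the local "server" to serve commits by SHA even when
# no ref points at them any more.
git -C "$origin" config uploadpack.allowAnySHA1InWant true
git -C "$origin" commit -q --allow-empty -m "master at event time"
git -C "$origin" checkout -q -b pr-branch
git -C "$origin" commit -q --allow-empty -m "PR change"

update_merge_ref "$origin"
# The merge commit SHA that the triggering event would have recorded.
event_sha=$(git -C "$origin" rev-parse refs/pull/1/merge)

# master moves and the merge ref is regenerated before the second job starts.
git -C "$origin" checkout -q master
git -C "$origin" commit -q --allow-empty -m "later commit on master"
update_merge_ref "$origin"

# A later job, fetching first by ref name and then by the recorded SHA.
git init -q "$work/job2"
by_ref=$(fetch_and_resolve "$work/job2" refs/pull/1/merge "file://$origin")
by_sha=$(fetch_and_resolve "$work/job2" "$event_sha" "file://$origin")
echo "by ref: $by_ref"
echo "by sha: $by_sha"
rm -rf "$work"
```

Fetching by ref name returns whatever the ref points to at fetch time, while fetching by SHA returns the commit the event recorded, so every job in the workflow sees the same tree.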

@ericsciple Thanks! Will check it out.

I've just run into what I believe is a related issue.
Yesterday, I open a PR, and CI runs.
Some time later that day, there's an unrelated commit pushed to master.
Today, I push a new commit to my PR.
In this PR run, ${{github.event.pull_request.base.sha}} is set to the SHA1 of master yesterday, when the PR was opened, whereas I expected it to be set to the SHA1 of master today.
The parent commits of refs/pull/1234/merge are the head of my branch and the SHA1 of master _today_.
I expected ${{github.event.pull_request.base.sha}} to be one of the parent commits of refs/pull/1234/merge, but it's not.
I noticed this because I use actions/checkout with fetch-depth: 2, which fetches the merge commit and its two parent commits: the head of the branch being tested, and the head of the base branch master. I expected ${{github.event.pull_request.base.sha}} to be in the list of revisions fetched by actions/checkout, but it's not.
That causes this failure:

$ git diff --name-only ${{github.event.pull_request.base.sha}}...
fatal: Invalid symmetric difference expression ddf331629b3147875282f18f52fc6f3483d75dff...

This repo is private. For GitHub staff who may investigate:
Yesterday's CI run (which succeeded) is 624420073
Today's CI run (which failed) is 627096250

The workaround of instead using git diff --name-only 'HEAD~1' looks like it'll work.
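
The failure and the workaround above can be reproduced locally. This is a minimal sketch, assuming a throwaway repository in which the branch `merge-ref` stands in for refs/pull/1234/merge and a `--depth 2` clone stands in for fetch-depth: 2. The function names, branch names, and file names are hypothetical. `HEAD^1` names the same commit as the `HEAD~1` used in the workaround.

```shell
#!/bin/sh
set -eu

export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# List files that differ between the merge commit at HEAD and its first
# parent (the base side), using only commits a depth-2 fetch contains.
pr_changed_files() {
  git -C "$1" diff --name-only 'HEAD^1' HEAD
}

# Succeed if commit $2 is present in repo $1.
has_commit() {
  git -C "$1" cat-file -e "$2^{commit}" 2>/dev/null
}

# --- Local simulation (hypothetical repo and file names) ---
work=$(mktemp -d)
origin="$work/origin"
git init -q "$origin"
git -C "$origin" symbolic-ref HEAD refs/heads/master
echo base > "$origin/README"
git -C "$origin" add README
git -C "$origin" commit -q -m "master when the PR was opened"
stale_base=$(git -C "$origin" rev-parse HEAD)

git -C "$origin" checkout -q -b pr-branch
echo change > "$origin/feature.txt"
git -C "$origin" add feature.txt
git -C "$origin" commit -q -m "PR change"

git -C "$origin" checkout -q master
echo other > "$origin/unrelated.txt"
git -C "$origin" add unrelated.txt
git -C "$origin" commit -q -m "unrelated commit on master"

# Stand-in for the merge ref, built on today's master.
git -C "$origin" checkout -q -b merge-ref master
git -C "$origin" merge -q --no-ff -m "Merge pr-branch into master" pr-branch

# Shallow clone: the merge commit and its two parents, nothing older.
git clone -q --depth 2 --branch merge-ref "file://$origin" "$work/clone"

if has_commit "$work/clone" "$stale_base"; then
  base_present=yes
else
  base_present=no
fi
changed=$(pr_changed_files "$work/clone")
echo "stale base present: $base_present"
echo "changed files: $changed"
rm -rf "$work"
```

Because the stale base commit is older than anything in the shallow clone, any diff expression that names it fails, whereas the first parent of the merge commit is always available.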
