Brew: Implement autosquash for homebrew-core pull requests

Created on 26 Jul 2020  Ā·  19Comments  Ā·  Source: Homebrew/brew

The motivation for the feature

Contributors are often confused by maintainer requests to reword, squash, or rebase their commits, as it's possible in the GitHub UI to do all of these things while merging a pull request, but our workflow does not permit core maintainers to use the GitHub UI for this purpose.

If the contributor doesn't know how to squash and rebase, the maintainer will have to run brew pr-pull locally and do the rebase themselves. This takes up valuable maintainer time, as they not only have to muck around with modifying Git history, in the case of large bottles, they will have to use their personal Internet connection to upload to Bintray rather than relying on the direct transfer from GitHub Actions.

If the contributor does know how to squash and rebase, the force-push back to the pull request branch will run our continuous integration again, taking up valuable time on our runners that could have been saved.

Finally, maintainers may sometimes accidentally merge pull requests that do not conform to our commit message guidelines. This makes it harder to understand git history.

A detailed description of the proposed feature

  • BrewTestBot should be able to identify pull requests where the commits do not follow our commit message standards. This would take the form of an additional GitHub Actions check, perhaps as a new brew test-bot option, which flags that the pull request contains commits not following style guidelines. These checks would ensure:

    • One commit per formula
    • One formulae per commit
    • New formula look like foo 1.0 (new formula)
    • New versions look like foo 2.0
    • Revision bumps look like foo: revision...
    • All other formula changes look like foo: ...
  • BrewTestBot should, when merging formulae, be able to automatically squash and rebase the commits to conform to our style. This could start off as relatively simple cases, such as pull requests that change only a single formula, and gradually evolve to more complex cases with e.g. multiple version changes and revision bumps.

How the feature would be relevant to at least 90% of Homebrew users

Saving maintainer time means more time for maintainers to work on other things. Saving contributor time and preventing confusion means it's easier for the Homebrew community to start and continue contributing to Homebrew, ensuring the long-term health of the project.

What alternatives to the feature have been considered

Maintainers continue to use brew pr-pull to squash and rebase commits themselves, or ask contributors to squash and force push their pull requests.

cc @Homebrew/core for their input on how this could affect or improve their workflow.

discussion help wanted in progress

Most helpful comment

Inspired by what @claui just said: I would say that once we end up going down the "automatically modifying users' commits" I'm a lot less against (and, given enough time, may be actively for) merging the bottle commits into their commits. It's a bit annoying jumping around in Git history and ending up on the commit changing a formula but which doesn't have the new bottle SHAs yet (so bottle downloads will fail).

All 19 comments

  • BrewTestBot should be able to identify pull requests where the commits do not follow our commit message standards. This would take the form of an additional GitHub Actions check

If I understand you correctly, this feature is being designed with the not-so-rebase-savvy contributor in mind. What concrete action would we expect the contributor to take when they see the failed check?

  • BrewTestBot should, when merging formulae, be able to automatically squash and rebase the commits to conform to our style.

Just for clarification: the commits would be cleaned up when the maintainer presses _Merge,_ i. e. at the last possible moment before the changes get pushed to master, right?

What happens if the commits can’t be cleaned up automatically? How would maintainers be able to tell before pressing _Merge?_

  • BrewTestBot should be able to identify pull requests where the commits do not follow our commit message standards. This would take the form of an additional GitHub Actions check

If I understand you correctly, this feature is being designed with the not-so-rebase-savvy contributor in mind. What concrete action would we expect the contributor to take when they see the failed check?

f a contributor doesn't know how to rebase their pull request, there's little we can do except to have BrewTestBot do an autosquash, which is the goal of this feature. If BrewTestBot doesn't know how to do the autosquash, the maintainer would have to do so. Either way, maintainers need to know whether a pull request could be autosquashed when BrewTestBot performs the merge, or if they will need to take manual action via brew pr-pull or similar.

  • BrewTestBot should, when merging formulae, be able to automatically squash and rebase the commits to conform to our style.

Just for clarification: the commits would be cleaned up when the maintainer presses _Merge,_ i. e. at the last possible moment before the changes get pushed to master, right?

Merging is disabled on homebrew-core, and squash-merges only apply for changes that don't need bottles. The commits would get cleaned up by BrewTestBot before it pushes to homebrew-core.

What happens if the commits can’t be cleaned up automatically? How would maintainers be able to tell before pressing _Merge?_

The GitHub Actions check would tell them.

@jonchang Oh now I see. Thanks for elaborating!

In that case, I’d suggest that we map the following conclusions in order to be in line with user expectations:

  • _Failure_ (red āŒ) for when commits aren’t compliant and the action finds out it can’t clean them up;

  • _Success_ (green āœ…) for when commits are compliant as they are.

  • _Success_ (green āœ…) for when commits aren’t compliant but the action has figured out it can clean them up automatically. If Github supports it, it would be nice if the check result also hinted on the fact that something’s not quite perfect right there on the main pull request view:

    Commit messages will be fixed automatically on merge.

I think it's good to have some sort of warning that commits were formatting incorrectly, even if they can be automatically fixed. This could be a message in the CI window (if there's a way to have an e.g. "commit format" check that can be in a failed state without blocking completion of the PR), automated comment on the PR, label etc...

We still want people to be in the habit of formatting their commits correctly. Otherwise, it will be even more painful when maintainers need to do some manual work and more confusing for contributors who are trying to understand why the commit history on master doesn't match what they did.

I think it's good to have some sort of warning that commits were formatting incorrectly, even if they can be automatically fixed. This could be a message in the CI window (if there's a way to have an e.g. "commit format" check that can be in a failed state without blocking completion of the PR), automated comment on the PR, label etc...

If the commit style violation is going to be fixed automatically, then a failure state may confuse both maintainers and contributors.

A failure state confuses contributors: upon seeing the failure, they will be in the same place as today when a maintainer asks them to reword, squash or rebase.
A failure state also confuses maintainers: even if they happen to know and remember that it’s not a critical failure, it may dilute other failures, which abound even today, including false positives from CI.
I think a failure state is also going to pile even more cognitive baggage onto an already complex decision tree that is merging a homebrew-core PR.

We still want people to be in the habit of formatting their commits correctly.

Even if a contributor is 100 % in that habit, violations may occur for reasons outside their control (e. g. applying review suggestions, maintainer commits) or by accident.
Either way, assuming they have no Git history rewriting skills, they find themselves trapped in the failed state without undo, which I feel is poor UX.

Otherwise, it will be even more painful when maintainers need to do some manual work

Can you elaborate about the manual work? I’m failing to imagine an example here.

and more confusing for contributors who are trying to understand why the commit history on master doesn't match what they did.

That’s a good reason for stating this in the check result, visible right from the pull request tab. But please in green.

Yes, after thinking about this more I think you're right.

A failure state confuses contributors: upon seeing the failure, they will be in the same place as today when a maintainer asks them to reword, squash or rebase.
A failure state also confuses maintainers: even if they happen to know and remember that it’s not a critical failure, it may dilute other failures, which abound even today, including false positives from CI.
I think a failure state is also going to pile even more cognitive baggage onto an already complex decision tree that is merging a homebrew-core PR.

Good point. I was assuming maintainers would be aware and could ignore the issue, but that's not a good assumption. Plus, if this change is intentionally showing an error that will confuse contributors and be ignored by maintainers, there really isn't a point in showing the error at all. It will only add unnecessary hassle for everyone.

Can you elaborate about the manual work? I’m failing to imagine an example here.

I was thinking about instances where a maintainer would need to adjust e.g. the commit structure or message of the PR (using pr-pull) but I forgot that this becomes irrelevant if this is implemented. I wasn't thinking hard enough about what I was typing so this can be ignored... 😳

That’s a good reason for stating this in the check result, visible right from the pull request tab. But please in green.

100% agree. Plus, there are other ways to address the issue of contributers being confused by our merge process (e.g. add a link to the docs that explain our process to the comment that BrewTestBot adds to closed PRs)


Let me rephrase my concern as a question: If auto squashing is enabled and works most of the time, is there a point to requesting commits in the current style? I'd argue yes, but I don't think I have a concrete reason why. I think the best argument is that it's easier to follow if you're comparing commit history to PRs, but maybe this isn't an issue. If so, we can come up with the best way to display that information is an automated way that's clear to the contributor, but also is not a failed state that would confuse anyone.

Let me rephrase my concern as a question: If auto squashing is enabled and works most of the time, is there a point to requesting commits in the current style?

Good question.
Requesting it feels right to me but I can’t figure out why either šŸ¤”

I wonder if it would be interesting that we add bottle commit into the PR and merge as it is. (this would make the PR merge more intuitive)

NOTE, lots of image compression actions can do in that way.

If I recall correctly, pushing someone else's branch requires non-standard GitHub token and when using this token, workflows are triggered again. Might workaround it by skipping workflow jobs if GitHub actor == BrewTestBot tho, but it feels more hacky than it is right now.

But I think this solution would be great anyway, if figured out how to not trigger workflows again.

Yeah, so if we can figure out the checks to detect where we can auto-squash, you can imagine a feature where BrewTestBot (for single formula changes) pushes up the bottle commit then does a squash and merge like the GitHub UI has. Contributors will finally be able to see purple pull requests.

We've tended to avoid workflows that modify contributor branches because it can be confusing, but if it happens _right before_ a merge it's probably not going to be a big problem.

If I recall correctly, pushing someone else's branch requires non-standard GitHub token and when using this token, workflows are triggered again.

Would that mean the decision whether or not to trigger a workflow depends on the push method (tokens vs. SSH)?

I regularly use SSH to push my changes to other people’s feature branches. Like so:

git push [email protected]:other_user/homebrew-core @:other_users_branch

I just tested and it triggered the checks anyway.

No, Dawid is referring to the default GITHUB_TOKEN that is automatically provisioned for all Actions workflows and is attributed to github-actions[bot]. Pushes using this token do not trigger additional workflow runs. Compare to our personal access token HOMEBREW_GITHUB_API_TOKEN that gets attributed to BrewTestBot, and pushes with this token do trigger additional workflow runs.

No matter which push method is used, workflows are triggered anyway. I just tested pushing with default GITHUB_TOKEN and:

remote: Permission to pack-bot/homebrew-test.git denied to github-actions[bot].
fatal: unable to access 'https://github.com/pack-bot/homebrew-test.git/': The requested URL returned error: 403
  • _Success_ (green āœ…) for when commits aren’t compliant but the action has figured out it can clean them up automatically.

Note: doing this for brew style would also be nice. I _think_ this would just be a matter of replacing all our calls of brew style with brew style --fix and making @BrewTestBot commit the results.

If the commit style violation is going to be fixed automatically, then a failure state may confuse both maintainers and contributors.

Agreed. No reason to make humans do what machines can.

We've tended to avoid workflows that modify contributor branches because it can be confusing, but if it happens _right before_ a merge it's probably not going to be a big problem.

We're not also guaranteed to have permissions (depending on user settings).

Something like https://github.com/veggiemonk/skip-commit/issues/5 could be useful to allow us to push and have it skip CI automatically in this (and perhaps other cases). This might also be useful for doing "manual review but don't run CI" PRs with changing many formulae at once.

I guess one question we could ask is how, uh, committed we are to having bottle commits vs squashing together the version change + bottle update.

I guess one question we could ask is how, uh, committed we are to having bottle commits vs squashing together the version change + bottle update.

I think separate bottle commits are nice. They have a different author and clearly differentiate between the user change and the @BrewTestBot change. I do think e.g. a brew style --fix commit could be part of the bottle commit, though?

I do think e.g. a brew style --fix commit could be part of the bottle commit, though?

That means we’d lose a little luxury: today, I can be pretty confident that whenever I look at an already-merged commit, I can trust that what the commit did doesn’t break the style rules of that time.

If we choose to squash the style --fix result into the bottle commit, then the _contributor’s_ squashed commit would now introduce the original style violations to the linear Git history.

I’m sure I’d eventually cope but having style fixes engrained in the linear history doesn’t feel right.

On the other hand: you have a good point about separating user change and test-bot change.

That means we’d lose a little luxury: today, I can be pretty confident that whenever I look at an already-merged commit, I can trust that what the commit did doesn’t break the style rules of that time.

Yes, that's a fair point (particularly if we're altering the commit message anyway).

I’m sure I’d eventually cope but having style fixes engrained in the linear history doesn’t feel right.

Agreed šŸ‘šŸ»

Inspired by what @claui just said: I would say that once we end up going down the "automatically modifying users' commits" I'm a lot less against (and, given enough time, may be actively for) merging the bottle commits into their commits. It's a bit annoying jumping around in Git history and ending up on the commit changing a formula but which doesn't have the new bottle SHAs yet (so bottle downloads will fail).

Was this page helpful?
0 / 5 - 0 ratings