Triplea: Formalizing the PR approval Bar

Created on 26 Aug 2019 · 23Comments · Source: triplea-game/triplea

Topic and Overview
A few times I've submitted PRs at night hoping they would be merged by the morning to then continue project work with another iteration. This has worked, but other times it has not with comments in PR where my response is "I'm trying to delete that in the next iteration, it should not be modified".

I think it's important that when reviewing we:

keep in mind the intended scope of the update and not increase that. It's intentional for the author to do some refactors, and not others. Suggesting refactors to existing code, code that was skipped over, or code that was only partially improved is second guessing these original decisions.
style comments will slow things down. Instead of submitting reasonable updates, we'll wonder what others think of the style and will hesitate. Enforcing style on others also slows things down.

We should be keeping in mind our net effectiveness and goal for this project. Clean code is a secondary goal, not primary.

I'd suggest then we discuss the 'bar' for approval and formalize it a bit. I'd suggest the standard for approval should be:

correctness of changes
completeness of changes, if a javadoc is left not updated, it should be updated
reasonably clean 'new' code. If for example a new method or class is added, it should be reasonably clean.
reasonable design of new features. Sometimes we will borrow from existing patterns, new features should otherwise be well done
maintainability. If there is a lot of hacky code that is updated and it makes no sense before or after, improvements should be made so that the next maintainer will not be struggling. This reduces the cost of maintenance when done well. For example if a "magic" one line fix is added somewhere that fixes a bug and makes no sense why it is a fix, a comment should be added to explain why the one line is a fix. This way a future maintainer will benefit from that information.

A good example of something that fits outside of this guideline is a move method refactor. While often it will not be strictly needed, moving methods, particularly private methods to where they are used is a good thing to do. That method will show up as new code, but it is only moved. Where I think the line should be is for updates to code that is simply moved. Here the intent is a 'move method' refactor and not more. It's also a bit risky to refactor anything/everything and it makes for a difficult review. Hence, asking for the method to also be refactored beyond the move method I think is where we get into trouble.

The importance of this is for time efficiency. Waiting for approval before starting another iteration has me often working on 3 different projects as I need to wait for several iterations in parallel before any of them can be completed. Waiting even hours to begin the next iteration is a large amount of time. If we've only 2 or 3 hours in one day to work on TripleA, and it's a few days before we get another chance, such a delay adds a week-long delay. These are serious delays to an under-capacity team.

How to handle the optional stuff?

Clean code is very important to TripleA and it would also be bad to discourage us pointing out places where it could be better. I'd suggest we, when reviewing, do one of two things:
(1) point out the refactors and approve. If the updates are trivial enough, the author simply updates, pushes, and merges when the build is done.
(2) point out the refactors and mention the change would be out of scope. I personally do this, for example, by prefixing comments with "side-note:". Such changes then are okay to follow up with, or not at all (depending on time and priorities).

Desired Outcome

Consensus of when we should approve PRs, remove round-trip delays in PRs to encourage more efficient development iterations.
Documentation in wiki on PR process for criteria of when PRs are approved

Motivation
Lots of TripleA code needs many iterations. Refactors are extremely hard to review, most need many iterations. Meanwhile new code is easy to plop down and review in one go. This creates a perverse incentive where refactors are knee-capped, the time to do 5 back to back PRs that build on each other will be days, if not longer. Many refactors would need more iterations than that, and we have a lot of code to rewrite and clean up. We're not going to get there if the smallest of a series of iterations takes days.

Discussion

Source

DanVanAtta

Most helpful comment

Agree with both points.

If its clearly divided then either we take some kind of vote across core members (probably forum admin group) or I'll make a decision :)

For experimental features, I'd say have a flag if it can be cleanly done. I don't think feature branches will ever really get tested by enough people to be worth it.

ron-murhammer on 10 Sep 2019

😄1 👍1

All 23 comments

I agree that it can be frustrating if the reviewing phase slows the process of development down.

I decide on "approve"/"comment"/"request changes" as follows

approve: PR is perfectly fine, maybe a typo or some other no-brainers are left
comment: PR looks fine, I might have some concerns somewhere, but everything seems alright
request changes: I found an absolute deal breaker, that might cause crashes or silently fail.

There are some grey areas though, especially if I don't quite understand a piece of code but could imagine something being wrong with it. In those cases I'm more likely to chose "comment" or "request canges" even if I potentially misinterpreted something and the code is perfectly fine.

Agreeing on those terms would be nice, but at the same time I don't think anyone is going to freak out if a core-dev merges a comment/approve PR and opens another PR afterwards to address the issues raised there, including potentially diffferent changes.

However one thing I noticed is that I'm personally less motivated to address issues with some piece of code when there isn't an open PR for this. So I consider having a little bit of "pressure" to be a good thing when it comes to PRs

RoiEXLab on 26 Aug 2019

👍2

I found this interesting: https://blog.asana.com/2016/12/7-ways-to-uplevel-your-code-review-skills/#close

Always give approval, unless you can prove that there is a bug

When you suggest changes, assume that the author can handle the changes. Slowing them down to wait for a second round of code review is almost never worth it, especially if the changes are simple things like renaming variables or extracting a duplicated method.

If you make authors wait longer because of style, the code will get worse in the long run. Slowing down changes makes people reluctant to submit small changes that clarify and clean the code.

If you don’t feel qualified to give approval, say it in those words and come up with a plan to get the right person to look at the code. When you forget to click “approve” the author of a change doesn’t know if you forgot, or you think the code is broken, or you don’t care if they are blocked. Help them feel your good intentions by being very clear about what the next step is.

I also read somewhere else that it really is the job of the reviewer to ask questions. Something not clear, ask a question.

@RoiEXLab I think this perhaps could evolve:

There are some grey areas though, especially if I don't quite understand a piece of code but could imagine something being wrong with it.

I generally assume if I can't understand code, other maintainers will not. I strongly believe code should be not an exercise of sitting in a quiet room with a strong cup of tea and fire to study how 10 lines of code work for 3 hours. I don't believe that scales, and the idea that some code can only be understood by masters I believe is a cop-out for hard to understand and frankly poorly written code.

Digression aside, if I can't understand code, I ask questions about it and/or ask if it can be made clear in other ways. I think it's fine to say, "I don't understand this, can this be made more clear in some way? Perhaps with a comment to explain what is happening?"

I've read it is the job of the reviewer to understand code. So I think I would draw a distinction between, "This is hard to understand and I'm not sure it works", vs "this is crazy and I've no idea how it works."

I think for the former we can ask that an item be improved and give approval assuming the author can improve it. For the latter where there is a complete lack of understanding, then perhaps we can hold approval.

In part the problem is not approving is a blocking action. It's not the case that others are going to necessarily come along, have a better time, and approve. A review that fails to understand I think needs to more concretely have next steps and not leave the PR in limbo.

DanVanAtta on 27 Aug 2019

FWIW, i found this was a really good read on the CR topic: https://mtlynch.io/human-code-reviews-1/

DanVanAtta on 27 Aug 2019

Eventually I think it will be good for us to formalize the approval bar and have the criteria written down. Perhaps we should begin the negotiation so that there is no grey zone. I think as well the bar should be a bit higher for non-maintainers (those without write access). Implicitly the people with write access are on the hook to fix release blockers. If you're in charge of fixing your own bugs anyway, not that there should be less care, but I do feel that is different from submitting a patch and then walking away.

DanVanAtta on 27 Aug 2019

@ron-murhammer please weigh in if you have any thoughts here. The goal is for us to try and maximize our code throughput while maintaing relatively low risk for subtle and/or release blocker bugs.

DanVanAtta on 27 Aug 2019

I'll hazard a list of an approval bar. Before that I'll state my main goals here:

remove minor stylistic blockers
allow for author to add fixes and merge and skip that last round trip of review. With in-flight changes depending on PRs, the multi-day delay is bad, inefficient, slows us down and encourages PRs to be larger and less frequent
increase focus on delivering value, code correctness and reasonable clarity. There have been too many examples of blockers to perfect code that was later completely removed.

Approve if the code is:

correct
tested, manually or automated (preferablly automated)
change is complete within the intended scope.
- comments + javadocs still current
- if there are existing unit tests, they are updated
- methods and classes files remain consistent within themselves
understandable and somewhat clear, reviewer and future maintainers should be able to at least mostly understand the code going forward.
classes, packages, DB tables are well named
- these types of entities are difficult to rename later

Comment and still approve if:

nice to have stylistic changes, eg:
- imperative loop vs streams
- nicer/cleaner syntax
- out of scope updates
minor clarity improvements, eg:
- code is still pretty understandable, but would be better if...
minor naming improvements (eg: better method parameter names, local variables)
just questions that certain cases have been

Side-note, probably best to keep non-critical comments to a mininum to avoid overwhelming a review with unnecessary feedback.

Do not approve if:

conventions violated (for example, by convention variable names must be descriptive, shouldn't be single letter like 'x')
any of the 'approve' criteria are not satisfied

I'll emphasize that the non-blocker but approve comments rely on the author reasonably responding to the nice to haves and using their best judgement to address those comments, push changes and then merge. My point is essentially that if a reviewer says "this local variable name is not quite clear", it's not license to just ignore that, the author should think about it and try to address it, and if not would respond to the comment. In either case there is no round-trip back to the reviewer and back to the author.

DanVanAtta on 28 Aug 2019

I committed these notes to the wiki: https://github.com/triplea-game/triplea/wiki/Code-Reviews

@ron-murhammer @RoiEXLab , please discuss here and/or feel free to update the wiki.

Unless there is objection I'd like us to start moving forward with allowing author to merge when responding with "trivial" updates like name improvements and also lower the review bar when it comes to style.

DanVanAtta on 28 Aug 2019

One question I've been wondering, what kinds of changes, if any, need no review?

DanVanAtta on 28 Aug 2019

I think I generally do a process similar to @RoiEXLab. I'm fine with what you have in the wiki.

The only thing I would say is there is also a sense of "is this change a positive change for TripleA and/or is there a consensus around it". This is only really relevant to features or updates not bug fixes or refactoring PRs. But I think the recent PR around adding buttons to the in-game bottom bar falls into not passing this change which is why I wanted to block the PR until we had discussed it preferably on the forum so other non-technical core members can contribute as well.

ron-murhammer on 29 Aug 2019

Good example to discuss, though sometimes the pace of discussion can be too slow on forum, and it is possible to change things. So we don't have to get things completely perfect, and that example turned very myopic to one aspect of that update.

I think where the problem really comes in is if we have a detailed stylistic review to then see a PR get blocked for a larger issue. That example does start to digress from approval bar as once a PR is blocked it should not be simply blocked but a furtherance of a conversation of what can and should be done. I'd also point out that there is still no great consensus on the feature or a spec about how to do it right, now nearly 2 weeks later.

Coming back to this topic, the question I think is also how do we build consensus without necessarily devolving into design by committee. Perhaps when we are adding UI features to fix UX problems, we merge but continue the conversation and then look to do a final fix or removal before release? In this way we treat the pre-release more of as a beta and can have the conversation continue without code waiting. (The code waiting causes merge conflicts, excess communication, baby-sitting and watching to see if something will be merged so it can be followed-up on, additional updates may be in-flight waiting, and it adds yet another parallel project). Going this path, we can also have people demo the features and develop a better opinion rather than a gut reaction based on screenshot without gameplay.

For some time now I've been working on 3 o 4 different things at once so 2 or 3 PRs can be waiting while I continue to work on something else and avoid blocking. This is hard to sustain and is making it hard to complete any one complete path. We could start to simply request that we have larger PRs be submitted, I don't think that's a good experience or can lead to really valuable reviews (2000+ line PRs can't really be reviewed).

In summary, it does seem we should discuss how we can efficiently accept/approve UI updates, particularly when the community is somewhat divided or not able to give adequate feedback in a timely manner.

DanVanAtta on 29 Aug 2019

Also, I've been wondering if we should also think about the other side of the review, perhaps authors can do more to make code easier to review? I'm thinking of the moved method case. If the author comments the first place where a method is added or removed, and note that it's simply moved, then the reviewer can more easily skip over the contents of the method.

This topic is a bit deeper than I expected, summarizing open questions:

how to formalize stylistic comments and avoid them being blockers? Are we okay with commenting but not always requiring those updates to be made? The expectation I think is an author will make an attempt to address the comments, but would merge as soon as some improvement was made.
how to formalize community feedback on UI features. Open ended topics asked to a dozen people is not a good way to get consensus, but community feedback is valuable and should be gathered. How can we do this efficiently? Should we open these questions, press ahead in the pre-release and make final decisions before we go to release? Any other ideas?
Should we formalize more tasks for author of PRs to do to make the review easier? Perhaps we could add a "scope" section to PRs so authors can state explicitly what they were trying to do. We could also come up with some guidelines of comments authors should leave in the diffs to help guide reviewers. Perhaps instead of saying scope, comment on pieces of code like "this is just moved"

DanVanAtta on 29 Aug 2019

@DanVanAtta it might help if the person that creates the PR tries to make smaller commits for trivial but hard to spot changes.
If you make a single commit for "moved method a", "moved method b" each, that might be a little bit more work for the developer, but makes it super easy for the reviewer to verify the change is actually correct

RoiEXLab on 29 Aug 2019

Good suggestion @RoiEXLab
I find that can be difficult as I'll take a large set of changes often and then tease out anything independent. Often when approaching a refactor I will not know which steps are going to be successful and needed. In that case I'll have something like 30 commits and then realize I need to backup, apply more changes, and then sometimes realize that 2 or 3 files can be moved to their own branch. I don't know if people are generally always good about following a set of changes commit-by-commit, and I've heard the criticism from people that it's hard to do that and have a sense of the end direction.

Perhaps an all-the-above approach should be taken? Either the author structures a PR to be reviewable commit-by-commit, or they add inline comments to point out those things, thoughts?

DanVanAtta on 29 Aug 2019

Open question:

Should UX changes be blocker for PR or for release?
I'm okay with asking for UX changes in a PR, but we need to stay reasonable and avoid the "helping the project by stopping others" vs "helping the project by enabling others"

Seems like we are agreed:

PR author should note things like move method as comments or structure commits.
bar is lowered for stylistic changes or non-significant changes; specifically we approve if only such changes are requested and they can be updated and merged without further review. We're trusting PR author will make a good faith and honest effort to address those questions/concerns, and if not then respond with the dilemma they are facing.

DanVanAtta on 3 Sep 2019

I don't think its necessarily just UX changes as changes to the core game/rules probably fall into this category as well. Its more around if you are making a significant end user impacting change then it should probably have a feature request on the forum and some sort of agreement from a few core members (doesn't have to be just devs but I would include the forum admins as well).

The other thing I'll add is that often we forget that any significant end user change needs to be considered across all maps (this is particularly applicable to UX and core game/rules changes) and tested across at least a somewhat representative set.

ron-murhammer on 6 Sep 2019

👍1

I don't have anything more to add

RoiEXLab on 6 Sep 2019

Flags was interesting as there was no response on the feature post in the forums until it was up for PR. I think the responses were triggered by the presence of a PR. It was tested on a variety of maps and the bottom bar did look fine on a variety of maps except for the specific scenarios where the bottom bar already had problems. It was also interesting as that update was more about the unit highlight, flags was added for good measure to fix the UX anti-pattern of magic hotkeys with no corresponding UI control.

So, I think we're on the same page for posting on forums and getting feedback, but I'm concerned we can get bogged down pre-designing by committee all updates. Feedback is good, particularly for the sexy features, but for the more mundane items where nobody really cares beyond slight approval, I'm concerned we might bias for pet peeves. In other words, if nobody really cares, but there is a pet peeve, the conversation can be become loud rather than helpful.

For flags, I did not realize that feature was so disliked. My main goal was to fix the UX anti-pattern fo magic hotkeys and felt that was important for unit highlight, flags was dragged along for consistency.

To that extent, I'm curious how we can do a better job with iterative development, and enable demo of features to enable better feedback. Is it simply enough to ask reviewers to effectively not wield a veto and instead be specific about paths forward? If a reviewer does not think a feature is fully baked, I'd be fine with a reviewer reserving the right to request a feature be treated as a beta and request it be easily disabled if needed. In the case of 'flags' that would have simply meant not removing the view menu code and putting that behind a feature toggle instead.

I think it would be really healthy for us to get more in the practice of demo'ing features and let them bake in and gain feedback from experience, and enable iterative development. As such, I don't think we can realistically have every feature updates be perfect at time of PR, nor have fully vetted forum feedback.

So, if a PR does not meet a reviewers bar, is it okay for them to just wield a veto? Should we instead get into the habit of requesting specific changes, and failing that just request that a feature go behind a toggle so it can bake and be demo'd?

DanVanAtta on 7 Sep 2019

@ron-murhammer , @RoiEXLab ; what do you think about having the following two bullet points? If agreeable, I'll add them to the project process wiki.

(1) PR authors are responsible for achieving consensus on updates. It's recommended to post to forums about upcoming changes to help begin that conversation.
(2) PR reviewers should be explicit when a PR fails to hit the approval bar and try to ensure the last comment is a list of action items the author would need to satisfy for merge. Reviewers should, as best they can, work with PR author to recommend and come to compromise solutions.

Last, I'm left wondering a bit one final thing. For updates where the community is potentially divided, some things are going to be just subjective, how can we proceed on those items? Also I'm wondering about the items that would benefit from iterations and play testing. Do we try to develop and iterate on those items from a feature branch? Do we encourage the use of feature flags to hide the feature while it's being refined and updated? Is there another option to consider?

I don't think I like the idea of feature branches being used for demo's of beta features, not everyone can build installers, it's a bit painful of a process and encourages the long-lived branch development process anti-pattern. I'm thinking the response for this project for "not everyone is loving this feature" should perhaps be "put it behind a flag, gather more feedback, decide it's fate later, whether that be remove or improve"

DanVanAtta on 9 Sep 2019

Agree with both points.

If its clearly divided then either we take some kind of vote across core members (probably forum admin group) or I'll make a decision :)

For experimental features, I'd say have a flag if it can be cleanly done. I don't think feature branches will ever really get tested by enough people to be worth it.

ron-murhammer on 10 Sep 2019

😄1 👍1

Having a flag will work when a feature is planned and implemented, however most of my "features" or better "feature removals" came as part of a larger refactoring that tried unifying some code.
For my drawing code refactoring which resulted in a lot of deleted code, which added seamless zoom as a by-product, because the PR basically moved the scale control into a single point, a feature flag is simply not feasible.

So there are definitely some grey areas, but in general I agree

RoiEXLab on 10 Sep 2019

I can poke some problems with voting:

what if the deciding votes are AFK for extended period. EG: 5 people, and 2 of them are on vacation, one of them is author, then other two are weak no's
A vote implies no compromise, yes/no; shutting down an initiative is a very poor collaboration. I think we should be instead thinking of alternatives and ways that initiatives can move forward.
Weak opinions in a vote might not get expressed, I fear that loud voices would rule the day
Not all decisions can be understood to the best level. Maybe one choice seems obvious, but when it comes to implementation the effort is magnitude more difficult. It's like saying, sure you can do that, but make it 10x harder for 20% more gain.

I think it comes down to instead of having a mindset "is there a better way?" rather than "we need to say no"

I just don't think it's that feasible to get full information before everyone in an effective manner so that everyone is casting a vote with the full picture. That can even vary between maintainers depending on how deep of knowledge goes to a specific code base. It's for example if @RoiEXLab and I voted for a much more difficult AI change compared to one initially presented.

I really do think we need to try to focus on guiding decisions, as a group, rather than necessarily making decisions.

DanVanAtta on 10 Sep 2019

updated wiki with last two bullet points that PR authors are responsible for consensus and reviewers should try to help guide that consensus as much as possible. No actual disagreement here to vote, but I'd mention it is potentially more a consensus gathering tool sometimes.

DanVanAtta on 14 Sep 2019

Closing, this horse has been beat to death pretty thoroughly, no actionable items remain; Glad we had this discussion though; it's important we keep round-trips to a minimum and move as effectively and quickly as we reasonably can.

DanVanAtta on 14 Sep 2019

Was this page helpful?

0 / 5 - 0 ratings

Related issues

Roundel order inconsistent for diplomacy phase

General-Dru-Zod · 5Comments

Map/Download Versioning

Cernelius · 8Comments

Compatibility Testing: Document and Automate

ron-murhammer · 6Comments

Error while checking for engine updates

panther2 · 6Comments

Map Creator cant access C:

Khobai · 9Comments