Triplea: Automatic code fixes

Created on 23 Feb 2018  Â·  20Comments  Â·  Source: triplea-game/triplea

It does not make sense for us to fix by hand what can be fixed by the machine. Getting deeper into this code base, the more automatic cleanups we do up-front, the more we profit and spend more time on deeper changes.

I'd like to grab a list here of things we would like to mass run via IDE. These updates we would agree on a time to do and we would apply and merge them in then. No hand updates, no review beyond perhaps a self-review that is just spot checking.

This issue is to discuss what fixes we'd rather have on the whole even if it does introduce smaller problems; and to be sure we're okay with these updates getting run and merged. The reason for no hand-updates is that we want to take advantage of machine CPU power to save us time, get these things fixed, and without causing or incurring merge conflicts.

Of important mention, anything we get fixed in mass we'll be able to more easily keep fixed. ie: we can re-run these analyzers, and fixing just a few examples is then pretty trivial and can be done anytime. As is, parens fix in mass alone will update 2000 locations : |

Okay, to get started with a working set of initial suggestions:

  1. parens fix; (example: d4231f42aae506a1288529069a5fe35becdaab27)
    add clarifying parens to compound expressions "Reports binary, conditional or instanceof expressions consisting of multiple terms with different operators without parentheses. Such expressions can be unclear because not every developer is intimately familiar with all the precedence rules of the different operators. This inspection has a quickfix which adds clarifying parentheses. "
  2. 951 : equality fix; when comparing enums or objects that use the Object.equals method, we can use == or != instead
  3. Pointless bitwise expressions: "Reports pointless bitwise expressions. Such expressions include anding with zero, oring by zero, and shifting by zero. Such expressions may be the result of automated refactorings not completely followed through to completion, and in any case are unlikely to be what the developer intended to do. "
  4. assignments in if statement replaced by conditional (example: ba15b74a59c69180ffb2d2c92a2aedc5cc750e21)
    Reports any if statements with then and else branches which are both assignment expressions or both return statements. The same semantics can be expressed more compactly, and arguably more clearly, with a conditional expression. Example: if (foo == null) { bar = null; } else { bar = foo.get(); } may be expressed as: bar = foo == null ? null : foo.get();
  5. if statement negated condition: "Reports if statements which contain else branches and whose conditions are negated. Flipping the order of the if and else branches will usually increase the clarity of such statements. "
  6. redundant else: "Reports redundant else keywords in if—else statements and statement chains. An else keyword is redundant when all previous if branches in the chain don't complete normally because they end with return, throw, break, or continue statement.
    In these cases the statements from the else branch can be placed after the if statement and the else keyword can be removed. "
  7. redundant if:
Reports if statements which can be simplified to single assignment, return or assert statements.
For example: 
    if (foo()) {
       return true;
    } else {
       return false;
    }
can be simplified to 
    return foo();
  1. return conditional expression
Reports conditional expressions which can be replaced by simpler but equivalent expressions. 
Example
→
Replacement
condition ? true : false

condition
condition ? false : true

!condition
value == null ? null : value

value
result != 0 ? result : 0

result
a == b ? a : b

b
  1. trivial usage of functional expression
This inspection reports method calls to methods of functional interfaces which are directly invoked on the definition of the lambda, method reference or anonymous class. Such method calls, including the functional interface implementation, can be replaced with the body of the functional interface implementation, like ((Runnable)() -> doSmth()).run() can be replaced with doSmth().
Discussion

All 20 comments

3120 is a good example of why we still need to be cautious with some of these types of automated IDE refactorings. Extremely large diffs lead to less scrutiny (although I'm not claiming we would have caught #3120 if the diff was smaller, but we might have been more likely to expand the context of the diff and may have spotted the problem).

Yeah, no doubt we would need to be cautious. Some of the more destructive updates we may want to not do because of that. On the other hand, it is not terribly difficult to roll out some changes if we do break something. As part of any mass update we would want to do some fast-follow smoke testing.

@ssoloff out of the list of the 9 proposed, they look pretty safe. Any specific concerns with the ones mentioned? Are there any other mass changes that should be added to the list?

@DanVanAtta I was just pointing out that simply because a change is automated by an IDE, it is not foolproof. When confronted with a 2,000-line diff, some refactorings are no-brainers for review because any errors will be caught by the compiler (e.g. rename variable). However, the refactorings that may change semantics only detectable at runtime need to be reviewed more carefully, in which case, smaller diffs are more easily digestible.

However, the refactorings that may change semantics only detectable at runtime need to be reviewed more carefully, in which case, smaller diffs are more easily digestible.

Yup, out of the proposed refactorings, do you see any falling into that bucket?

Most of these look good. 4 might be a little subjective and the expression should probably have parens around it.

out of the proposed refactorings, do you see any falling into that bucket?

@DanVanAtta Nope (but I'd have to see some samples of how the tool handles (3) to be sure about that one).

@ssoloff i crossed (3) off the list, there were no examples in project : )
I thought there were a dozen, I'm happy to be wrong on this. But then again, I should probably just go ahead and submit the smaller updates on their own; I'll do that and append here the larger ones.

@ron-murhammer I added examples to overview, but for specific review here is an example of (4): https://github.com/triplea-game/triplea/commit/ba15b74a59c69180ffb2d2c92a2aedc5cc750e21 Happily parens fix can be applied on top, so please picture the example as if it had parens. (parens fix dominates, but 4+1 would look like this: https://github.com/triplea-game/triplea/commit/d4231f42aae506a1288529069a5fe35becdaab27)

@DanVanAtta I'd argue this part of the commit is an example where I'd probably say its easier to read with if/else rather than conditional as the conditional seems to get lost in the large expression: https://github.com/triplea-game/triplea/commit/ba15b74a59c69180ffb2d2c92a2aedc5cc750e21#diff-226ebd837732bdb0e9651b2fd21f8988R541

Yeah, that statement really needs to be formatted as

actionButton = (action == MapAction.REMOVE)
    ? JButtonBuilder.builder()
        .title("Remove")
        .toolTip("Click this button to remove the maps selected above from your computer. " + MULTIPLE_SELECT_MSG)
        .actionListener(removeAction(gamesList, maps, listModel))
        .build()
    : JButtonBuilder.builder()
        .title((action == MapAction.INSTALL) ? "Install" : "Update")
        .toolTip("Click this button to download and install the maps selected above. " + MULTIPLE_SELECT_MSG)
        .actionListener(installAction(gamesList, maps, listModel))
        .build();

And if the IDE can't be configured to do it that way, it means a lot of manual changes. :disappointed:

@ssoloff 100% agree. I think there are a pretty good number like that as well. The single line ones are easy but once you get ones that are longer they need to be formatted well to be readable.

@ssoloff parens would be covered by the auto-parens adder: https://github.com/triplea-game/triplea/commit/d4231f42aae506a1288529069a5fe35becdaab27#diff-226ebd837732bdb0e9651b2fd21f8988L541

@ron-murhammer @ssoloff the ? not being on a new line would trigger a checkstyle violation. Auto-formatting fixes to roughly the right format, so we'd be forced to fix them all or could not merge otherwise. After running all of these a checkstyle would need to be run, the added parens for example can increase line lengths; these same problems would be flagged.

Given we'd be forced to format it all, do any concerns remain with if/else assignment to conditional?

I'll note too that once we have that update, in the example given, actionButton can become final, but not only that we can simplify further:

return (action == MapAction.REMOVE)
    ? JButtonBuilder.builder()
        .title("Remove")
        .toolTip("Click this button to remove the maps selected above from your computer. " + MULTIPLE_SELECT_MSG)
        .actionListener(removeAction(gamesList, maps, listModel))
        .build()
    : JButtonBuilder.builder()
        .title((action == MapAction.INSTALL) ? "Install" : "Update")
        .toolTip("Click this button to download and install the maps selected above. " + MULTIPLE_SELECT_MSG)
        .actionListener(installAction(gamesList, maps, listModel))
        .build();

I see a benefit here, once we finished with the automated IDE pass, the cleanup to simplify such cases would be just a couple lines to change, likely could be identified automatically.

@ron-murhammer any objection to current list if things are formatted well?


I'll see if i can get some ballpark numbers for how many changes are involved in each proposed update. I'd like to see about getting some more on the last, will assign to myself as well while working on that.

@DanVanAtta Nope. As long as things end up more readable then they were before and the formatting looks good then I'm good with the list.

ROI is off on this one. Kinda a shame since the actual work to do the code update is very minimal. Meanwhile I have been seeing tons of warnings about various thing that can be fixed...

@ron-murhammer can you suggest a way to proceed? For something like parens, do we schedule a time to do it? Do we find a time when the review queue is low (which does not necessarily mean if there are a lot of in-flight changes or not). Do we just live with the problem? I don't really believe the cost to fix is so high, but maybe it is.

@DanVanAtta Yeah, honestly I think as long as the PR queue is low and maybe giving folks a heads up is all that's needed to tackle some of these. I think trying to also do as many as possible quickly is the best approach to minimize disruption of other PRs.

How about a new PR label that the author can use to indicate "we should _really_ try to merge this PR before any others"?

Well the problem with these types of changes was both ways. So if you do a mass change and mark it as 'merge before others' that then conflicts with 10-20 other in progress branches that all need rebased that is kind of a pain. But if you don't merge the mass change PR then it needs rebased for any of those others that are merged before it. Mostly its trying to minimize conflicts and give folks a heads up if we are looking to do mass update PRs.

@ron-murhammer Agree that we shouldn't be holding things up for one PR (that may possibly require an extended review period).

I should have clarified my suggestion better based on what I thought was @DanVanAtta's immediate concern. That is, that the author is gently suggesting to reviewers that, all things being equal, they should attempt to review the labelled PR before they jump ahead to others. Regardless, such a label should probably be used sparingly. Personally, I haven't seen a problem with the order of reviews in the past year, even when they don't occur in FIFO order.

Lots of issues @ssoloff, sorry to have created any confusion.

FIFO was for predictable merge conflicts and reduce stagnation, the PR queue was often 20+ at the time. Without FIFO stagnant PRs were often getting hammered by conflicts. (Particularly in 2015 we had limited review capacity for some time, it's a true luxury and pleasure to have all 4 of us as reviewers these days)


At issue here is how do we execute on automatic, potentially large sweeping code updates? One thing we should note, any branch can get conflicted, it does not have be open for PR for that to happen. It's harder to forecast those conflicts those as you need to know what is on a local machine.

With automated updates they are generally easy to apply, so it's a matter of finding a good time. They are only a pain to re-apply if there are a lot of hand-corrections. In the case where a large change has hand corrections, it basically just itself becomes a large change and then is likely to incur conflicts.

I've a suggestion for a protocol on how I think we can solve this perhaps:

  • if someone wants to do a mass change, they submit a PR for it with 'MASS CHANGE' in title
  • we review it somewhat as normal, but any requested changes must first be done by hand as its own PR so that the MASS CHANGE is completely automated (those changes need to be done by hand anyways, it's just as well to do them and merge them to master first bore mass changes)
  • 72 hours after a MASS CHANGE PR is open is the window for the team to look at the updates and to notify in PR if they have any in-flight work in areas that are being updated. The mass change then should wait, ideally the person with changes in-progress will try to incorporate the mass changes in their files, or submit those for PR and rebase so they can control the conflicts better.
  • Then when the mass change has nobody with conflict objections, 72 hours have passed, and all would-be requested by hand changes have already been taken care of, then PR author would check out latest master, re-apply automated corrects, do a quick last re-review and then merge.
Was this page helpful?
0 / 5 - 0 ratings