Omr: MIN_INT / MIN_INT evaluates to -1 when TreeSimplifier is disabled

Created on 9 Aug 2017 · 11Comments · Source: eclipse/omr

Reproduction Steps

cd omr
mkdir build
cd build
cmake -GNinja -C../cmake/caches/Travis.cmake ..
ninja comptest
TR_Options=lastOptIndex=-1  ./fvtest/compilertriltest/comptest --gtest_filter=DivArithmeticTest/Int32Arithmetic.UsingConst/28

Error Output

Note: Google Test filter = DivArithmeticTest/Int32Arithmetic.UsingConst/28
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from DivArithmeticTest/Int32Arithmetic
[ RUN      ] DivArithmeticTest/Int32Arithmetic.UsingConst/28
../fvtest/compilertriltest/ArithmeticTest.cpp:38: Failure
      Expected: param.oracle(param.lhs, param.rhs)
      Which is: 1
To be equal to: entry_point()
      Which is: -1
[  FAILED  ] DivArithmeticTest/Int32Arithmetic.UsingConst/28, where GetParam() = ((-2147483648, -2147483648), ("idiv", 0x750a40)) (0 ms)
[----------] 1 test from DivArithmeticTest/Int32Arithmetic (0 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (0 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] DivArithmeticTest/Int32Arithmetic.UsingConst/28, where GetParam() = ((-2147483648, -2147483648), ("idiv", 0x750a40))

 1 FAILED TEST

Preliminary Investigation:

As expected, the tree

n1n       BBStart <block_2>       
n3n       ireturn                 
n4n         idiv                  
n5n           iconst 0x80000000   
n6n           iconst 0x80000000   
n2n       BBEnd </block_2>

Is returned from IlGen. However, this passes without lastOptIndex because the treeSimplifier sees this tree and does the correct substitution with iconst 1.

Without tree simplification however, it appears there's a sign-extending widening happening at some point that interacts poorly with another simplification.

compiler

Source

mgaudet

Most helpful comment

Historically, the Simplifier was not a pre-req to running the code generator. I know that it currently is (even for "no opt" compiles) and has been for some time but I suspect it was introduced as the easiest solution to work around some bug rather than be a baked-in solution. At one point the canonical forms expected by the x86 code generator were fairly limited, perhaps (but probably not) as simple as:

other than floating point expressions, constant operations must be folded prior to the code generator
for binary commutative operations, constants (if present) can only appear in the rightmost child

If the Simplifier wasn't enabled then an IL generator was expected to guarantee these properties. These were introduced for exactly the reasons already stated: to limit the search space of available tree combinations to the code generator. The problem, however, is that the rules were never really documented.

Each codegen has evolved differently over the years. Generally, they share a lot of DNA but each has taken on its own nuances for various good (and sometimes bad) reasons. For example, some may be able to tolerate unfolded constant operations in some evaluators, and just because x86 evaluators do not support unfolded constants doesn't mean unfolded constants should fail IL verification. Note that I can't say that for certain, but I do have my suspicions that some codegens (like Z) are able to handle more cases than the others do.

My feeling is that tree simplification is too broad of a brush to pre-req running the code generator without understanding all the reasons why. I'm not so worried about projects with mature opt strategies that exercise the optimizer as the Simplifier is usually run many times. I am thinking about projects who are concerned with compile-time and are essentially building a minimalist opt strategy that suits their needs. Tree simplification may do far more than they need to get the trees in the right shape expected by the code generators at a greater cost than they want to tolerate.

I am in favour of IL validation ensuring that trees entering the code generator are in a format they expect. However, I think we need to understand what those expectations are first (rather than "just what the Simplifier does"), otherwise the rules may be different per platform. This will be useful to writers of new codegens.

If we understand the minimal IL properties expected by a codegen then it will be easier to write a canonicalization pass to potentially run before codegen.

0xdaryl on 11 Aug 2017

❤5

All 11 comments

lastOptIndex1.txt

mgaudet on 9 Aug 2017

Interestingly, in discussion with @vijaysun-omr , he mentioned that the evaluators were never designed originally to be used in isolation. They were designed originally to be used in conjunction with tree simplification so that the evaluators would not have to deal with all possible inputs. You could call simplification a "mandatory optimization", if you wanted to borrow another project's term for this approach.

Given that, perhaps the best "degenerate" test would be simplification feeding code generation. Pragmatically, lastOptIndex=2 would do the trick with only dead trees elimination added to the mix.

mstoodle on 9 Aug 2017

I definitely understand the preference to canonicalization, however I do have a couple follow-on comments / concerns:

I am guessing here that we don't really have a strong handle on what, precisely, is the required level of canonicalization that must occur, other than "Whatever Tree Simplification Does".
From a servicability perspective, doesn't this requirement really affect problem determination? If the code generator is free to produce wrong code if you skip a Tree Simplification pass... that seems awkward.
If we need tree simplification to feed the code-generators, shouldn't it be a mandatory pass appended to the end of an optimization strategy, or perhaps a legal optimization strategy must defined as include tree simplification after all opts that may generate trees? I'm imagining a late lowering pass that under certain circumstances may generate something as simple as the above idiv.
From the perspective of testing, it would be much better to have a verifier that asserts the 'canonicalized' property vs. making tree simplification mandatory. I'd rather the onus of creating canonicalized IL land on a test case writer (so long as they have a verifier to ensure they get it right) than allow tree simplification to radically alter the trees under-test.
Do we want to close this issue? Or is the fact that every other input works suggest that there's a bug underlying here, outside of the tree simplification root?

mgaudet on 9 Aug 2017

So I will +1 Mark's comments - the code generators rely on tree simplification to reduce the complexity of the state space they need to handle and so running evaluators in isolation is not really a 'supported' configuration. In respect to the last comment from @mgaudet :
1) The authoritative definition is what tree simplification does, but that does mean, in general, constant folding, commoning, ordering children etc. A more precise definition would be nice, but this is the current state of the world.
2) Tree simplification passes are run very frequently during optimization - most opts just anchor children and mutate nodes and leave it to TreeSimplification to clean things back up - dealing with complex tree manipulation without this approach adds unnecessary complexity to opts. The first pass of simplification is the main one that gets things into the 'normal' form from the ilgen. Most opts don't destroy the normal form, they just may not complete all available commonings or unanchor children as efficiently as they might otherwise.
3) You don't need to run a trailing tree simplification pass if the opts being run do not destroy the normal form. Forcing a run of the opt would burn compile-time for no real gain. Further, the tree simplifcation pass does different things when run with marklastrun (its last pass). Some of the things done in this configuration are controlled by codegen hooks - do you prefer muls or shifts etc. since different codegens have evolved different preferences. Not an ideal design, but that is the current state of the world.
4) Late lowering passes are expected to preserve the normal form for the most part so the idiv would violate that so the late lowering pass would need to avoid generating it.

If someone wants to have a look at why the case fails I'm fine with that, but if it comes down to two consts then it is fixable (provided it isn't too far reaching etc in terms of what else has to change). A lower priority issue IMO.

andrewcraik on 9 Aug 2017

So, going forward, I see one of the goals of Tril to allow us to reason about the correctness of smaller components of Testarossa than we are currently able to (Hopefully @Leonardo2718 agrees with me on this one!).

To me, I think it's important that we are able to verify that a code generator is able to handle all the valid trees you can throw at it. So, it seems to me that a key to being able to test a code-generator independent of the optimizer will be the ability to verify the canonicalization properties. Ignoring Tril, that feels like a useful parallel to the IlValidator (and, if we were able to enforce canonicalization, we could possibly even simplify some of our current evaluators that do 'too good a job' today).

With this example, I'm curious about what canonicalization property we'd want to check to invalidate this one on: Just that the idiv evaluator needn't handle two constant operands?

(I'm up for starting work on a canonicalization validator as an interesting project)

mgaudet on 9 Aug 2017

Dodging the rest of the conversation for a bit to let others voice opinions...

Isn't the test case that failed actually testing MIN_INT / MIN_INT rather than -1 / -1 ? Or is that just a different example that also failed?

mstoodle on 9 Aug 2017

Hahaha. No, you're totally right. Changing title!

mgaudet on 9 Aug 2017

So, going forward, I see one of the goals of Tril to allow us to reason about the correctness of smaller components of Testarossa than we are currently able to.

I think I agree with the general idea here. I definitely think there's a need to be able to reason about and test smaller components (units) of Testarossa, but Tril is only half of the answer. There also needs to be a way of somewhat isolating these units. I won't go into details here as I think it diverges from the main issue.

For the purpose of this discussion, I would consider it major progress if we can formulate a good answer to @mgaudet's question (maybe @andrewcraik, @vijaysun-omr, and @mstoodle can help):

With this example, I'm curious about what canonicalization property we'd want to check to invalidate this one on: Just that the idiv evaluator needn't handle two constant operands?

We can use this to drive later discussions about validation and testing :slightly_smiling_face:.

Leonardo2718 on 10 Aug 2017

other than floating point expressions, constant operations must be folded prior to the code generator
for binary commutative operations, constants (if present) can only appear in the rightmost child

If we understand the minimal IL properties expected by a codegen then it will be easier to write a canonicalization pass to potentially run before codegen.

0xdaryl on 11 Aug 2017

❤5

@andrewcraik / @vijaysun-omr Do you have any suggestions on a first pass for what minimal properties Codegens would be allowed to assume, or perhaps more concretely, in the spirit of the issue discussed here: Should we start by saying that idiv should never be expected to handle two const args?

mgaudet on 8 Sep 2017

My opinion is to start with :

1) other than floating point expressions, constant operations must be folded prior to the code generator
2) for binary commutative operations, constants (if present) can only appear in the rightmost child
3) no branches to the next block in tree order, e.g. an if that branches and falls through to the next block

No other cases come to mind at least from the viewpoint of trivial sub-optimal trees that we historically wanted to avoid handling in a bunch of evaluators on all our platforms.