if statements in Jul are broken
pragma solidity ^0.5.0;
contract BugTestCase {
function Example ()
public pure returns (uint256, uint256)
{
uint256 branch_taken_asm = 0;
uint256 branch_taken_sol = 0;
uint256 value = 1234;
assembly {
if not(iszero(mod(value, 2))) {
branch_taken_asm := 1
}
}
if( (value % 2) != 0 ) {
branch_taken_sol = 1;
}
return (branch_taken_asm, branch_taken_sol);
}
}
Gets compiled to:
v---- setup variable 1234
094 PUSH1 00 <--- useless
096 PUSH2 04d2
099 SWAP1 <-- useless
100 POP <-- useless
101 PUSH1 02
103 DUP2
104 MOD <-- 1234 % 2
105 ISZERO <-- == 0
106 NOT <-- (1234 % 2) != 0
v--- compiler
107 ISZERO <--- WHY......... ..... ...
108 PUSH1 73 (115)
110 JUMPI
^--- compiler
111 PUSH1 01
113 SWAP3 <-- set `branch_taken_sol = 1`
114 POP
115 JUMPDEST
116 PUSH1 00
118 PUSH1 02
120 DUP3
121 DUP2
v--- compiler 'is argument zero check'
122 PUSH1 7e (126)
124 JUMPI
125 INVALID
126 JUMPDEST
^--- compiler 'is argument zero check'
v--- optimised version of '(a % b) != 0'
127 MOD
128 EQ
129 PUSH1 88 (136)
131 JUMPI
^--- optimised version of '(a % b) != 0'
132 PUSH1 01
134 SWAP2 <-- set `branch_taken_sol = 1`
135 POP
136 JUMPDEST
So, the compiler is inserting an additional ISZERO, which negates whether or not the branch should be executed for the Jul code:
assembly {
if not(iszero(mod(value, 2))) {
branch_taken_asm := 1
}
}
The code generator which inserts the extra ISZERO is: https://github.com/ethereum/solidity/blob/4f3b7b232b375dc4d8134eabc839c30b981adf8d/libyul/backends/evm/EVMCodeTransform.cpp#L414
I think - ok, I can just remove the not command, and the jump will be taken and (correctly) avoid the piece of code which sets branch_taken_asm := 1... right?
No, that's why it's a compiler bug, if I remove the not, so it's just iszero(mod(value, 2)), the emitted code for that branch is (again, compiled without optimisations):
094 PUSH1 00
096 PUSH2 04d2
099 SWAP1
100 POP
101 PUSH1 02
103 DUP2
104 MOD
105 PUSH1 70
107 JUMPI
It's now removing my ISZERO instruction, which is OK because a%b !=0 is equivalent to iszero(mod(value, 2)) when used as the condition argument to JUMPI, so the chunk of code isn't skipped.
The two ISZERO statements are removed by this peephole optimizer: https://github.com/ethereum/solidity/blob/30da62aa2c16fd4a8592fd0c01f17cfa715e333c/libevmasm/PeepholeOptimiser.cpp#L208
Lets try something which is a more literal translation:
assembly {
if not(eq(mod(value, 2), 0)) {
This gets compiled to:
094 PUSH1 00
096 PUSH2 04d2
099 SWAP1
100 POP
101 PUSH1 00
103 PUSH1 02
105 DUP3
106 MOD
107 EQ
108 NOT
109 ISZERO
110 PUSH1 75
112 JUMPI
The extra ISZERO opcode the compiler inserts inverts the condition, meaning the jump isn't taken when it should be. If you added another NOT before the ISZERO is inserted, or just removed the NOT opcode from my statement, then the compiled code would work as intended - when a%b == 0 the piece of code is skipped.
If I change the condition to:
assembly {
if gt(mod(value, 2), 0) {
Then the branch is correctly taken, subsequently skipping the piece of code which should only be executed if value is odd.
101 PUSH1 00
103 PUSH1 02
105 DUP3
106 MOD
107 GT
108 ISZERO
109 PUSH1 74
The compiler is still bugged though, it seems that always inserting ISZERO after a jump condition has unintended consequences at:
Given that we're compiling Jul code, I'd have thought that the arguments to my if statement would be passed directly as the conditional flag to JUMPI, after negating it so the inverse of whatever my statement is causes the jump to happen.
In this case NOT ISZERO or even just NOT can be used to replace ISZERO, which will fix the bugs in the code above.
With optimisations enabled, even more weird stuff happens, it removes conditions which can be analysed as being true at runtime (e.g. that 1234 is even) and emits code accordingly - however, all of this stuff is happening to the compiler before it's optimised, so the optimiser then goes ahead and does its job correctly - and gets the same result - but the wrong one because the code generator before it fubar'd it up
Regarding:
v--- compiler
107 ISZERO <--- WHY......... ..... ...
108 PUSH1 73 (115)
110 JUMPI
^--- compiler
That ISZERO checks whether the condition in the if is false and in that case jumps over the body of the if - that seems correct to me. Note that JUMPI jumps if the argument is != 0.
I think one of your actual problem is that you use not in the if condition, while you actually want to use iszero. Note that not is bitwise negation, whereas iszero is logical negation.
So looking at the following:
101 PUSH1 02
103 DUP2
104 MOD <-- 1234 % 2
105 ISZERO <-- == 0
106 NOT <-- (1234 % 2) != 0
The last comment is incorrect - this won't be (1234 % 2) != 0, but:
~0 (i.e. all bits set) in case (1234 % 2) != 0~1 (i.e. all bits except the least significant bit set) in case (1234 % 2) == 0In any case, on a first glance to me what the compiler does looks correct in all cases - it has to (and correctly does) invert the condition in the if to decide when to jump over the if body.
Do you agree with that given the fact that the jump is over the if body, not to the if body? If not, then let me know and I'll have another more detailed look.
Apologies for the long dump of text.
Yes, it is trying to jump over the loop, to avoid executing the piece of code.
Do you agree that the following statements are equivalent, when value=1234 ?
if not(iszero(mod(value, 2))) {if not(eq(mod(value, 2), 0)) {if gt(mod(value, 2), 0) {To this Solidity:
if( (value % 2) != 0 ) {The documentation for the JUMPI instruction is: jump to label if cond is nonzero
The result needs inverting to skip the body of the if statement, which the compiler appears to be trying to do, however the body of the if statement is still being executed in two out of three times.
If I construct the branch manually to avoid the piece of code, removing the not from the first two, and adding a not to the third (using gt) it works as expected.
NOT(NOT(0)) is 0NOT(0) is non-zeroISZERO(0) is 1NOT(ISZERO(0)) is zeroSee:
Whereas ISZERO uses the sign-bit to determine if it's non-zero, so -1 is zero, and 1 is not zero:
And JUMPI checks the sign-bit is non-zero:
The problem is the compiler.
ISZERO(NOT(0)) will be considered FALSE, because ISZERO checks if .Sign() > 0.
Whereas JUMPI checks that .Sign() != 0
Meaning code that works with plain EVM, is broken by the semantics of the compilers choice of ISZERO rather than NOT to invert the result of a conditional - which goes against what the spec says Jumps should be doing.
Using ISZERO, and subsequently verifying that .Sign() > 0 is different to every other programming language I can think of which treat -1 and 1 as both being True for a conditional.
which goes against what the spec says Jumps should be doing.
Which spec are you talking about? Geth?
The yellow paper states the following:
Conditionally alter the program counter.
J_JUMPI(μ) ≡ {
μ_s[0] if μ_s[1] != 0
μ_pc+ 1 otherwise
}
Thus, the compiler behavior looks correct to me as well.
I think the main misunderstanding is still the semantics of NOT and ISZERO (see the yellowpaper https://ethereum.github.io/yellowpaper/paper.pdf page 29 for reference).
I'm not sure where you get the idea that either ISZERO or JUMPI check something like .Sign() != 0. The semantics for both are clearly specified in the yellow paper as "jumping if the argument is not zero" for JUMPI (as @leonardoalt pointed out), and for ISZERO(x) as ISZERO(0) = 1 and ISZERO(x) = 0, if x != 0.
So logical negation is in fact ISZERO and not NOT.
On the other hand NOT is bitwise negation, i.e. it will flip every one of the 256 bits in its input. That's of course not the same as logical negation.
So NOT(ISZERO(0)) correctly is not 0. ISZERO(0) is 1 and NOT(1) is the bitwise negation of 1, i.e. 0xfffff....fffffe. What you probably meant was in fact ISZERO(ISZERO(0)) and not NOT(ISZERO(0)).
So in your examples, your usage of NOT assumes an incorrect semantics for NOT.
Given the specification in the yellowpaper, can we agree on that part :-)? If we are clear on this issue, then the rest should fall into place and it should become clear that e.g.
not(iszero(mod(value, 2)))
is rightfully not equivalent to
(value % 2) != 0
.
So the issue is rather that it might have been better to call the NOT opcode BITNOT instead to avoid this kind of confusion - but as far as I can see, the compiler behavior is still correct with respect to the specified semantics of the EVM.
major facepalm
I'm just looking at https://github.com/ethereum/go-ethereum/blob/7504dbd6eb3f62371f86b06b03ffd665690951f2/core/vm/instructions.go#L255 - that is in fact written in a quite confusing way and it really looks like it checks the sign bit :-). But since in go's math/big, Sign is specified like this here: https://golang.org/pkg/math/big/#Int.Sign and x is an unsigned integer, x.Sign() > 0 is in fact x != 0... and consequently, as expected, geth follows the same EVM semantics as we do ;-).
But yeah, I can understand that this is confusing and it took some getting used to ISZERO being logical negation for myself as well :-).
Most helpful comment
major facepalm