Solidity: Optimized Compiler failing to optimize simple functions

Created on 24 Feb 2019  路  20Comments  路  Source: ethereum/solidity

Description

solc 0.5.4 with optimizer runs is failing to properly optimize simple functions such as those in SafeMath as compared to using assembly.

See https://github.com/OpenZeppelin/openzeppelin-solidity/pull/1635

Environment

  • Compiler version: 0.5.4 (Optimizer with at least 200 runs)
  • Framework/IDE (e.g. Truffle or Remix): both Remix (200 runs) and Truffle (10000 runs)

Most helpful comment

From https://github.com/OpenZeppelin/openzeppelin-solidity/pull/1678 (I didn't want to creat a new issue, since the topic is the same)

As shown by @wjmelements here, when compiling on both v0.4.24 and v0.5.5, Solidity seems to emit less efficient bytecode when a returned variable is declared inside a function, as opposed to the return value being named in the function declaration and assigned to):

// This one costs more gas to call
function add1(uint256 a, uint256 b) internal pure returns (uint256) {
    uint256 c = a + b;
    require(c >= a);

    return c;
}

// This one is cheaper to call
function add2(uint256 a, uint256 b) internal pure returns (uint256 c) {
    c = a + b;
    require(c >= a);
}

The difference is not very large (around 400 and 800 gas, depending on the compiler version, when calling the function in question 100 times, so a 1% gas difference for a transaction that only calls said function in a loop), but it may still be worthwhile looking into to find out where this discrepancy comes from.

All 20 comments

Could you be more specific? What is your expectation and what does the compiler do?

I'm not sure exactly what the compiler output is in either case, but

function div(uint256 a, uint256 b) internal pure returns (uint256) {
    require(b > 0);
    uint256 c = a / b;
    return c;
}

is using 38 more gas than:

function div(uint256 a, uint256 b) internal pure returns (uint256 r) {
    assembly {
        if iszero(b) { revert(0, 0) }
        r := div(a, b)
   }
}

There are more examples in the link given. Almost all SafeMath functions are not compiling in an optimal way

Unoptimised:

   div(uint256,uint256):    407
   div2(uint256,uint256):   343

Optimised:

   div(uint256,uint256):    328
   div2(uint256,uint256):   268

Where div2 is the hand written assembly and both are public functions in a contract.

The unoptimised output:

        /* "opt.sol":140:292  function div2(uint256 a, uint256 b) pure public returns (uint256 r) {... */
    tag_7:
        /* "opt.sol":197:206  uint256 r */
      0x00
        /* "opt.sol":243:244  b */
      dup2
        /* "opt.sol":236:245  iszero(b) */
      iszero
        /* "opt.sol":233:235  if */
      iszero
      tag_12
      jumpi
        /* "opt.sol":258:259  0 */
      0x00
        /* "opt.sol":255:256  0 */
      dup1
        /* "opt.sol":248:260  revert(0, 0) */
      revert
        /* "opt.sol":233:235  if */
    tag_12:
        /* "opt.sol":283:284  b */
      dup2
        /* "opt.sol":280:281  a */
      dup4
        /* "opt.sol":276:285  div(a, b) */
      div
        /* "opt.sol":271:285  r := div(a, b) */
      swap1
      pop
        /* "opt.sol":223:290  {... */
      swap3
      swap2
      pop
      pop
      jump  // out
        /* "opt.sol":13:138  function div(uint256 a, uint256 b) pure public returns (uint256) {... */
    tag_10:
        /* "opt.sol":69:76  uint256 */
      0x00
        /* "opt.sol":96:97  0 */
      dup1
        /* "opt.sol":92:93  b */
      dup3
        /* "opt.sol":92:97  b > 0 */
      gt
        /* "opt.sol":84:98  require(b > 0) */
      iszero
      iszero
      tag_14
      jumpi
      0x00
      dup1
      revert
    tag_14:
        /* "opt.sol":104:113  uint256 c */
      0x00
        /* "opt.sol":120:121  b */
      dup3
        /* "opt.sol":116:117  a */
      dup5
        /* "opt.sol":116:121  a / b */
      dup2
      iszero
      iszero
      tag_15
      jumpi
      invalid
    tag_15:
      div
        /* "opt.sol":104:121  uint256 c = a / b */
      swap1
      pop
        /* "opt.sol":134:135  c */
      dup1
        /* "opt.sol":127:135  return c */
      swap2
      pop
      pop
        /* "opt.sol":13:138  function div(uint256 a, uint256 b) pure public returns (uint256) {... */
      swap3
      swap2
      pop
      pop
      jump  // out

The main difference I see is the check for division-by-zero, which you are missing in the hand written version.

We could consider tracking value ranges of variables and due to the require statement the code generator could avoid adding the check again. We do this in the SMTChecker but that state is not used anywhere else.

Btw, what is the point of div in SafeMath? The compiler generates runtime assertion for division by zero since mid-2016.

The newest SafeMath prefers to generate the revert opcode instead of the invalid opcode. The previous version of div() in SafeMath did not do anything besides return a/b. I didn't realize that the invalid check was created by the compiler and not the EVM (which will output 0 in the case of a zero divisor)

Okay it makes sense that the hand-written version is less gas then. One possible compiler optimization that stands out is optimizing to a>0 be iszero(iszero(a)) instead of iszero(iszero(gt(a,0))) for any unsigned types.

But that's sort of a separate issue, feel free to close this one if it is no longer helpful for you

iszero(iszero(gt(a,0)))

Should be picked up by the optimiser and replaced with gt(a,0) according to the rule list.

Update: the optimiser picks it up correctly.

I think we can add the following new optimiser rules:

  • lt(x, 0) -> 0 -- since this cannot ever be true, slt can be used for two's complement <0 checks
  • gt(x, 0) -> iszero(iszero(x)) -- same here, there's sgt for signed comparison
  • gt(x, ~0)) -> 0

The middle rule makes sense because iszero(iszero(iszero(iszero(x)))) will be picked up and reduced into iszero(iszero(x)).

What are the gas costs of just using division in Solidity compared to the assembly?

Value range / condition tracking has been on the roadmap, and I think for the yul optimizer we should really implement it. It should also not be that difficult because we already track the values of variables.

Not sure if this is the case, but is the sequence

iszero
tag_#
jumpi

reducable to simply

tag_#
jumpi

since jumpi as I understand it already checks for a non-zero value

That is true, but I think our current framework can only optimize expressions and not statements. We could add this to the peephole optimizer, though.

We could add this to the peephole optimizer, though.

I think that sounds good.

From https://github.com/OpenZeppelin/openzeppelin-solidity/pull/1678 (I didn't want to creat a new issue, since the topic is the same)

As shown by @wjmelements here, when compiling on both v0.4.24 and v0.5.5, Solidity seems to emit less efficient bytecode when a returned variable is declared inside a function, as opposed to the return value being named in the function declaration and assigned to):

// This one costs more gas to call
function add1(uint256 a, uint256 b) internal pure returns (uint256) {
    uint256 c = a + b;
    require(c >= a);

    return c;
}

// This one is cheaper to call
function add2(uint256 a, uint256 b) internal pure returns (uint256 c) {
    c = a + b;
    require(c >= a);
}

The difference is not very large (around 400 and 800 gas, depending on the compiler version, when calling the function in question 100 times, so a 1% gas difference for a transaction that only calls said function in a loop), but it may still be worthwhile looking into to find out where this discrepancy comes from.

The difference seemse to be 5 gas for optimized code and 13 for unoptimized code (which I would not consider here).

> solc --gas --optimize /tmp/x.sol  
======= /tmp/x.sol:C1 =======
Gas estimation:
construction:
   87 + 32600 = 32687
external:
   add1(uint256,uint256):   277

======= /tmp/x.sol:C2 =======
Gas estimation:
construction:
   81 + 32000 = 32081
external:
   add2(uint256,uint256):   272
> solc --gas  /tmp/x.sol  
======= /tmp/x.sol:C1 =======
Gas estimation:
construction:
   87 + 38400 = 38487
external:
   add1(uint256,uint256):   352

======= /tmp/x.sol:C2 =======
Gas estimation:
construction:
   87 + 37400 = 37487
external:
   add2(uint256,uint256):   339

Since the return variable is an actual variable that can be referenced, it occupies a stack slot. The old optimizer (the new cannot yet be used here) is based on basic blocks and it only optimizes what happens to the stack from block start to block end - it will not modify the stack layout between blocks and thus it cannot remove this stack slot.

I'm closing this now, since everything discussed here that can be implemented with reasonable effort has been implemented. Value range tracking is on the roadmap for the yul optimizer: #6252

Ah, that makes sense then, considering how the optimizer works. Will the Yul optimizer be able to modify the stack layout across blocks?

Also, is there a rough estimate for when you'd expect solc to be able to compile to Yul to make use if its optimizer?

Thanks!

Yes, the yul optimizer was designed to allow cross-block and even cross-function optimizations. The traditional optimizer is a little better on optimizing stack layout, but I think we will get there also with the yul optimizer. I cannot really tell how solidity -> yul will progress, but we currently estimate end of the year for a usable prototype.

Here is the first PR on this road: https://github.com/ethereum/solidity/pull/2320

Was this page helpful?
0 / 5 - 0 ratings