Solidity: Allow `immutable` variables to be set in the constructor

Created on 5 Apr 2018  Â·  42Comments  Â·  Source: ethereum/solidity

This request extends #715, and #3356.

Motivation:

It's very common in Solidity contracts to include storage variables which are never intended to be mutated. An example of this is decimals in EIP20. (examples: consensys, Zeppelin)

The decimals value should never change, but no one declares it as constant because they want to be able to set the value in the constructor.

This is both less safe, and more expensive because the values need to be written to, and read from storage.

Proposal:

  • immutable values should be appended to the runtime bytecode.
  • The location of the values should be stored in bytecode at compile time
  • Where constant values are currently placed on the stack using PUSH_, immutable values can be accessed using CODECOPY.

This approach has additional safety beyond compiler errors, as it's not possible to modify bytecode in the ethereum protocol.

That would look something like this:

contract Example {

    uint public constant I_AM_CONSTANT; 
    uint public immutable I_AM_IMMUTABLE;

    function Example(uint _b) public {
        I_AM_CONSTANT = _b; // TypeError: Cannot assign to a constant variable.
        I_AM_IMMUTABLE = _b; 
    }

}

Implementation notes:

  • [x] parse additional state variable modifier "immutable" and add to AST — @chriseth -> #8444
  • [x] check that the type is a value type — @chriseth
  • [x] determine call graph: Which functions are only used in construction context and which functions are reachable from external functions? — @marenz #8475

    • properly handle virtual function calls

    • calls from base contructors

    • calls in member-initializations.

    • check that immutable variables are only assigned to exactly once (also not zero times) in construction context (also check the point of declaration)

    • check that immutable variables are not read from in functions called from the constructor

    • it might be necessary to re-check the above conditions for all contracts derived from a contract that defines immutable state variables because of virtual functions.

  • [x] since assignment might happen way before actual deployment, it might be a good idea to reserve a fixed spot in memory for each immutable variable, copy there upon assignment (this might also allow multi-assignment) and then either copy to each location in bytecode before returning or use codecopy routines.
  • [x] determine positions in bytecode where immutable variables are read - this probably requires introducing a new AssemblyItem type. For the yul code generator, this could be a new builtin function with literal argument.
  • [ ] codegen for yul
  • [x] debug output (also part of metadata)
  • [x] public immutables
  • [x] documentation
  • [ ] external function pointers
feature language design

Most helpful comment

This is fully implemented now.

All 42 comments

The immutable keyword fits because we also plan to use it for function parameters which cannot be changed inside the function (but of course the function can be called with different arguments).

Two problems that still need to be solved are:

  • what happens if an immutable value is read by (a function called by) the constructor?
  • what happens if an immutable value is written to multiple times in (a function called by) the constructor?

A solution for both might be to provide a fixed slot in memory for the variable which is then stored in code right before the code deploy step, i.e. an immutable variable behaves completely different at construction time than at runtime (which is to be expected).

I think that the second scenario should revert. Immutable vars should be initializable only once.

Also re-reading this made me think if we can extend the immutable definition to encompass public function parameters as well and these would not get copied into memory and would get read directly from CALLDATA.

EDIT: in the case the public function is called externally and the parameters live in the call data, of course.

@GNSPS public function arguments have to live in memory, otherwise it is not possible to call it internally. This is different for external functions. The compiler might suggest to make a function external if it is never called internally.

Oh and I don't think it should revert, it should just not compile if you assign an immutable variable twice.

Oh shoot! Of course! My brain stopped, "not compile" is what I meant, "revert"ing makes no sense in a compile-time mechanism. 😂

In my mind, this immutable parameter thing was for the different behavior of public functions when called internally and externally but I can see how this would be so much more confusing and not save that much gas. Could just be juxtaposition of external and public functions.

This feature would be super nice to have imo. Especially with the istanbul increase of SLOADS, I think this can result in significant optimization gains.
If the questions above are still open, here's my 2 cents:

  • Only allow assigning to the immutable value once.
  • Ideally, the semantics would be that if a function is called which reads the immutable value before the assignment, refuse to compile.

If that turns out to be too complex since the immutable could come from memory or the bytecode, then to not allow for functions which depend on the immutable variable in the constructor seems like a good enough start!

Discussion: We might want to implement this soonish, maybe with immutable for now, but we could also combine it with the constant keyword in the future.

I propose there is an error in the test case. Should be:

contract Example {

    uint public constant I_AM_CONSTANT; 
    uint public immutable I_AM_IMMUTABLE;

    function Example(uint _b) public { // WARNING: This function can be changed to "pure" mutability
        I_AM_CONSTANT = _b; // TypeError: Cannot assign to a constant variable.
        I_AM_IMMUTABLE = _b; 
    }

}

Warning added inline

Nowadays it should rather be constructor instead of function Example, shouldn't it?

Re-added "to discuss" because we need to agree about the problems related to reading the variable before its final value has been set.

Another thing to note: This should only work for value types or at most for types that are statically-sized. In that case, we can reserve a certain spot in the deployed bytecode, do not have to move code around and can keep the metadata cbor struct at the very end.

Another complication is the order of execution of constructors in the inheritance hierarchy. Because of that, I would also say that accessing a non-initialized immutable should result in a compiler error.

Here is a blunt way to get this to work in the next version, with further improvement possible later:

  • Accessing a property defined immutable as an r-value inside that contract's constructor shall be a compile error.

If r-value in that constructor is necessary then a workaround exists, the developer can use a temporary variable in the constructor.

From my understanding, this means it'll be possible to read from immutable variables in external pure functions. Is this correct?

At least my first instinct would be to consider reading from immutable variables a view operation, not a pure one...

pure. For sure. @nventuro, thank you updated that test case above.

Why would you say it's definitely pure? In what way is this "purer" than this (which is view)? In my mind pure basically means compile-time constant and that's not the case here...

As an example:

contract C {
  uint public immutable IMMUTABLE;
  uint public immutable IMMUTABLE_NOT_WRITTEN_IN_CONSTRUCTOR = 23;
  constructor(uint v) { IMMUTABLE = v; }
  function PURE() public pure returns(uint256) { return 42; }

  uint256[PURE()] x; // This could and should be valid at some point.
  uint256[IMMUTABLE] y;  // This will never work.
  uint256[IMMUTABLE_NOT_WRITTEN_IN_CONSTRUCTOR] z; // This again is fine.
}

So I'd argue that writing to an immutable variable in the constructor demotes reading it from pure to view...

I think we have an issue about this which had a more fine-grained categorization of "constantness", but I would also go for "view" in this case.

pure means it is compatible with STATICCALL. The immutable feature is compatible with STATICCALL so it should be pure.

Example implementation and usage:

Solidity code:

contract Token {
    string immutable SYMBOL;

    function constructor(string newSymbol) public {
        SYMBOL = newSymbol;
    }
    function getSymbol() public pure returns (string) {
        return SYMBOL;
    }
 }

Example init code:

stuff
put SYMBOL here
stuff
load the constructor parameter into SYMBOL
stuff

Example deployed code for a contract constructed with newSymbol = "Bob":

function selector jump table
entry point for function getSymbol
  return string literal "Bob"
stuff
stuff

Example usage of deployed contract from accessor contract:

  - To some Token contract
  - function selector = getSymbol

^^ Here the deployed contract is able to return the "Bob" literal without accessing storage. This means the getSymbol function is pure.

view already means compatible with STATICCALL (in fact we do call everything view with staticcall). And the immutable feature is definitely compatible with staticcall, so it definitely is at least view, there's no arguing that.

The problem is (as @chriseth has hinted at) that there are two interpretations of pure, one stricter than the other:

  • the function does not write/modify and not even read from storage or from any other blockchain state
  • the function is compile-time constant or alternatively: a pure function is solely a function of its input arguments, i.e. it will always result in the same values for the same inputs.

While the immutable feature arguably fits the first interpretation (depending on what exactly we consider "state"), it doesn't fit the second one. This either means that we need to define something beyond pure that fits the second bullet point, or that immutable variables (when written to in the constructor) are only view.

For comparison: reading the address of this doesn't access storage either - but it's still view not pure. So pure doesn't merely promise not to read from storage, but rather not to read any state and I'd consider the value assigned to immutable variables as part of "state".

On the other hand we - to my own surprise - currently allow the codecopy opcode in pure functions - but I'd actually consider that a bug.

In any case - I think we can agree that immutable variable are more than view - and we can have a separate discussion about the future of pure (i.e. as I hinted at, we have the long-term plan to allow pure expressions e.g. for array lengths of state variables, which would be prevented by considering immutable variables assigned to in constructors as pure) to decide whether it's sufficiently beyond view to call it pure :-).

I'd suggest to move further discussion about pure itself away from this issue towards https://github.com/ethereum/solidity/issues/8153 .

To simplify the implementation, I would say that immutable variables cannot be read in the construction context and they can only be written to once. This includes direct assignment at the point of definition (immutable address me = address(this)).

It might still be a bit tricky to determine these, so maybe we can even strengthen that to: immutable variables can only be written to in the constructor itself and they cannot be read from in any function that is called from the constructor.

Even with that, "called from the constructor" might be tricky, because of inheritance and overriding functions changing the call patterns. But anyway, in the worst case, for the first implementation, we can detect these cases at code creation time.

immutable variables cannot be read in the construction context

I agree, though I hope to see that functionality in the long-run. Until then it should be easy enough to substitute stack variables or calldata for that purpose.

Illustration:

contract Liquidator {
  ERC20 public immutable inputToken;
  ERC20 public immutable outputToken;
  UniswapV2 public immutable exchange;

  constructor(ERC20 input, ERC20 output, UniswapFactory exchangeFactory) {
    inputToken = input;
    outputToken = output;
    exchange = exchangeFactory.getExchange(inputToken, outputToken); // can be getExchange(input, output)
  }
}

Sorry I was way off in my explanation above. Thanks for the correction.

Same conclusion though, immutable must be pure. If an immutable causes an SLOAD (which just got a lot more expensive) then we are doing little of value here.

@chriseth / "called from the constructor"

Even with that, "called from the constructor" might be tricky...

I think this is easy — any use of the immutable RVALUE from functions is compiled to 0xfe in the construction context.

All LVALUE use must be in the constructor.

Constructor can only set LVALUE once.

Also, there should be no direct assignment of constants (i.e. in the contract context).

immutable address me = address(this)

The only useful case would be address(this) and other chain data. This is limited utility.

Strongest implementation NEW

  • Rename immutable to plastic. (A definition of plasticity is: formable until hardened.)

  • The plastic values can written to zero or more times from the constructor function.

  • The constructor reserves memory locations a...b for plastic variables. (These are defined at compile-time.)

  • Writing to a plastic LVALUE is always disabled for functions.

  • Writing to a plastic LVALUE is allowed for constructor and this is achieved by writing to those compile-time defined memory locations.

  • Reading from a plastic RVALUE in the construction context (constructor and other functions) uses those those compile-time defined memory locations.

  • When construction completes, the plastic values are inlined to every RVALUE site (as 0x60 PUSH1 commands...)

Alternatively, those plastics can be stored in code and accessed using 0x39 CODECOPY. I'm not sure how constant is currently implemented. But plastic and constant should be the some. And of course inlining is better runtime performance than CODECOPY for single-word variables.

@wjmelements @fulldecent sorry, to clarify: In my comment above I was talking about the first implementation, we will improve on that later. But for now, this is what we an do as a reasonable first step.

@fulldecent Why do you think address immutable me = address(this) should not be allowed in the contract context?

Directly copying to all locations in code where it is used sounds like a very good idea I did not think of that! For that, we would need all locations to use push32, though (maybe smaller for smaller types). Again, for the initial implementation, this is too complicated, though.

The only usable one that would be your example:

contract Liquidator {
    address payable plastic me = address(this);
    function getMe() external returns (address) {
        return me;
    }
}

But actually that is not very helpful because address(this) is already available.

In general, there are two other ways to use contract context:

contract Liquidator {
    uint payable plastic totalSupply = 1000;
}

Entirely useless because that could be constant.

And lastly the only other way is:

contract Liquidator {
    uint payable plastic number = block.number;
}

This last example has a risk that it is misunderstood. A programmer may think that the current block number will be returned.

The alternate way of having that code in the constructor is much more clear and explicit that it is only run once.

My 2 cents:

On naming, I'd avoid the keyword immutable, since it's often associated with immutable data structures (see immutable.js or data structures in functional languages). Since this is a concept that already exists in other languages, I'd suggest reusing the same keywords that already exist to denote it, such as readonly or final, following Java or C#.

But actually that is not very helpful because address(this) is already available.

Disagree. Being able to pin down address(this) in construction time is particularly useful for detecting when code is being run from a DELEGATECALL or not: if address(this) is different to the readonly me value, then it's a delegate call. I understand this is the same mechanism used by Solidity libraries under the hood, and it would be super useful for implementation contracts in the context of proxies.

Mind you it exists in D also as immutable...

Mind you it exists in D also as immutable...

You learn something new every day. I'm fine with immutable as well then!

@spalladino and I agree that being able to record address(this) immutably at construction time is useful.

My argument is only that the syntax to achieve this should be:

contract Liquidator {
    address payable plastic me;
    function constructor() public returns {
        me = address(this);
    }
}

rather than:

contract Liquidator {
    address payable plastic me = address(this);
}

because the latter format may confuse somebody if they see:

contract Liquidator {
    address payable plastic me = msg.sender;
}

Directly copying to all locations in code where it is used sounds like a very good idea I did not think of that! For that, we would need all locations to use push32, though (maybe smaller for smaller types). Again, for the initial implementation, this is too complicated, though.

I like this idea! Is there some simpler mechanism that would be viable for an initial release though? We're redesigning a core feature in a way that could make great use of immutable, but are currently blocked by the fact that using these values in the constructor is very cumbersome (due to the need to declare a new local variable).

It'd be great if we had the ability to read from immutable values in 0.6.3 (or whenever this is released), even if just in some limited capacity, like only in the constructor body and not on functions called from there, and only when using uint256 or bytes32 values. This would let people use immutable much more extensively, getting more value out of the work put on this feature.

As someone who is going to have to support this, I feel like I should jump in here to note that the put-it-directly-in-the-bytecode-where-it's-used approach is going to be extremely difficult to support for debugging/decoding purposes without some sort of additional debugging data information!

@haltman-at we can easily provide that.

@haltman-at we can easily provide that.

OK, thank you!

@haltman-at what is the additional info you need? A mapping from each immutable's AST id to its byte offsets in the code?

That would suffice, yes. Thank you!

Thought for the future: It would be nice to have something similar for constant variables as well. Of course, right now that's not possible because they don't work that way -- the whole code for them gets pasted into the bytecode; for this to work, they'd have to be actually evaluated at compile time so that just the result is pasted in. But that's something you've actually been planning for a while, isn't it? (Or at least to some extent? I guess it would be quite troublesome with reference types.) Well, if that ever happens, knowing the bytecode offsets where the value gets pasted in could be helpful.

If we want to allow immutable state variables for libraries, then we have to allow them to have a constructor.

This is fully implemented now.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

chriseth picture chriseth  Â·  3Comments

kkagill picture kkagill  Â·  4Comments

chriseth picture chriseth  Â·  3Comments

walter-weinmann picture walter-weinmann  Â·  4Comments

hiqua picture hiqua  Â·  4Comments