Vyper: VIP: Cost and usability Improvements to internal function calls

Created on 16 Jun 2018  路  52Comments  路  Source: vyperlang/vyper

Preamble

VIP: <to be assigned>
Title: reduce the cost and difficulty of code reuse in common situations
Author: maurelian
Status: Draft
Created: 2018-06-16
Requires (*optional): <VIP number(s)>
Replaces (*optional): <VIP number(s)>

Simple Summary

  1. I shouldn't need to copy and paste the same code snippets into 5 functions to valid authorization.
  2. I shouldn't feel like copying and pasting code is more efficient or cheaper.

Abstract

In order to call a within the same contract function, Vyper uses the CALL opcode, which send a new message to call functions within the same contract. This has some nice safety benefits.

Unfortunately it will result in developers using anti-patterns to save on gas, or access msg.sender in a function call.

Motivation

The benefit of using CALL to access code in the same contract is to create a new execution context, with no risk of side effects from memory access. I admit that I find this to be quite an elegant use of the EVM for safety.

Problems with using CALL

Unfortunately, there are two major issues with the use of the CALL opcode for calling functions within the contract:

1. Environment values change unexpectedly

At least two important environment opcodes will return different values: CALLER (ie. msg.sender) and CALLVALUE (ie. msg.value). This complicates the use of functions for permission checking. The following is a naive vyper translation of the common "ownable" pattern for a simple storage solidity contract.

owner: address
value: public(int128)

@public
def __init__():
    self.owner = msg.sender

@private
def checkOwner():
  assert msg.sender == self.owner # this will REVERT every time

@public
def writeValue(_value: int128):
    self.checkOwner()
    self.value = _value

2. Gas costs disincentivizes code reuse

The CALL opcode costs 700 gas (+ input data costs + function dispatching again). By contrast Solidity uses JUMP (2 gas) to call a function.

This imposes a large penalty on code reuse. Regardless Vyper's design philosophy favoring safety over gas efficiency, developers will respond to this incentive by simply copying and pasting boilerplate code. This is dangerous, and difficult to audit.

Specification:

I'm sorry, I don't have one specification. Below are some options I see for mitigating these issues.

From the Motivation section above, we have 3 parameters along which to analyze approaches to calling a function in the EVM.

  1. Gas cost
  2. Preservation of env variables
  3. Safety benefits

1. Use DELEGATECALL instead of CALL

  1. Gas cost: NEUTRAL (same as using CALL)
  2. Env variables: GOOD (gives us back CALLER and CALLVALUE)
  3. Memory Safety: GOOD. It does indeed reset memory.

Seems like a decent solution!

~3. Memory Safety: BAD (executes the code in the same memory context, nullifying reasons for using CALL in the first place)~ ~I proposed this earlier, but I don't think this is a good solution. ~

2. Just use JUMP like Solidity

  1. Gas cost: GOOD (Much lower gas cost)
  2. Env variables: GOOD (Preserves CALLER and CALLVALUE)
  3. Memory Safety: BAD (Greater risk of compiler bugs)

This is worth considering.
See also #330.

3. Implement a safer analog to Solidity's modifiers

These could have significant restrictions on them, like no access to MSTORE or SSTORE, and not accepting arguments. I assume this would be done by with JUMP, or just inlining the same code each time it's needed.

  1. Gas cost: GOOD (Lower execution cost, but might increase size of contract if inlining is used)
  2. Env variables: GOOD (Preserves CALLER and CALLVALUE)
  3. Memory Safety: Depends on implementation

4. Create special variables which persist across message calls

Built-in vars like caller and callvalue can be created, which are magically passed to a function in a message call.

My code above would thus be:

@private
def checkOwner():
  assert _sender == self.owner # <- replaced msg.sender with a magical global variable `sender`

@public
def writeValue(_value: int128):
    self.checkOwner)
    self.value = _value

This feels a bit funky to me. Probably would be hard to implement safely, and obscures the true functioning of the EVM.

5. Status Quo

My problem in the code sample above can be addressed by passing the values I need as arguments:

@private
def checkOwner(_sender: address):
  assert _sender == self.owner # <- replaced msg.sender with _sender

@public
def writeValue(_value: int128):
    self.checkOwner(msg.sender) # <- pass msg.sender as an argument 
    self.value = _value

Maybe that's OK? But it requires a deeper understanding of how both the EVM and Vyper work.

Backwards Compatibility

Dependent on approach.

Copyright

Copyright and related rights waived via CC0

Approved Discussion

Most helpful comment

Also note, I think we should split this issue into two changes/ideas.

1.) Using jump table mechanics to improve gas on internal function calls.
2.) Getting msg.sender to equal the calling contract value.

All 52 comments

I do find myself avoiding internal calls in favor of reusing the same code in multiple places to save gas, which is probably not ideal. I don't personally know the tradeoffs well but If this can be done safely I'm all for it.

I get your point but I don't fully agree with that. Unlike other software, smart contracts' code must be explicit and easily readable. In most other cases, it's more important to not repeat yourself in the code but in smart contracts, I think readability of the function should be more important.
In the example about you are trying to mitigate the behavior of modifiers which Vyper intentionally excluded.

Modifiers - e.g. in Solidity you can do function foo() mod1 { ... }, where mod1 can be defined
elsewhere in the code to include a check that is done before execution, a check that is done after
execution, some state changes, or possibly other things. Vyper does not have this, because it makes > it too easy to write misleading code. mod1 just looks too innocuous for something that could add
arbitrary pre-conditions, post-conditions or state changes. Also, it encourages people to write code > where the execution jumps around the file, harming auditability. The usual use case for a modifier is > something that performs a single check before execution of a program; our recommendation is to
simply inline these checks as asserts.
This is the reason Vyper don't have modifiers like Solidity in the first place.

I am in favor of your fourth proposal of creating the caller and callvalue.

I would like to hear more opinions on that as code repetition is an insecure thing as well so I'm not completely sure what's more important yet, but I believe there's a room for improvement.

Idk if I agree that internal function calls have the same readability issues as Solidity modifiers. Also code repetition can feel very verbose and reduce readability. If you need to do a square root in five different functions, its probably bad to have the sqrt() implemented five times. But people will be tempted to do so if they save 1000 gas per function call. Discouragement through high gas cost makes sense on a protocol level (ie making sstore exepensive) but feels wrong for smart contract languages.

To be clear I don't think

  1. Environment values change unexpectedly
    At least two important environment opcodes will return different values: CALLER (ie. msg.sender) and CALLVALUE (ie. msg.value).

is a big problem. I do however think

  1. Gas costs disincentivizes code reuse
    The CALL opcode costs 700 gas (+ input data costs + function dispatching again). By contrast Solidity uses JUMP (2 gas) to call a function.

is bad. If using JUMP for internal calls can be done as safely as CALL, it should be used. I don't know the security tradeoffs at all, hopefully someone who does can chime in here!

It's not only about readability, but maintainability as well. More mistakes will be made if 5 pieces of code have to change.

Take a look at these @private utilities I wrote for my ERC721 translation:

https://github.com/maurelian/ethereum-erc721/blob/vyperTranslation/contracts/tokens/NFToken.v.py#L102-L120

There are a total of 12 lines of code in the bodies, which can be written once, then reused 3 times in the standard. If they were written 3 times over, it would be very hard (for my brain at least) to read carefully and catch any differences between them.

6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function. This would limit functions that make external calls from having advanced memory access while allowing safe internal calls to reduced their gas count. This will also incentivize isolation/reduction of external calls to improve gas costs.

6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function.

This optimization means that source code using msg.sender would behave differently based on a decision by the compiler. It might work if the developer could specify a flag (@memsafe, @pure ?), indicating they would like to use JUMP and the compiler makes a check to ensure it's safe.

To be clear I don't think "Environment values change unexpectedly" is a big problem.

@haydenadams No? IMO, the current behaviour of this code is highly non-intuitive, and will make a lot of common operations much more difficult.

```python
@public
def getCaller() -> address:
return msg.sender

@public
def alsoGetCaller() -> address:
return self.getCaller() # always returns the contract's own address!
```

@maurelian

IMO, the current behaviour of this code is highly non-intuitive, and will make a lot of common operations much more difficult.

Whatever address calls a function is msg.sender within the scope of that function. That is the only rule right now. Its not a massive leap to realize that extends to self for internal calls. When a public function is called I don't think it should react differently if its being called by itself, another smart contract, or an external account.

"on the internet nobody knows you're a dog" -> "on Ethereum nobody knows you're 潭a潭 潭s潭m潭a潭r潭t潭 潭c潭o潭n潭t潭r潭a潭c潭t潭 yourself"

With account abstraction the divide between contracts and accounts will go from negligible to nonexistent. Contracts will get more complicated but also more versatile. Its pretty important that smart contract devs understand Ethereum account structure. I would prioritize simplicity of logic over what feels immediately intuitive here. Totally understand where you're coming from though!

This seems reasonable, and personally I like it:

@private
def checkOwner(_sender: address):
  assert _sender == self.owner # <- replaced msg.sender with _sender

@public
def writeValue(_value: int128):
    self.checkOwner(msg.sender) # <- pass msg.sender as an argument 
    self.value = _value

We briefly discussed this in the bi-weekly meeting yesterday, but left it as unresolved. As it's quite a significant change, we have to study all the options carefully ;)

As it's quite a significant change, we have to study all the options carefully ;)

@jacqueswww I certainly agree.

@haydenadams it's straightforward to make small changes to my little example to make it work, but it's always going to be confusing.

Here's an other summation:

IF: a contract can call its own @public functions;
THEN: msg.sender is pretty much useless as a builtin variable, because it means something entirely different when called externally vs. internally. It's only useful when passed as an argument.

If the Vyper team decides to move towards the external and internal visibilities as in solidity, this would be fine. (added to OP)

Finally, from the Auditability principle:

Simplicity for the reader is more important than simplicity for the writer, and simplicity for readers with low prior experience with Vyper (and low prior experience with programming in general) is particularly important.

Outside of project contributors, requiring this level of nuance is far from simple for anyone reading code.

I agree that it's confusing that msg.sender changes, and will be a major annoyance going forward.
Short/little fixes I will investigate:
1.) disable accessing msg.sender or msg.value in private functions.
2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.

2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.

Actually, I think you are right! I tested with this code in remix:

library Lib {
    function foo(uint a) returns(uint) { }
}

contract B {
    using Lib for uint; 
    function bar() returns(uint){
        uint a = 1;
        return a.foo();
    }
}

If you call B.bar(), and look at the debugger, after the DELEGATECALL instruction memory is reset.
My bad. I'll update the VIP.

@jacqueswww

1.) disable accessing msg.sender or msg.value in private functions.

I like this. It gets rid of most of the confusion, and they're both useless in private functions anyway.

@haydenadams I think "useless" might be an oversimplification, but it forces a specific design behavior that ultimately is very practical in reducing your attack surface. Basically, private functions should do very specific internal behaviors, preferably dealing only with internal state.

1.) disable accessing msg.sender or msg.value in private functions.
2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.

To be clear, I'm in favor of either but not both. Preference for #2.

I am also leaning towards delegate call.

But definitely feel we might need a more gas efficient dispatcher+memory stack post beta.

  1. Would be a user expectation breaking change
    So is 2.

More gas efficient dispatcher can be done without breaking expectations if we make the right move now.

Which one makes it easier to implement gas optimization on the dispatcher?

If switching to delegatecall mean msg.sender in internal calls to public functions will no longer return the contracts own address I'm strongly against it.

To better understand and make a decision here, it will be helpful if the VIP will include discussion of which stage of the compilation process we are changing. Also a discussion of use cases at that stage of compilation would help me.

Also, if you can give a brief overview of the compilation strategy for newcomers this would help them join the discussion.

@fulldecent and everyone: Looking again, I think it makes sense to reduce the scope of this VIP. A better title might be "Cost and usability Improvements to internal function calls". 馃憤/ 馃憥?

Code reuse in general, (ie. enabling imports and something like inheritance/composability) is probably another much needed VIP, or perhaps a few.

Or "New feature: call a function without using EVM CALL"

I think this is modification to existing behavior, not a new feature proposal.

We already have a composibility VIP I believe, as well as one for importing interfaces

In the mid to longer term, I can imagine an EIP to significantly reducing the cost of a contract to CALL or DELEGATECALL its own address might be well received.

I would vote against such an EIP. Call-to-self has extremely limited utility.

I believe the issue is that even calls to @private functions use CALL, which is very gas heavy compared to Solidity using JUMP. I believe this is for safety reasons but unfortunately discourages the use of @private functions.

@fubuloubu's response earlier in the thread:

6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function. This would limit functions that make external calls from having advanced memory access while allowing safe internal calls to reduced their gas count. This will also incentivize isolation/reduction of external calls to improve gas costs.

seems like something worth pursuing.

Copying from Gitter.

Will's code reuse proposal. This is a FOR NOW proposal, it is implementable and it solves nearly all problems for today. I even think it is good enough to get Vyper to production version 1.0. Further improvements are possible but I think they can safely be out of scope of 1.0:

WORK PLAN:

  1. Two types of files: interfaces (one interface per file) and contracts (one contract per file).
  2. You can define an interface like ERC721Metadata in an interface file and also in a contract file.
  3. The interface file is one-to-one fully compatible with the Ethereum ABI.
  4. Write a translator from Solidity to Vyper interfaces. It will be easy.
  5. Make a library of all the final ERC interfaces, even if you have to manually make them.
  6. Import interfaces from the interface files to contracts.
  7. An interface is a type which decorates an address.
  8. Interfaces can inherit other interfaces.
  9. Carefully study interface IDs from ERC-165 and reproduce the examples given in ERC-721 this relates to how interfaces inherit other interfaces.
  10. Interfaces may have optional functions. (A break from solidity.)
  11. Contracts can implement interfaces.
  12. A contract that implements an interface but does not implement a required function is an ERROR.
  13. A contract that implements an interface but does not implement an optional function is neither an ERROR nor a WARNING.
  14. The current behavior of using EVM CALL to call a contracts' own functions changes to to a new syntax. People will rarely want to use this.
  15. Rename @public to @external to match Solidity.
  16. Introduce a new function decorator @internal which allows a function to be called internally.
  17. Reintroduce the function call syntax currently used for external calls (removed in step 14) but have it apply to internal calls.
  18. Implement external calls like this: External jump call table -> LOADCALLDATA unpack -> add function parameters to stack -> call internal function -> do function stuff.
  19. Implement internal calls like this: add function parameters to stack -> call internal function -> do the function stuff.
  20. Remember that all of these steps are INCREASING SECURITY because: people won't use Vyper otherwise, code reuse is more secure, code is easier to audit; THEREFORE you should ignore the pleas of the formal verification people to slow down... it is THEIR job to keep up with our pace :-)

OUT OF SCOPE

  • Contracts inheriting from other contracts
  • Interface files that have built-it default implementations

^^ This is a "manifesto" if you have followed along similar initiatives in Swift evolution.

@fulldecent mostly on board

Solidity still has @public functions which can be called internally or externally.

True. I am recommend a more explicit @external + @internal to be the Vypery equivalent of Solidity's @public.

Call-to-self has extremely limited utility.

Vyper is currently using it for all self function calls. It improves memory safety. Given the value of security, it's worth consideration should Vyper continue making use of it and gain adoption.

Vyper is using it now only because no suitable alternative is available. Status quo of an experimental project does not a motivation make.

I think when we were talking about default args, the solution is for a jump table instead of a call to self, so I think this call-to-self thing has to get changed at some point, it's too limiting.

But, I stand with @maurelian in saying that any such change should ensure our security properties are maintained. Something like an out of bounds jump or privileged external call should be provably not possible.

This is an internal change, so we have more leeway to get things done, but it will affect user expectations moderately.

Please see comment https://github.com/ethereum/vyper/issues/952#issuecomment-407361410.

I started down the path of using delegatecall - as it does maintain msg.sender however this isn't an option because we can not enforce the private nature of the call. As mentioned in the comment, passing through the original address is quite doable - and we can the keep using CALL.

Also note, I think we should split this issue into two changes/ideas.

1.) Using jump table mechanics to improve gas on internal function calls.
2.) Getting msg.sender to equal the calling contract value.

@jacqueswww

In my opinion 1) should be high priority since I have smart contracts in the works that I hope to deploy to mainnet where gas cost is important and this change would save a few thousand :)

1) is hard, but if we do 2) now then we can drop it in in a minor release without breaking user expectations, which is a bigger problem IMO

2) could also be done by adding a new variable msg.caller which is always equal to the calling contract value, and leaving msg.sender as is.

msg.caller could be best practice for most uses, but msg.sender would be available if its needed to determine calling context. Also this would allow msg.sender to function the same between Solidity and Vyper which I think would eliminate a lot of confusion for devs migrating over from Solidity.

Will's Plan 2 -- The Expensive Reusability Manifesto

This plan is extremely expensive in gas. It is useful only for academic purposes. However it allows code reusability and very easy formal auditing.

  1. At every function call do SSTORE(ffff, msg.sender) if it is not already set, and unset before the function ends
  2. Create a read-only variable to access this information
  3. Continue to use external message calls

To be clear, this is joke plan. It illustrates why we should proceed with Will's Plan 1.

@fulldecent there is no need to do it like the above. One can just use the first 20 bytes of the internal calldata to send through the msg.caller, this would be a very minimal cost.

Also we can make it even more effecient, by only passing the data through if msg.caller is actually referenced in the function.

I think this whole discussion also highlights the fact that we should start doing some example gas cost comparisons to solidity.
I know for instance that vyper might seem very ineffecient, and this a good place we can improve. But some things are not as bad as they seem - good example our arithmetic has overflow checking built in, and solidity uses internal calls to do the same to SafeMath.

@ben-kaufman @fubuloubu @haydenadams @haydenadams What example contracts would you recommend? ERC20 & ERC721 seem like good options?

If you can provide me with an all-best-practices-followed example Vyper project repository (including test cases) then I can get my team to implement 721 for this experiment!

Starting with a proof of concept of using a jump table, just so you guys know if you don't see me. :tongue:

"Where'd Jacques go?"

"Oh yeah, he's jumping tables now"

@jacqueswww What was the decision here? You mentioned msg.caller in one of your replies. Was that a typo or are you doing the msg.sender and msg.caller plan?

@haydenadams currently I am only working on 1.) Using jumps to improve gas efficiency of internal calls. I decided this was a worthy task, as unfortunately smart contract development is very gas-efficiency dependant at this stage, at the benefits would be quite significant.

{tl;dr: 2.) has not been fully decided yet.} With regards to msg.sender if I just implement the jumps, msg.sender will match solidity in it's behaviour (of internal functions ) ie. msg.sender will be the top public function's calling account address. However if there is consensus that we should use msg.sender == the contract address and msg.caller == original caller (Basically sender address of the public function), this would be simple enough to change.

FWIW I do think following solidity here might be the safest option, it is really confusing for developers coming from solidity to have msg.sender be something different. Likewise a first time user coming from vyper, and going to solidity will be confused.

I can support copying Solidity behavior.

I can also support using words from Yellow Paper -- ORIGIN, CALLER.

msg.origin?

Ha @maurelian! You are such a sniper, I was waiting until the EIP was merged as a draft.

So basically this EIP resulted directly from discussing #901, having the EIP accepted will both benefit solidity and vyper :)

@fulldecent you expressed opposition to a hypothetical gas cost reductions on calls to self. The community would benefit from hearing your concerns here.

@maurelian Thank you for the ping. I have reviewed and added all my comments to https://ethereum-magicians.org/t/eip-1380-reduced-gas-cost-for-call-to-self/1242/3

Was this page helpful?
0 / 5 - 0 ratings

Related issues

domrany64 picture domrany64  路  3Comments

ben-kaufman picture ben-kaufman  路  4Comments

nrryuya picture nrryuya  路  3Comments

travs picture travs  路  3Comments

fubuloubu picture fubuloubu  路  3Comments