VIP: <to be assigned>
Title: reduce the cost and difficulty of code reuse in common situations
Author: maurelian
Status: Draft
Created: 2018-06-16
Requires (*optional): <VIP number(s)>
Replaces (*optional): <VIP number(s)>
In order to call a within the same contract function, Vyper uses the CALL opcode, which send a new message to call functions within the same contract. This has some nice safety benefits.
Unfortunately it will result in developers using anti-patterns to save on gas, or access msg.sender in a function call.
The benefit of using CALL to access code in the same contract is to create a new execution context, with no risk of side effects from memory access. I admit that I find this to be quite an elegant use of the EVM for safety.
CALLUnfortunately, there are two major issues with the use of the CALL opcode for calling functions within the contract:
At least two important environment opcodes will return different values: CALLER (ie. msg.sender) and CALLVALUE (ie. msg.value). This complicates the use of functions for permission checking. The following is a naive vyper translation of the common "ownable" pattern for a simple storage solidity contract.
owner: address
value: public(int128)
@public
def __init__():
self.owner = msg.sender
@private
def checkOwner():
assert msg.sender == self.owner # this will REVERT every time
@public
def writeValue(_value: int128):
self.checkOwner()
self.value = _value
The CALL opcode costs 700 gas (+ input data costs + function dispatching again). By contrast Solidity uses JUMP (2 gas) to call a function.
This imposes a large penalty on code reuse. Regardless Vyper's design philosophy favoring safety over gas efficiency, developers will respond to this incentive by simply copying and pasting boilerplate code. This is dangerous, and difficult to audit.
I'm sorry, I don't have one specification. Below are some options I see for mitigating these issues.
From the Motivation section above, we have 3 parameters along which to analyze approaches to calling a function in the EVM.
DELEGATECALL instead of CALLCALLER and CALLVALUE)Seems like a decent solution!
~3. Memory Safety: BAD (executes the code in the same memory context, nullifying reasons for using CALL in the first place)~ ~I proposed this earlier, but I don't think this is a good solution. ~
JUMP like SolidityCALLER and CALLVALUE)This is worth considering.
See also #330.
These could have significant restrictions on them, like no access to MSTORE or SSTORE, and not accepting arguments. I assume this would be done by with JUMP, or just inlining the same code each time it's needed.
CALLER and CALLVALUE)Built-in vars like caller and callvalue can be created, which are magically passed to a function in a message call.
My code above would thus be:
@private
def checkOwner():
assert _sender == self.owner # <- replaced msg.sender with a magical global variable `sender`
@public
def writeValue(_value: int128):
self.checkOwner)
self.value = _value
This feels a bit funky to me. Probably would be hard to implement safely, and obscures the true functioning of the EVM.
My problem in the code sample above can be addressed by passing the values I need as arguments:
@private
def checkOwner(_sender: address):
assert _sender == self.owner # <- replaced msg.sender with _sender
@public
def writeValue(_value: int128):
self.checkOwner(msg.sender) # <- pass msg.sender as an argument
self.value = _value
Maybe that's OK? But it requires a deeper understanding of how both the EVM and Vyper work.
Dependent on approach.
Copyright and related rights waived via CC0
I do find myself avoiding internal calls in favor of reusing the same code in multiple places to save gas, which is probably not ideal. I don't personally know the tradeoffs well but If this can be done safely I'm all for it.
I get your point but I don't fully agree with that. Unlike other software, smart contracts' code must be explicit and easily readable. In most other cases, it's more important to not repeat yourself in the code but in smart contracts, I think readability of the function should be more important.
In the example about you are trying to mitigate the behavior of modifiers which Vyper intentionally excluded.
Modifiers - e.g. in Solidity you can do function foo() mod1 { ... }, where mod1 can be defined
elsewhere in the code to include a check that is done before execution, a check that is done after
execution, some state changes, or possibly other things. Vyper does not have this, because it makes > it too easy to write misleading code. mod1 just looks too innocuous for something that could add
arbitrary pre-conditions, post-conditions or state changes. Also, it encourages people to write code > where the execution jumps around the file, harming auditability. The usual use case for a modifier is > something that performs a single check before execution of a program; our recommendation is to
simply inline these checks as asserts.
This is the reason Vyper don't have modifiers like Solidity in the first place.
I am in favor of your fourth proposal of creating the caller and callvalue.
I would like to hear more opinions on that as code repetition is an insecure thing as well so I'm not completely sure what's more important yet, but I believe there's a room for improvement.
Idk if I agree that internal function calls have the same readability issues as Solidity modifiers. Also code repetition can feel very verbose and reduce readability. If you need to do a square root in five different functions, its probably bad to have the sqrt() implemented five times. But people will be tempted to do so if they save 1000 gas per function call. Discouragement through high gas cost makes sense on a protocol level (ie making sstore exepensive) but feels wrong for smart contract languages.
To be clear I don't think
- Environment values change unexpectedly
At least two important environment opcodes will return different values: CALLER (ie. msg.sender) and CALLVALUE (ie. msg.value).
is a big problem. I do however think
- Gas costs disincentivizes code reuse
The CALL opcode costs 700 gas (+ input data costs + function dispatching again). By contrast Solidity uses JUMP (2 gas) to call a function.
is bad. If using JUMP for internal calls can be done as safely as CALL, it should be used. I don't know the security tradeoffs at all, hopefully someone who does can chime in here!
It's not only about readability, but maintainability as well. More mistakes will be made if 5 pieces of code have to change.
Take a look at these @private utilities I wrote for my ERC721 translation:
There are a total of 12 lines of code in the bodies, which can be written once, then reused 3 times in the standard. If they were written 3 times over, it would be very hard (for my brain at least) to read carefully and catch any differences between them.
6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function. This would limit functions that make external calls from having advanced memory access while allowing safe internal calls to reduced their gas count. This will also incentivize isolation/reduction of external calls to improve gas costs.
6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function.
This optimization means that source code using msg.sender would behave differently based on a decision by the compiler. It might work if the developer could specify a flag (@memsafe, @pure ?), indicating they would like to use JUMP and the compiler makes a check to ensure it's safe.
To be clear I don't think "Environment values change unexpectedly" is a big problem.
@haydenadams No? IMO, the current behaviour of this code is highly non-intuitive, and will make a lot of common operations much more difficult.
```python
@public
def getCaller() -> address:
return msg.sender
@public
def alsoGetCaller() -> address:
return self.getCaller() # always returns the contract's own address!
```
@maurelian
IMO, the current behaviour of this code is highly non-intuitive, and will make a lot of common operations much more difficult.
Whatever address calls a function is msg.sender within the scope of that function. That is the only rule right now. Its not a massive leap to realize that extends to self for internal calls. When a public function is called I don't think it should react differently if its being called by itself, another smart contract, or an external account.
"on the internet nobody knows you're a dog" -> "on Ethereum nobody knows you're 潭a潭 潭s潭m潭a潭r潭t潭 潭c潭o潭n潭t潭r潭a潭c潭t潭 yourself"
With account abstraction the divide between contracts and accounts will go from negligible to nonexistent. Contracts will get more complicated but also more versatile. Its pretty important that smart contract devs understand Ethereum account structure. I would prioritize simplicity of logic over what feels immediately intuitive here. Totally understand where you're coming from though!
This seems reasonable, and personally I like it:
@private
def checkOwner(_sender: address):
assert _sender == self.owner # <- replaced msg.sender with _sender
@public
def writeValue(_value: int128):
self.checkOwner(msg.sender) # <- pass msg.sender as an argument
self.value = _value
We briefly discussed this in the bi-weekly meeting yesterday, but left it as unresolved. As it's quite a significant change, we have to study all the options carefully ;)
As it's quite a significant change, we have to study all the options carefully ;)
@jacqueswww I certainly agree.
@haydenadams it's straightforward to make small changes to my little example to make it work, but it's always going to be confusing.
Here's an other summation:
IF: a contract can call its own @public functions;
THEN: msg.sender is pretty much useless as a builtin variable, because it means something entirely different when called externally vs. internally. It's only useful when passed as an argument.
If the Vyper team decides to move towards the external and internal visibilities as in solidity, this would be fine. (added to OP)
Finally, from the Auditability principle:
Simplicity for the reader is more important than simplicity for the writer, and simplicity for readers with low prior experience with Vyper (and low prior experience with programming in general) is particularly important.
Outside of project contributors, requiring this level of nuance is far from simple for anyone reading code.
I agree that it's confusing that msg.sender changes, and will be a major annoyance going forward.
Short/little fixes I will investigate:
1.) disable accessing msg.sender or msg.value in private functions.
2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.
2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.
Actually, I think you are right! I tested with this code in remix:
library Lib {
function foo(uint a) returns(uint) { }
}
contract B {
using Lib for uint;
function bar() returns(uint){
uint a = 1;
return a.foo();
}
}
If you call B.bar(), and look at the debugger, after the DELEGATECALL instruction memory is reset.
My bad. I'll update the VIP.
@jacqueswww
1.) disable accessing
msg.senderormsg.valuein private functions.
I like this. It gets rid of most of the confusion, and they're both useless in private functions anyway.
@haydenadams I think "useless" might be an oversimplification, but it forces a specific design behavior that ultimately is very practical in reducing your attack surface. Basically, private functions should do very specific internal behaviors, preferably dealing only with internal state.
1.) disable accessing msg.sender or msg.value in private functions.
2.) Investigate using DELETEGATECALL -> I was under the impression that this also resets the VM.
To be clear, I'm in favor of either but not both. Preference for #2.
I am also leaning towards delegate call.
But definitely feel we might need a more gas efficient dispatcher+memory stack post beta.
More gas efficient dispatcher can be done without breaking expectations if we make the right move now.
Which one makes it easier to implement gas optimization on the dispatcher?
If switching to delegatecall mean msg.sender in internal calls to public functions will no longer return the contracts own address I'm strongly against it.
To better understand and make a decision here, it will be helpful if the VIP will include discussion of which stage of the compilation process we are changing. Also a discussion of use cases at that stage of compilation would help me.
Also, if you can give a brief overview of the compilation strategy for newcomers this would help them join the discussion.
@fulldecent and everyone: Looking again, I think it makes sense to reduce the scope of this VIP. A better title might be "Cost and usability Improvements to internal function calls". 馃憤/ 馃憥?
Code reuse in general, (ie. enabling imports and something like inheritance/composability) is probably another much needed VIP, or perhaps a few.
Or "New feature: call a function without using EVM CALL"
I think this is modification to existing behavior, not a new feature proposal.
We already have a composibility VIP I believe, as well as one for importing interfaces
In the mid to longer term, I can imagine an EIP to significantly reducing the cost of a contract to CALL or DELEGATECALL its own address might be well received.
I would vote against such an EIP. Call-to-self has extremely limited utility.
I believe the issue is that even calls to @private functions use CALL, which is very gas heavy compared to Solidity using JUMP. I believe this is for safety reasons but unfortunately discourages the use of @private functions.
@fubuloubu's response earlier in the thread:
6th option?: an optimization function that will perform a JUMP instead of a CALL if it is able to be validated that the code jump cannot access anything outside of the called function. This would limit functions that make external calls from having advanced memory access while allowing safe internal calls to reduced their gas count. This will also incentivize isolation/reduction of external calls to improve gas costs.
seems like something worth pursuing.
Copying from Gitter.
Will's code reuse proposal. This is a FOR NOW proposal, it is implementable and it solves nearly all problems for today. I even think it is good enough to get Vyper to production version 1.0. Further improvements are possible but I think they can safely be out of scope of 1.0:
^^ This is a "manifesto" if you have followed along similar initiatives in Swift evolution.
@fulldecent mostly on board
Solidity still has @public functions which can be called internally or externally.
True. I am recommend a more explicit @external + @internal to be the Vypery equivalent of Solidity's @public.
Call-to-self has extremely limited utility.
Vyper is currently using it for all self function calls. It improves memory safety. Given the value of security, it's worth consideration should Vyper continue making use of it and gain adoption.
Vyper is using it now only because no suitable alternative is available. Status quo of an experimental project does not a motivation make.
I think when we were talking about default args, the solution is for a jump table instead of a call to self, so I think this call-to-self thing has to get changed at some point, it's too limiting.
But, I stand with @maurelian in saying that any such change should ensure our security properties are maintained. Something like an out of bounds jump or privileged external call should be provably not possible.
This is an internal change, so we have more leeway to get things done, but it will affect user expectations moderately.
Please see comment https://github.com/ethereum/vyper/issues/952#issuecomment-407361410.
I started down the path of using delegatecall - as it does maintain msg.sender however this isn't an option because we can not enforce the private nature of the call. As mentioned in the comment, passing through the original address is quite doable - and we can the keep using CALL.
Also note, I think we should split this issue into two changes/ideas.
1.) Using jump table mechanics to improve gas on internal function calls.
2.) Getting msg.sender to equal the calling contract value.
@jacqueswww
In my opinion 1) should be high priority since I have smart contracts in the works that I hope to deploy to mainnet where gas cost is important and this change would save a few thousand :)
1) is hard, but if we do 2) now then we can drop it in in a minor release without breaking user expectations, which is a bigger problem IMO
2) could also be done by adding a new variable msg.caller which is always equal to the calling contract value, and leaving msg.sender as is.
msg.caller could be best practice for most uses, but msg.sender would be available if its needed to determine calling context. Also this would allow msg.sender to function the same between Solidity and Vyper which I think would eliminate a lot of confusion for devs migrating over from Solidity.
This plan is extremely expensive in gas. It is useful only for academic purposes. However it allows code reusability and very easy formal auditing.
To be clear, this is joke plan. It illustrates why we should proceed with Will's Plan 1.
@fulldecent there is no need to do it like the above. One can just use the first 20 bytes of the internal calldata to send through the msg.caller, this would be a very minimal cost.
Also we can make it even more effecient, by only passing the data through if msg.caller is actually referenced in the function.
I think this whole discussion also highlights the fact that we should start doing some example gas cost comparisons to solidity.
I know for instance that vyper might seem very ineffecient, and this a good place we can improve. But some things are not as bad as they seem - good example our arithmetic has overflow checking built in, and solidity uses internal calls to do the same to SafeMath.
@ben-kaufman @fubuloubu @haydenadams @haydenadams What example contracts would you recommend? ERC20 & ERC721 seem like good options?
If you can provide me with an all-best-practices-followed example Vyper project repository (including test cases) then I can get my team to implement 721 for this experiment!
Starting with a proof of concept of using a jump table, just so you guys know if you don't see me. :tongue:
"Where'd Jacques go?"
"Oh yeah, he's jumping tables now"
@jacqueswww What was the decision here? You mentioned msg.caller in one of your replies. Was that a typo or are you doing the msg.sender and msg.caller plan?
@haydenadams currently I am only working on 1.) Using jumps to improve gas efficiency of internal calls. I decided this was a worthy task, as unfortunately smart contract development is very gas-efficiency dependant at this stage, at the benefits would be quite significant.
{tl;dr: 2.) has not been fully decided yet.} With regards to msg.sender if I just implement the jumps, msg.sender will match solidity in it's behaviour (of internal functions ) ie. msg.sender will be the top public function's calling account address. However if there is consensus that we should use msg.sender == the contract address and msg.caller == original caller (Basically sender address of the public function), this would be simple enough to change.
FWIW I do think following solidity here might be the safest option, it is really confusing for developers coming from solidity to have msg.sender be something different. Likewise a first time user coming from vyper, and going to solidity will be confused.
I can support copying Solidity behavior.
I can also support using words from Yellow Paper -- ORIGIN, CALLER.
msg.origin?
Ha @maurelian! You are such a sniper, I was waiting until the EIP was merged as a draft.
So basically this EIP resulted directly from discussing #901, having the EIP accepted will both benefit solidity and vyper :)
@fulldecent you expressed opposition to a hypothetical gas cost reductions on calls to self. The community would benefit from hearing your concerns here.
@maurelian Thank you for the ping. I have reviewed and added all my comments to https://ethereum-magicians.org/t/eip-1380-reduced-gas-cost-for-call-to-self/1242/3
Most helpful comment
Also note, I think we should split this issue into two changes/ideas.
1.) Using jump table mechanics to improve gas on internal function calls.
2.) Getting
msg.senderto equal the calling contract value.