Vyper: Allow default parameters

Created on 17 Jun 2018  路  22Comments  路  Source: vyperlang/vyper

This is a reduced scope feature as compared to adding Function Overloading #884.

Introduction

ERC-721 includes two functions with the same name but one includes an extra parameter. The ERC specification states that the less parameter one will function equivalently as if the more parameter one were called with a specific value.

Specification

Allow default parameters at function declaration.

Here is a very relevant example. Includes NatSpec documentation as per #898.

Sorry, not sure if my bytes declaration is correct.

## 
# @notice Transfers the ownership of an NFT from one address to another address
# @dev Throws unless `msg.sender` is the current owner, an authorized
#  operator, or the approved address for this NFT. Throws if `_from` is
#  not the current owner. Throws if `_to` is the zero address. Throws if
#  `_tokenId` is not a valid NFT. When transfer is complete, this function
#  checks if `_to` is a smart contract (code size > 0). If so, it calls
#  `onERC721Received` on `_to` and throws if the return value is not
#  `bytes4(keccak256("onERC721Received(address,uint256,bytes)"))`.
# @param _from The current owner of the NFT
# @param _to The new owner
# @param _tokenId The NFT to transfer
# @param data Additional data with no specified format, sent in call to `_to`
@public
@payable
def safeTransferFrom(from: address, to: address, token_id: uint256, data: bytes = ""):
    ...

The Vyper compiler will process this as two separate function selectors. Actually (1 + number of default parameters) function selectors. This follows the Python convention of allowing parameters to be omitted if a default parameter is specified. This differs from Python in that Python function arguments can be specified out of order.

The Vyper compiler can optimize byte code more efficiently than Solidity currently does because Vyper will have more knowledge about the relationship between safeTransferFrom(a,b,c,d) and safeTransferFrom(a,b,c).

Literature review

Vyper has expressed opinions on function overloading in general:

  • This can cause lots of confusion on which function is called at any given time.
  • It makes the code much harder to search through as you have to keep track on which call refers to which function.
  • Thus it's easier to write missleading code (foo("hello") logs "hello" but foo("hello", "world") steals your funds).

Strictly speaking, a default parameter is a function overload. From the EWM there are two function descriptors, but from Vyper, there is one function. I believe the opinions above apply to allowing arbitrary function overloads, whereas this issue is only a very specific kind of overload.

With this proposed feature, we centralize implementation for both code paths in one function. This solves the issue of tracking which call refers to which function, they are the same function. Also this implementation is much safer than Solidity. In Solidity, as this can cause confusion. The foo() logs but foo(param) steals funds example above does not apply because BOTH are the same function.

Discussion

Honestly I don't feel like this is a compromise. I believe this actively promotes Vyper's goal to improve human-readability of code and prevent misleading code.

Pinging people from function overloading discussion: @maurelian @fubuloubu @jacqueswww @ben-kaufman

Approved Discussion

Most helpful comment

My suggestion which I think was generally agreed upon in chat

  • No upper limit
  • Issue a warning for argument counts >=N (default 7?)
  • clearly document that high numbers of default arguments will cause an exponential size increase in compiled bytecode.

All 22 comments

def overloaded_function(a: uint256, b: bytes32=""):
    ...

Creates sha3('overloaded_function(uint256)') which calls sha3('overloaded_function(uint256,bytes32)') with "" as a second arg. Avoids having to define overloads and the possibility of an overloaded function having different functionality.

I like it! Very pythonic solution to the problem!

Can also specify non-zero default args

Yes, at first glance I like it too. :)

This is not my final opinion, let me mull this over a bit some more (weekend brain).

Marked as Approved.

The Vyper compiler will process this as two separate function selectors. Actually (1 + number of default parameters) function selectors.

I think this is not accurate if we assume keyword arguments are in play. It will be (n * (n + 1)) / 2 additional function selectors. For example,

def overloaded_fn(a: uint256=3, b: uint256=4):
    ...

This would be equivalent to the following three functions f(n) => (2 * (2 + 1)) / 2 => 3

def overloaded_fn(1):  # a=1, b=4
def overloaded_fn(b=1):  # a=3, b=1
def overloaded_fn():  # a=3, b=4
    ...

~Yes that is correct, my idea would also be not to allow more than 2 default function parameters (the one I picked), As anything more than 2 would just produce too many combinations.~ Will think if this is even necessary.

Restrict to a maximum of 3 default arguments? (6 generated function signatures)

My suggestion which I think was generally agreed upon in chat

  • No upper limit
  • Issue a warning for argument counts >=N (default 7?)
  • clearly document that high numbers of default arguments will cause an exponential size increase in compiled bytecode.

Out of curiosity, is there any sense of the cost in terms of runtime bytecode for each new selector? Is it only the code required to dispatch and JUMP to the corresponding code segment, or will the bytecode for the function body also need to be duplicated?

The way we were discussing it, it would be a dispatch statement. So basically, load the stack with the call args and/or defaults, then jump to the code.

The problem is the exponential growth (7 default args produces 28 different calling loaders). At some point it becomes untenable.

7 default args is 2^7 = 128 loaders if you allow arbitrary order. If you only allow the right-most arguments to be omitted then that is 7+1 = 8 loaders. Both solutions are probably fine because it is the programmer's damn fault if they are designing an external function that way!

Bytecode for the function body does not need to be duplicated.

Also if you are only allowing right-most arguments to be omitted then this allows you to implement in an iterative fashion:

callFunction(1,2,3, defaultA, defaultB) encodes like this:

...
PUSH1
PUSH2
PUSH3
JUMP: callFunction_with_2_defaults:

callFunction_with_2_defaults:
PUSH defaultA
JUMP callFunction_with_1_default:

callFunction_with_1_default:
PUSH defaultB
JUMP callFunction:

callFunction:
Now you have 1,2,3,A,B on stack, ready to go
...

(7 * (7 + 1)) / 2 = 28 is the right number, because order is important but you don't have to necessarily require right to left ordering of default value settings (e.g. we don't need to disallow myFunction(a=1, b, c=3) if we don't want to)

Regardless, jumping to each signature and loading the corresponding default is a good implemetation suggestion if we require right to left ordering of defaults. I would suggest this be the final implemetation when we have figured out the whole internal call thing.

EDIT: I think you're right, N + 1 is the real answer to the growth of the signature stack. So it's linear, not exponential. We probably don't even need a special case for this, I don't think anyone would legitimately use more than 7 default args

Starting on this.

Observation I have made so far.

Because of the way that method_id gets generated we can not allow two default arguments of the same type, as the sha3 method signature would be the same. I am handling this case, but just something to note.

image

Sure you can! Anonymous default parameters can only be the rightmost parameters. So

fun(int a, int b=5, int c=6) is unambiguous when you call fun(7, 2).

so, by that logic:
fun(7, 2) (analogous to fun(7, b=2)) calls keccak('fun(int,int)') and fun(7, c=2) calls keccak('fun(int,int)')?

fun(7,2) calls keccak('fun(int256,int256)') and fun(7) calls keccak('fun(int256)').

Three external entry points total.

So fun(7, c=2) is disallowed? If you want to call with arg c, you would have to fully specify e.g. fun(7, 5, 2)

EDIT: This is where I got the fall-through idea from @jacqueswww

aka this is the reason why the number of function selectors is n + 1 and not (n * (n + 1)) / 2 (where n is the number of args that have defaults)

Ah. For internal it does not matter. There is no function selector for internal calls.

@fulldecent there still needs to be some unique identifier to jump to though ;) (and until #901 gets implemented still needs the method_id)

So I have implemented the necessary changes, and it's looking good, at this stage it still uses the fun(7, c=3) style, because based on the comments when I started this was the agreed upon approach ie. (2^n function signature variations).

If everyone one is OK with moving to positional variations (n+1) only. I will make the changes.
My vote goes to positional args only, because it make the number of possible variations much less (:+1: for yes, :-1: for no).

Basically:

fun(a: int128, b: int128=1, c: int128=2) will only have the signatures:

fun(int128)  # a = ?, b = 1, c = 2
fun(int128, int128)  # a = ?, b = ?, c = 2
fun(int128, int128, int128)  # a = ?, b = ?, c = ?

Exposed on the ABI.

The unique identifier is the JUMPDST instruction byte offset!


But yes, I fully agree with your proposal here.

Yes but in the IR, you'd still generate a unique label/symbol that's all I meant ;)

Was this page helpful?
0 / 5 - 0 ratings

Related issues

nrryuya picture nrryuya  路  3Comments

haydenadams picture haydenadams  路  3Comments

domrany64 picture domrany64  路  3Comments

ben-kaufman picture ben-kaufman  路  3Comments

vici0 picture vici0  路  3Comments