Vyper: VIP: Named Structs

Created on 30 Jul 2017 · 22 Comments · Source: vyperlang/vyper

Simple Summary

Add the ability to define custom structure types with a given alias

Motivation

Contract writers often work with structures very extensively. It is very helpful to be able to give commonly-used structure types a reference so that they can be used in many places. Without this functionality, it would not be possible to write clear, efficient contracts for things like Plasma.

Specification

The clearest specification we have come up with would be to adopt this syntax for defining structs (from: https://github.com/ethereum/vyper/issues/300#issuecomment-431570905):

struct MyStruct:
    x: address
    y: bytes32

This new type would be available only at compile time for use in both defining globals and assigning to globals as follows:

s: MyStruct

def set(x: address, y: bytes32):
    self.s = MyStruct(x, y)

Additionally, it should be possible to construct a structure in memory, and use it as a variable:

s: MyStruct = MyStruct(x, y)  # or possibly allow {x: x, y: y}
self.s = s  # Write storage from memory as an optimization?

This VIP serves as a basis for #1019 and #1020

Backwards Compatibility

Fully backwards compatible.

Copyright

Copyright and related rights waived via CC0


Original Proposal (for context):

I was working on another example and was wondering if there is any planned support for creating or aliasing new types so that I wouldn't have to carry them around e.g.

newtype data_t = {property1: basetype, property2: basetype, etc...}
data_lookup: data_t[address]

...

@constant
def get_property1(_addr: address) -> basetype:
    return data_lookup[_addr].property1

Discussion

@fubuloubu can you explain us a little more about the use cases?

  1. So I've played around writing 2 viper contracts so far and both times I've had a main map construct that contains address -> record pairs for maintaining a record of data associated with an account (address) for that contract (e.g. number of shares owned, delegates, etc.). I was thinking originally that there is a need to be able to use that datatype as a return from a function or whatnot, but viper doesn't allow returning structs and I think that is by design and my misunderstanding. I was thinking from a more general usage standpoint where you might want to get all of the data from a record by account instead of specifying a getter for each property in that record. Also, not totally clear on this, but can you just read any data present in a contract at any time by externally reviewing the storage in the blockchain? (e.g. not in viper/solidity) Total newb here :hand:

  2. Sort of related, it would also be nice to define your own units. For example, newtype shares_t = num( shares ) and later use it like holdings: shares_t[address] and def get_shares(_addr: address) -> shares_t:, and I can use that everywhere I'm talking about shares in a stocks example, with the added benefit of the type system checking what I'm doing when combining units (e.g. share_price: num( shares_t / wei ), then later self.holdings[_addr] = msg.value * self.share_price).

I think representing more complicated programs could benefit from being allowed to define more types (within reason), but then again I believe viper isn't going to be a language designed for creating larger programs and you can make do with what currently exists in writing a contract (with comments).

Renamed to "Named Structs".

@jacqueswww should we explore this now?

Proposed syntax:

MyStruct: struct = {
    a: address,
    b: bytes32,
    ...
}

structGlobal: MyStruct

def foo() -> MyStruct:  # as per #1019 
    structLocal: MyStruct = MyStruct(msg.sender, b'', ...)
    return structLocal

def bar(_myStruct: MyStruct):  # as per #1019
    self.structGlobal = _myStruct

Note: This is valid Python AST syntax, so no problems.

The StructName: struct = { define struct members here } syntax is intuitive because it is similar to defining a constant. The context switch introduced by the = sign is enough to set it apart from specifying globals.

We could require all struct types be defined in the section above globals to ensure that they are disambiguated.

Alternatively, this may be even clearer:

MyStruct: struct is {
    a: address,
    b: bytes32,
    ...
}

Using the is keyword is even clearer as it is not in use for anything else currently.

Forgive me jumping in here without having a deep understanding of Vyper but seeing the proximity in syntax to Python, why not follow the Python syntax for structs?

So that would be:

struct MyStruct:
    x: address
    y: bytes32

Or if there is a specific reason for the curly braces syntax (that would make them essentially syntactically equivalent to structs in Rust):

struct MyStruct {
    x: address,
    y: bytes32,
}

@cburgdorf Vyper uses Python AST to parse programs, so all Vyper has to be valid Python, or as close as possible where we can do some string regexing to get valid parsed Python (e.g. the contract keyword gets replaced with class).

That being said, I actually quite like your first approach with the class variable style, and we can do a similar thing to the contract keyword to get it to work so it looks like class MyStruct... under the hood and thus parses. However, this leaves a bit of a quandary... because now we'll have two replacements with the same result, and thus would be logically unable to tell the difference.

My proposal using your suggestion (which is my new favorite) is to under the hood replace struct MyStruct: ... with class MyStruct(struct): ..., and similarly contract ExternalContract: ... with class ExternalContract(contract): .... We can now have one regex replace that works for both rules (and perhaps future ones!) with a style that is very Pythonic but is still clear to the needs of Vyper, and is quite readable!

Thanks for the suggestion!

@jacqueswww what do you think?

Yes, I also quite like the readability of the example. Pretty sure we could figure something out ;)

What do you think about the way we handle it being a regex from (contract|struct) \{varname}: to class \{varname}(\1):?
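As a rough illustration of the rewrite being discussed, here is a minimal Python sketch of such a pre-processing pass. The function name `preprocess` and the exact pattern are assumptions made for this example, not the code Vyper actually uses; the point is only that one substitution can turn both keywords into a class definition that the Python `ast` module accepts, while the original keyword survives as the base class name.

```python
import ast
import re

# Hypothetical pattern: a line starting with `contract` or `struct`,
# followed by a name and a colon.
KEYWORD_RE = re.compile(r"^(contract|struct)[ \t]+(\w+)[ \t]*:", flags=re.MULTILINE)


def preprocess(source: str) -> str:
    """Rewrite `struct Name:` / `contract Name:` into `class Name(<keyword>):`."""
    return KEYWORD_RE.sub(r"class \2(\1):", source)


vyper_like = """
struct MyStruct:
    x: address
    y: bytes32
"""

python_like = preprocess(vyper_like)

# The rewritten text is valid Python syntax, so the stock parser handles it.
tree = ast.parse(python_like)
class_def = tree.body[0]

# The base class name records which keyword the definition started with.
kind = class_def.bases[0].id
members = [stmt.target.id for stmt in class_def.body]
print(kind, class_def.name, members)
```

Because the keyword is kept as the base class, later compiler stages can still tell a struct from a contract interface, which addresses the ambiguity raised above.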

Ah sweet! That taught me quite a bit about how these things work internally.

My proposal using your suggestion (which is my new favorite) is to under the hood replace struct MyStruct: ... with class MyStruct(struct): ...

Makes sense and seems to work out for a bunch of other things that may be considered in the long run e.g. enums.

enum Color:
    black = 'black'
    white = 'white'

could be preprocessed to something like class Color(VyperEnum): (making Vyper enums actually sweeter than Python enums (which are just classes derived from a special Enum class) :sweat_smile: )

Yes, I've been thinking lately we need enums!

We don't have to give them special names, because after Python AST parsing basically nothing is the same lol. But we don't have to write a compiler front end!


EDIT: enums might actually be a little trickier since we'd need to specify a type... Might need to do something like:

enum Status:
    type: uint256
    UNKNOWN = 0  # Must always specify the default value as something
    GOOD = 1
    BAD = 3  # do we have to go in order?
    ...

Seems nobody is working on this. Can I take a crack at this?

@charles-cooper sure! Do you understand what I mean here? https://github.com/ethereum/vyper/issues/300#issuecomment-431593110

Yep. I opted to add decorators instead but it is essentially the same thing. I will push a WIP PR shortly with my approach.

After working on https://github.com/ethereum/vyper/pull/1102 and getting closer to the issue, I would like to propose the following modifications to the spec:

  1. Struct constructors only accept named tuples. Every member of the struct must be instantiated, and the RHS must not contain any members that are not elements of the struct, otherwise the compiler should throw an error.

    1. Example:

      struct MyStruct:
          x: address

      s: MyStruct = { x: 0x123 }  # Good
      s: MyStruct = { x: 0x123, y: "superfluous element" }  # Throw: Unrecognized member
      s: MyStruct = {}  # Throw: Missing member

    2. Motivation: Humans are better at remembering names than argument orders, and this is an added compiler guardrail. It can protect against common mistakes such as reordering the members of a struct and forgetting to update the order at all constructor call sites.

  2. Deprecate anonymous structs, or at least severely limit their usage.

    1. Example:

      contract_member: { x: uint256 }  # Not allowed

      struct Pair:
          p1: uint256
          p2: uint256

      def div(a: { num: uint256, den: uint256}) -> uint256 :  # Maybe?
          Pair p = Pair(a)  # Maybe?
          p = Pair({p1: a.num, p2: a.den})  # Maybe?
          x: { num: uint256, den: uint256 } = p  # Maybe?
          x = a  # Maybe?

    2. Motivation: This might be a little contentious because it breaks backwards compatibility. But I figure pre-release is the best time to break backwards compatibility. Once we have named structs, anonymous structs seem less safe than named structs, which seems important since one of the stated goals of Vyper is to maximize human readability and to maximize difficulty of writing misleading code.

  3. This is more of a comment, but implementing this really illustrates the importance of https://github.com/ethereum/vyper/issues/563 - it shouldn't feel so hacky to add new keywords to Vyper.
  4. Discuss: should we support struct definitions in foreign contract interfaces?

    1. Example:

      contract ForeignContract:
          struct ForeignStruct:
              x: uint256
          def proc_struct(arg: ForeignStruct) -> uint256:

    2. Motivation: This seems important to support interop between different contracts once #1019 and #1020 are in the pipeline. Note that it adds complexity around type-checking, because structs in different contracts should not be considered the same even if they have the same fields - e.g. to prevent phishing attacks in https://github.com/ethereum/vyper/issues/1020.

EDIT: I forgot to mention, I think the type-checker should consider two structs as different even if they have the same members. That way casts must be explicit. A convenience function cast or marshal should be provided to make it easier to explicitly cast between structs which are considered equivalent (not sure yet how equivalence should be defined).
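To make the proposed checks concrete, here is a small Python sketch of what the compiler-side validation could look like. Everything in it (`StructError`, `check_struct_literal`, `check_assignment`, and the dict-based representation of a struct definition) is hypothetical and chosen for illustration only; it is not taken from the Vyper code base.

```python
class StructError(Exception):
    """Raised when a struct constructor or assignment breaks the proposed rules."""


def check_struct_literal(struct_name, declared_members, supplied):
    """Require that `supplied` names exactly the declared members, no more, no less."""
    declared = set(declared_members)
    given = set(supplied)

    unknown = sorted(given - declared)
    if unknown:
        raise StructError(f"{struct_name}: unrecognized member(s) {unknown}")

    missing = sorted(declared - given)
    if missing:
        raise StructError(f"{struct_name}: missing member(s) {missing}")


def check_assignment(declared_type, constructor_type):
    """Nominal typing: struct names must match, even if the members are identical."""
    if declared_type != constructor_type:
        raise StructError(
            f"cannot assign {constructor_type} to {declared_type} without an explicit cast"
        )


# Hypothetical definition mirroring `struct MyStruct: x: address`.
MY_STRUCT = {"x": "address"}

check_struct_literal("MyStruct", MY_STRUCT, {"x": 0x123})  # accepted
check_assignment("MyStruct", "MyStruct")                   # accepted
```

Comparing member names as sets means the order in which members are written at the call site does not matter, which is the guardrail against reordered struct definitions described in the motivation.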

  1. I don't disagree with allowing the s: MyStruct = { x: 0x123 } syntax. In practice though, struct members are rarely such short names and there could be many members. A more realistic example would be:
struct Transaction:
    receiver: address
    prevBlock: num256
    amount: uint256

def createTransaction():
    txn: Transaction = {receiver: msg.sender, prevBlock: block.number, amount: msg.value}

# Proposed would disallow:
def createTransaction():
    txn: Transaction = Transaction(msg.sender, block.number, msg.value)  # Fits more lines

I was thinking that allowing both would be optimal, but I see your point and mostly agree with it. We should discuss this further.

  2. Anonymous structs shouldn't be entirely deprecated. In fact, you are using one in your proposed syntax for creation/assignment, so they wouldn't go away anyway. We could pop up a deprecation warning if we find one is used for globals. I think many of the syntaxes you pointed out do indeed look awkward; we should probably avoid enabling too many uses of anonymous structs (outside of direct assignment).

  3. Yeah, agreed. Do you want to formally verify a brand new Vyper-specific front end? (not being snarky, honest question) We have formal semantics in K for Vyper here. It's not up to date, but can be used to aid in the development of a front end. I would recommend using PLY or SLY (would prefer the latter). I think we're at a stage now where this may make sense, as it seems to be hampering development a bit and would be a good chance to refactor. (Also, a Vyper AST would be amazing for compiler code readability). Definitely a lot of work!

  4. That is a fantastic corner case I did not think of. Your proposal is interesting, but it may be more friendly to use the globally defined structs to fill it out e.g.

struct ForeignStruct:
    x: uint256
contract ForeignContract:
    def proc_struct(arg: ForeignStruct) -> uint256: constant

Having to define them per-contract interface would create excessive complexity, and since the contract calling interface doesn't really care too much about the struct definitions as long as the members match, it wouldn't have much practical implications. This is also why doing a member check on structs is valuable for assignment.

That's my 2 wei

Thanks for the feedback.

  1. In my latest thinking (represented in https://github.com/ethereum/vyper/pull/1102/commits/9a732d8a1b3dcbd36727d68964485d5e0187d98d) I decided to require that assignment uses the identical constructor and struct type. I think this leads to the clearest code:
    ```python
    struct Transaction:
        receiver: address
        prevBlock: num256
        amount: uint256

    struct Transaction2:
        receiver: address
        prevBlock: num256
        amount: uint256

    def createTransaction():
        txn: Transaction = Transaction({receiver: msg.sender, prevBlock: block.number, amount: msg.value})
        # Multi-line style is encouraged
        txn: Transaction = Transaction({
            receiver: msg.sender,
            prevBlock: block.number,
            amount: msg.value})
        # Disallowed
        txn: Transaction = {receiver: msg.sender, prevBlock: block.number, amount: msg.value}
        # Disallowed
        txn: Transaction = Transaction2({receiver: msg.sender, prevBlock: block.number, amount: msg.value})
    ```

  2. In https://github.com/ethereum/vyper/pull/1102/commits/9a732d8a1b3dcbd36727d68964485d5e0187d98d, anonymous structs are still allowed, but I am with you leaning towards disallowing (or deprecating) them as a type, and only allowing dicts in constructors.
  3. Sounds like a big project - maybe once these struct machinery issues are implemented. I definitely agree it would lead to simpler code - a lex/yacc grammar would be easy to understand as opposed to relying on the Python front-end - I don't think at the current time there is even a BNF grammar for Vyper.
  4. I definitely see your point about friendliness. Perhaps we can be less strict for interface definitions and allow anything that 'fits', and be really strict for struct signing.

I think being looser with interface definitions is fine, in that "looser" just means you are allowed to reuse the same structs you've defined locally. We still keep tightness outside of struct definitions, so I really see no issue with that. Interfaces don't execute any code.
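One way to picture the two levels of strictness discussed here is the Python sketch below. It assumes that "fits" means the member names and types line up in the same order; that reading, along with the names `StructType`, `fits` and `identical`, is an assumption made for this example rather than anything settled in the thread.

```python
from typing import NamedTuple, Tuple


class StructType(NamedTuple):
    """Hypothetical compiler-side description of a struct definition."""
    name: str
    members: Tuple[Tuple[str, str], ...]  # ordered (member_name, type_name) pairs


def fits(local: StructType, foreign: StructType) -> bool:
    """Loose check for interface definitions: the layout matches, names are ignored."""
    return local.members == foreign.members


def identical(a: StructType, b: StructType) -> bool:
    """Strict check for assignment inside a contract: the same named struct."""
    return a.name == b.name and a.members == b.members


local_struct = StructType("ForeignStruct", (("x", "uint256"),))
remote_struct = StructType("TheirStruct", (("x", "uint256"),))

print(fits(local_struct, remote_struct))       # layout-compatible
print(identical(local_struct, remote_struct))  # but not the same named type
```

Under this sketch a locally defined struct can be reused in an interface declaration whenever its layout matches, while ordinary assignments still require the same named type.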

  1. & 2. This looks good
txn: Transaction = Transaction({
    receiver: msg.sender,
    prevBlock: block.number,
    amount: msg.value
})

Mainly because I actually suggested this at some point:

txn: Transaction = Transaction(
    receiver=msg.sender,
    prevBlock=block.number,
    amount=msg.value
)

I tend to agree that we could probably drop anonymous structs, just because the majority of them are quite large and just end up cluttering the logic. ~However considering point 4, I am in two minds if the external interfaces should have their structs defined close to or on the interface definition~ see 4.

I am also of the opinion that we should at least force all members on the RHS to be defined, at least with the first iteration of named structs.

  4. This implies we have to force structs to be defined before the external interfaces (just something to keep in mind). For the first iteration I would say keep structs defined only on the main contract.

Well, if you have multiple contracts (if we enable importing struct and contract interface definitions) you could just do from .contractA import MyStruct, MyContractInterface (where ./contractA.vy is available in the local directory) and that should work the same as if you defined them within the contract file (at least, in my mind)

Implemented in #1102 (yay!)
