Vyper: VIP: Merkle Proof built in function

Created on 5 May 2018  路  13Comments  路  Source: vyperlang/vyper

Preamble

VIP: 806
Title: VIP Merkle Proof built in function
Author: @haydenadams @fubuloubu 
Type: Standard Track 
Status: Draft
Created: 2018-05-04

Simple Summary

Built in function to generate merkle proofs.

Abstract

Implement a built-in function for computing merkle roots from merkle proofs.

Motivation

Merkle proofs are useful and awesome, and can be difficult to implement. Having a built in function similar to ecrecover would greatly simplify these proofs.

Specification

In order to implement as a constant time function, we need to specify a maximum depth of the tree that we can compute the root hash for. A depth of 32 would be 4bn leaves, which should sufficient for nearly all applications.

The function inputs would be a static array of 32 256-bit hashes in a merkle tree, in increasing order form, and an integer between 2-32 specifying the depth.

"Increasing Order Form" means the first element is hashed with the second, which is hashed with the third, etc. until the depth is met. The caller would have to provide their proof in this structure.

This function will return the resulting merkle root of the proof, which can be compared to a stored value or stored itself.

Backwards Compatibility

New feature

Copyright

Copyright and related rights waived via CC0

Most helpful comment

By removing the root hash from this function, you can also compute a root hash for storage. It wouldn't lock you only to verification of the proof.

Also, in your implementation, why do you switch the hashes based on comparision? Does the merkle proof algorithm require strict ascending order?

Best use case of this would be for implementation of Plasma contracts and other similar proof-based paradigms

All 13 comments

Definitely post-beta but would be really useful for implementing plasma chain contracts!

@haydenadams I had this lying around. I think this is the function you had in mind (to be placed in the 'stdlib') ?

@constant
@public
def verify(proofs: bytes32[100], root: bytes32, leaf: bytes32) -> bool:

    computed_hash: bytes32 = leaf
    for proof in proofs:
        if leaf < proof:
            computed_hash = sha3(concat(computed_hash, proof))
        else:
            computed_hash = sha3(concat(proof, computed_hash))

    return computed_hash == root

Then I am not sure about generating proofs on contracts, what is the use case for that?

By removing the root hash from this function, you can also compute a root hash for storage. It wouldn't lock you only to verification of the proof.

Also, in your implementation, why do you switch the hashes based on comparision? Does the merkle proof algorithm require strict ascending order?

Best use case of this would be for implementation of Plasma contracts and other similar proof-based paradigms

@jacqueswww should we explore this now?

My suggestion for API:

Merkle root calculator macro: calc_merkle_root(leaf, path, proof) -> bytes32

  1. leaf would be of a type that is hashable to obtain the leaf hash that starts the proof. This may include structs (which are RLP-encoded before being hashed), and basically any datatype we have that keccak256(x) accepts as an argument
  2. 'path' would be a basic type (less than 32 bytes) and is used to merkelize the branch to obtain the root. Note that LSB corresponds to the leaf hash, and each path choice corresponds to each bit moving up the path. The value is converted to an integer internally.
  3. proof would be a static list of bytes32 objects. The size of this list determines the depth of the tree, as well as the height of the MSB corresponding to the root path choice in path.

For example, for a tree depth of 160 (length of an address, which we use as the path) and a leaf that is a uint256 value, the signature would be calc_merkle_root(leaf: uint256, path: address, proof: bytes32[160]) -> bytes32.

This widget can be used to calculate roots for both validation against known roots and creation of new ones (Plasma Cash uses both).

Additional note: a constant EMPTY_MERKLE_BRANCH[N] should be added that corresponds to the default case of a tree with all leaves set to bytes32(0) for depth N. This is useful for creation of new roots, e.g. calc_merkle_root(1, msg.sender, EMPTY_MERKLE_BRANCH[160] in our previous example.

A modification of this proposal:

leaf should be the node itself (the hashed content or value) and not the value at the child of the leaf node in the tree. This allows more flexibility in how this API would be used. The suggestion would look like:

assert self.root_hash == calc_root(key, keccak256(value), proof)

Instead of

assert self.root_hash == calc_root(key, value, proof)

Could also make the argument that key should have to be a 256-bit integer type and the path-length for the macro gets derived from the proof length, to make this as easy as possible to implement

This is a different design from Bryant's proposal but I developed an implementation and a test for people who use it before it gets built-in.

@nrryuya should add your assumption of the layout on the variables like I did. It's important to get those right.

The reason why I did mine the way I did was that the depth of the tree grows the size of the list, so deeper elements would get pushed down further instead of appended in front. This matches the other implementations within https://github.com/ethereum/py-trie by @pipermerriam et al. The choice of MSB in the keypath was done similarly.

Really important we are consistent with this.

NOTE: I added an SMT class in py-trie. You should probably test against that. (Also log any inconsistencies in my approach)

@fubuloubu
Thank you! I'll look into that.

@fubuloubu I think it's time to get this into master;) is my suggested implementation ok or should I use FVyper one?

I think we should talk about this one at the next meeting. A couple things I would like to add for flexibility and correctness:

  1. The function args should be key, leaf_hash, and proof[N] where N is fixed and the size of the key.
  2. Branch/proof is provided in root -> leaf order, key is in MSB (root) to LSB (leaf) order
  3. The function name should be calc_root and only calculate the root from proofs and leaf hash, not verify it
  4. Allow the user to specify which hash function to use e.g. calc_root(..., function=hash_function)
  5. Allow list constants for empty proof with a helper function e.g. EMPTY_BRANCH: bytes32[N] = generate_empty_proof(N, function=hash_function)

This would produce an example that looks like this:

N: constant(uint256) = # whatever depth the trie is

def validate_root(root_hash: bytes32, key: uint256, leaf: AnyType, proof: bytes32[N]):
    leaf_hash: bytes32 = keccak256(leaf)
    assert root_hash == calc_root(key, leaf_hash, proof, function=keccak256)

Meeting Notes: @fubuloubu please wrap the comment into a VIP, and then we can close this one in favour of this one.

Deprecating in favor of #1391, which is more well-defined.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

jacqueswww picture jacqueswww  路  4Comments

ben-kaufman picture ben-kaufman  路  3Comments

robinsierra picture robinsierra  路  3Comments

fubuloubu picture fubuloubu  路  3Comments

ben-kaufman picture ben-kaufman  路  4Comments