VIP: 806
Title: VIP Merkle Proof built in function
Author: @haydenadams @fubuloubu
Type: Standard Track
Status: Draft
Created: 2018-05-04
Built in function to generate merkle proofs.
Implement a built-in function for computing merkle roots from merkle proofs.
Merkle proofs are useful and awesome, and can be difficult to implement. Having a built in function similar to ecrecover would greatly simplify these proofs.
In order to implement as a constant time function, we need to specify a maximum depth of the tree that we can compute the root hash for. A depth of 32 would be 4bn leaves, which should sufficient for nearly all applications.
The function inputs would be a static array of 32 256-bit hashes in a merkle tree, in increasing order form, and an integer between 2-32 specifying the depth.
"Increasing Order Form" means the first element is hashed with the second, which is hashed with the third, etc. until the depth is met. The caller would have to provide their proof in this structure.
This function will return the resulting merkle root of the proof, which can be compared to a stored value or stored itself.
New feature
Copyright and related rights waived via CC0
Definitely post-beta but would be really useful for implementing plasma chain contracts!
@haydenadams I had this lying around. I think this is the function you had in mind (to be placed in the 'stdlib') ?
@constant
@public
def verify(proofs: bytes32[100], root: bytes32, leaf: bytes32) -> bool:
computed_hash: bytes32 = leaf
for proof in proofs:
if leaf < proof:
computed_hash = sha3(concat(computed_hash, proof))
else:
computed_hash = sha3(concat(proof, computed_hash))
return computed_hash == root
Then I am not sure about generating proofs on contracts, what is the use case for that?
By removing the root hash from this function, you can also compute a root hash for storage. It wouldn't lock you only to verification of the proof.
Also, in your implementation, why do you switch the hashes based on comparision? Does the merkle proof algorithm require strict ascending order?
Best use case of this would be for implementation of Plasma contracts and other similar proof-based paradigms
@jacqueswww should we explore this now?
My suggestion for API:
Merkle root calculator macro: calc_merkle_root(leaf, path, proof) -> bytes32
leaf would be of a type that is hashable to obtain the leaf hash that starts the proof. This may include structs (which are RLP-encoded before being hashed), and basically any datatype we have that keccak256(x) accepts as an argumentproof would be a static list of bytes32 objects. The size of this list determines the depth of the tree, as well as the height of the MSB corresponding to the root path choice in path.For example, for a tree depth of 160 (length of an address, which we use as the path) and a leaf that is a uint256 value, the signature would be calc_merkle_root(leaf: uint256, path: address, proof: bytes32[160]) -> bytes32.
This widget can be used to calculate roots for both validation against known roots and creation of new ones (Plasma Cash uses both).
Additional note: a constant EMPTY_MERKLE_BRANCH[N] should be added that corresponds to the default case of a tree with all leaves set to bytes32(0) for depth N. This is useful for creation of new roots, e.g. calc_merkle_root(1, msg.sender, EMPTY_MERKLE_BRANCH[160] in our previous example.
A modification of this proposal:
leaf should be the node itself (the hashed content or value) and not the value at the child of the leaf node in the tree. This allows more flexibility in how this API would be used. The suggestion would look like:
assert self.root_hash == calc_root(key, keccak256(value), proof)
Instead of
assert self.root_hash == calc_root(key, value, proof)
Could also make the argument that key should have to be a 256-bit integer type and the path-length for the macro gets derived from the proof length, to make this as easy as possible to implement
This is a different design from Bryant's proposal but I developed an implementation and a test for people who use it before it gets built-in.
@nrryuya should add your assumption of the layout on the variables like I did. It's important to get those right.
The reason why I did mine the way I did was that the depth of the tree grows the size of the list, so deeper elements would get pushed down further instead of appended in front. This matches the other implementations within https://github.com/ethereum/py-trie by @pipermerriam et al. The choice of MSB in the keypath was done similarly.
Really important we are consistent with this.
NOTE: I added an SMT class in py-trie. You should probably test against that. (Also log any inconsistencies in my approach)
@fubuloubu
Thank you! I'll look into that.
@fubuloubu I think it's time to get this into master;) is my suggested implementation ok or should I use FVyper one?
I think we should talk about this one at the next meeting. A couple things I would like to add for flexibility and correctness:
key, leaf_hash, and proof[N] where N is fixed and the size of the key.calc_root and only calculate the root from proofs and leaf hash, not verify itcalc_root(..., function=hash_function)EMPTY_BRANCH: bytes32[N] = generate_empty_proof(N, function=hash_function)This would produce an example that looks like this:
N: constant(uint256) = # whatever depth the trie is
def validate_root(root_hash: bytes32, key: uint256, leaf: AnyType, proof: bytes32[N]):
leaf_hash: bytes32 = keccak256(leaf)
assert root_hash == calc_root(key, leaf_hash, proof, function=keccak256)
Meeting Notes: @fubuloubu please wrap the comment into a VIP, and then we can close this one in favour of this one.
Deprecating in favor of #1391, which is more well-defined.
Most helpful comment
By removing the root hash from this function, you can also compute a root hash for storage. It wouldn't lock you only to verification of the proof.
Also, in your implementation, why do you switch the hashes based on comparision? Does the merkle proof algorithm require strict ascending order?
Best use case of this would be for implementation of Plasma contracts and other similar proof-based paradigms