Tendermint: Canonical types / consensus critical data serialzation

Created on 13 Jun 2020 · 3Comments · Source: tendermint/tendermint

Summary

The main spec should explicitly document how consensus critical data gets serialized (e.g before it gets signed or hashed). Additionally, test-vectors should be provided in the spec (not only in code/tests).

Problem Definition

It is critical for implementations in other languages to understand how exactly this data gets serialized. I'm aware of the move from amino to protobuf but this doesn't make this issue obsolete: protobuf does not guarantee deterministic serialization, especially not between different language implementations.

Proposal

It is essential to provide a brief yet complete specification of any consensus critical data serialzation. Providing a similar description and test-vectors in the spec as this would be cool:
https://github.com/davidiw/libra/blob/a101ebf5253a4c5da2cc1cf289147e562cd8c01c/common/canonical_serialization/README.md (note this doc is still WIP).

Alternatively, the spec can refer to the (soon to be) proto messages and explicitly mention in what deterministic fashion they are expected to be serialized (e.g. mention that the field order is binding, unknown fields are forbidden and maybe a few other things that potentially could introduce non-determinism).
Test-vectors should still be provided for every type and particularly for all edge cases. To increase readability, these test-vectors could also be referenced instead of explicitly listed in the spec.

For Admin Use

[ ] Not duplicate issue
[ ] Appropriate labels applied
[ ] Appropriate contributors tagged
[ ] Contributor assigned/self-assigned

docs encoding

Source

liamsi

❤3 👍2 👀1

Most helpful comment

@marbar3778 do you think we should move this issue to the spec repo?

tessr on 13 Nov 2020

👍2

All 3 comments

The main spec should explicitly document how consensus critical data gets serialized (e.g before it gets signed or hashed).

been going through the spec I hope to have a PR ready tomorrow for review. There are quite a few updates that will be made and a encoding section similar to what the blockchain section currently has will be made for all reactors.