Currently in the SDK we store base account addresses (as well as any account address as raw bytes. This was chosen mainly for efficient storage reasons.
However, this decision has now resulted in complications when these fields are serialized using standard JSON / string marshalling over gPRC Gateway (for REST endpoints), or with jsonpb (used for encoding/decoding of CLI queries & txs).
In short, both jsonpb and gRPC gateway serialize raw bytes as base64 encoded strings by defualt, and we need all account.address fields to be serialized using our bech32 prefixed address strings.
There are 2 main strategies we can pursue then:
Override the string serialization in jsonpb and gRPC gateway for these specific account fields.
Jsonpb is supposed to have some support for custom serialization of bytes, but @aaronc pointed out that this actually does not seem to work in both directions (encoding as well as decoding), and we would need it for both directions.
gRPC gateway may require a fork, or us writing some custom plugin logic around gRPC gateway to handle this.
Advantages:
Disadvantages:
Refactor all address fields in protobuf to be stored as strings instead of bytes.
Advantages:
Disadvantages:
If we go with option 1, we need some protobuf annotations to indicate which type of address (account, validator, etc) is stored in the bytes field. This would need to be a clearly specified extension to protobuf JSON and is in my mind somewhat complex.
Option 2 is definitely simpler to implement and has the added benefit of obviating the need for client apps to properly handle bech32 <-> binary conversions and get the prefixes right for each chain. That would prevent a scenario where an app would mistakenly allow a user to use cosmos prefixes on a chain that has a different native prefix, essentially moving bech32 address validation to the protocol layer.
Not all state would need to be encoded as strings, but likely it is simpler to use strings for addresses almost everywhere in protobuf if we go with option 2. IBC by the way always encodes addresses as strings as far as I know. Possibly compression can mitigate the storage overhead.
Without knowing in too much detail how difficult of a refactor it would be, i'm in favor of Option 2, as it seems like given the known disadvantage of Option 1, Option 2 is the better long term strategy.
If we go with option 1, we need some protobuf annotations to indicate which type of address (account, validator, etc) is stored in the
bytesfield. This would need to be a clearly specified extension to protobuf JSON and is in my mind somewhat complex.
The more I dig-in, the more I am in favour of option-2. We might find some easy/hacky methods to gRPC gateway issues but option-2 looks clean comparatively.
Wondering if we can move forward with a decision here..
Noting that:
I'd like to propose we move forward with Option 2: Change the address fields on all relevant protobuf messages to be encoded as bech32 prefixed strings instead of raw bytes, thus alleviating this problem.
The need to inject the address prefix from somewhere when querying protobuf encoded accounts from chain is a pain indeed. You always need a magic source for this information, like:
public async getAccount(searchAddress: string): Promise<Account | null> {
const { prefix } = Bech32.decode(searchAddress);
const account = await this.queryClient.auth.account(searchAddress);
return account ? accountFromProto(account, prefix /* use the prefix from the request in the result */) : null;
}
If storage matters, I propose
A binary representation of a address's (prefix, data) pair: A typical Cosmos address contains 20 bytes of data and ~6 bytes of prefix. Bech32 encoded this is encoded to 45 bytes of ASCII. Alternatively, you can encode this as follows in 27 bytes of binary data: len(prefix) | prefix | data. From this you can encode to bech32 on demand.
I'm not promoting this option but want to put it on the table.
Most helpful comment
The more I dig-in, the more I am in favour of option-2. We might find some easy/hacky methods to gRPC gateway issues but option-2 looks clean comparatively.