Eth2.0-specs: hash_to_G2 input message size: Why bytes32?

Created on 5 Feb 2019 · 5 Comments · Source: ethereum/eth2.0-specs

Currently the specs mention that "message" should be bytes32:

def hash_to_G2(message: bytes32, domain: uint64) -> [uint384]:
    # Initial candidate x coordinate
    x_re = int.from_bytes(hash(message + bytes8(domain) + b'\x01'), 'big')
    x_im = int.from_bytes(hash(message + bytes8(domain) + b'\x02'), 'big')
    x_coordinate = Fq2([x_re, x_im])  # x = x_re + i * x_im

    # Test candidate y coordinates until one is found
    while 1:
        y_coordinate_squared = x_coordinate ** 3 + Fq2([4, 4])  # The curve is y^2 = x^3 + 4(i + 1)
        y_coordinate = modular_squareroot(y_coordinate_squared)
        if y_coordinate is not None:  # Check if quadratic residue found
            return multiply_in_G2((x_coordinate, y_coordinate), G2_cofactor)
        x_coordinate += Fq2([1, 0])  # Add 1 and try again

This prevents passing a longer message as input, for example the third of the messages proposed for testing:

MESSAGES = [
    b'message',
    b'Bigger message',
    b'Very .............. long ............. message .... with entropy: 1234567890-beacon-chain'
]

If longer messages are ever needed, they would have to be hashed once before being passed to hash_to_G2, which then hashes its input again internally.

Accepting arbitrary-length bytes instead would only change the type signature.
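To illustrate the pre-hashing step described above, here is a minimal sketch. It assumes SHA-256 from Python's hashlib as a stand-in for the spec's hash function (the spec's actual hash may differ), and to_message_hash is a hypothetical helper name, not something defined in the spec.

```python
import hashlib

MESSAGES = [
    b'message',
    b'Bigger message',
    b'Very .............. long ............. message .... with entropy: 1234567890-beacon-chain',
]


def to_message_hash(message: bytes) -> bytes:
    """Hypothetical helper: reduce a message of any length to 32 bytes."""
    # Assumption: SHA-256 stands in for the spec's `hash` function.
    return hashlib.sha256(message).digest()


# Every message, short or long, becomes a 32-byte value that fits
# the bytes32 parameter of hash_to_G2.
message_hashes = [to_message_hash(m) for m in MESSAGES]
```

With this step in front, the long third message no longer conflicts with the bytes32 signature, at the cost of one extra hash per call.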


mratsim

All 5 comments

Shouldn't this parameter be treated as a message hash? Maybe rename it to message_hash for clarity?

mkalinin on 5 Feb 2019

👍1

the thing that is commonly used in that place across signature algos is some hash of the message -- i would support a renaming if it had broad support -- otherwise this is the kind of thing where if you work in the domain you know what it means from context; i agree it is not perfectly precise :)

ralexstokes on 7 Feb 2019

👍1

It's also a minor win to force a constant message size, since that reduces the runtime variability of hash_to_G2.

+1 message_hash
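A rough sketch of why a fixed-size input helps: with a 32-byte message_hash, the preimage fed to each inner hash call always has the same length. The helper name inner_hash_input is hypothetical, and the 8-byte big-endian encoding of domain is an assumption about what bytes8(domain) does.

```python
def inner_hash_input(message_hash: bytes, domain: int, suffix: bytes) -> bytes:
    """Hypothetical helper: build the preimage used for x_re / x_im."""
    if len(message_hash) != 32:
        raise ValueError("message_hash must be exactly 32 bytes")
    # Assumption: bytes8(domain) is an 8-byte big-endian encoding.
    return message_hash + domain.to_bytes(8, 'big') + suffix


# 32 bytes of hash + 8 bytes of domain + 1 suffix byte = 41 bytes, always.
preimage_re = inner_hash_input(b'\x00' * 32, 1, b'\x01')
preimage_im = inner_hash_input(b'\xff' * 32, 2**64 - 1, b'\x02')
```

The try-and-increment loop that searches for a valid y coordinate still takes a variable number of iterations, so this only removes the variability that comes from message length.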

djrtwo on 7 Feb 2019

Agree on message_hash. Another thing to keep in mind is that the hash passed in here is sometimes a simple hash of a byte array, but it can also be an SSZ tree hash. So using message_hash here is the right level of abstraction.