Currently the specs mention that "message" should be bytes32:
def hash_to_G2(message: bytes32, domain: uint64) -> [uint384]:
    # Initial candidate x coordinate
    x_re = int.from_bytes(hash(message + bytes8(domain) + b'\x01'), 'big')
    x_im = int.from_bytes(hash(message + bytes8(domain) + b'\x02'), 'big')
    x_coordinate = Fq2([x_re, x_im])  # x = x_re + i * x_im
    # Test candidate y coordinates until a one is found
    while 1:
        y_coordinate_squared = x_coordinate ** 3 + Fq2([4, 4])  # The curve is y^2 = x^3 + 4(i + 1)
        y_coordinate = modular_squareroot(y_coordinate_squared)
        if y_coordinate is not None:  # Check if quadratic residue found
            return multiply_in_G2((x_coordinate, y_coordinate), G2_cofactor)
        x_coordinate += Fq2([1, 0])  # Add 1 and try again
This prevents using a longer message as input, for example the third of the messages that were proposed for testing:
MESSAGES = [
    b'message',
    b'Bigger message',
    b'Very .............. long ............. message .... with entropy: 1234567890-beacon-chain'
]
Assuming there are situations where longer messages are needed, we would need to hash the message once before passing it to hash_to_G2, which then hashes its input again internally.
The only impact is on the type signature.
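A minimal sketch of the pre-hashing step described above. SHA-256 is an assumption standing in for the spec's `hash` function, and `to_message_hash` is a hypothetical helper name, not something the spec defines:

```python
import hashlib

MESSAGES = [
    b'message',
    b'Bigger message',
    b'Very .............. long ............. message .... with entropy: 1234567890-beacon-chain',
]


def to_message_hash(message: bytes) -> bytes:
    """Reduce an arbitrary-length message to 32 bytes (hypothetical helper).

    SHA-256 is an assumption; substitute whatever hash the spec mandates.
    """
    return hashlib.sha256(message).digest()


# Each digest has a fixed length, so it fits a bytes32 parameter
# no matter how long the original message was.
message_hashes = [to_message_hash(m) for m in MESSAGES]
```

With this step applied by the caller, all three test messages become valid inputs without changing the body of hash_to_G2.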
Shouldn't this parameter be treated as a message hash? Maybe rename it to message_hash for clarity?
the thing that is commonly used in that place across signature algos is some hash of the message -- i would support a renaming if it had broad support -- otherwise this is the kind of thing where if you work in the domain you know what it means from context; i agree it is not perfectly precise :)
It's also a minor win to force a constant message size, since that reduces the variability of the runtime of hash_to_G2.
+1 message_hash
Agree on message_hash. Another thing to keep in mind is that the hash that gets passed in here is sometimes a simple hash of a byte array, but it can also be an SSZ tree hash. So using message_hash here is the right level of abstraction.
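To illustrate the renaming, here is a rough sketch of only the first step of hash_to_G2 (deriving the candidate x parts) with the parameter called `message_hash` and an explicit length check. The function name `candidate_x_parts` is hypothetical, SHA-256 stands in for the spec's `hash`, and big-endian encoding for `bytes8(domain)` is an assumption:

```python
import hashlib


def candidate_x_parts(message_hash: bytes, domain: int) -> tuple:
    """Derive (x_re, x_im) from a 32-byte message hash (illustrative only)."""
    # Enforce the fixed size that the bytes32 annotation implies.
    if len(message_hash) != 32:
        raise ValueError("message_hash must be exactly 32 bytes")
    # Assumed equivalent of bytes8(domain): 8 bytes, big-endian.
    domain_bytes = domain.to_bytes(8, 'big')
    x_re = int.from_bytes(
        hashlib.sha256(message_hash + domain_bytes + b'\x01').digest(), 'big')
    x_im = int.from_bytes(
        hashlib.sha256(message_hash + domain_bytes + b'\x02').digest(), 'big')
    return x_re, x_im


# The caller decides how the 32 bytes are produced: a plain hash of a
# byte array, as here, or an SSZ tree hash root.
x_re, x_im = candidate_x_parts(hashlib.sha256(b'Bigger message').digest(), 1)
```

The function never needs to know how the 32 bytes were obtained, which is the abstraction the name message_hash is meant to convey.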