Some context:
The scope of the bug:
Precondition: n >= 1, otherwise return INVALID. (Thanks @hwwhww for retrieving this from the specs)empty_aggregation_bits test case that described it as valid: https://github.com/ethereum/eth2.0-spec-tests/tree/v0.11.0/tests/minimal/phase0/operations/attestation/pyspec_tests/empty_aggregation_bits0xc00000.... signature is valid for no pubkeys.0x000... is not. 0xc000... is.SignatureSet, and then verifies it as a whole. This would allow the 0xc000 signature to pass without pubkeys I think. But not other invalid signatures.Action points:
Labeling it as a bug, although the spec is not necessarily wrong, it is definitely a case that should get attention to avoid clients splitting chains in the future.
@djrtwo @CarlBeek
There are going to be places in the protocol in the future where there will be a single aggregate signature, and not a list of aggregates; for example even in phase 1 there's the light client aggregate. And so in such a situation, the degenerate case is not going to be an empty list of attestations as it is elsewhere, it will be a single aggregate signature that is empty. To avoid requiring higher-layer extra logic around that case, it seems to me reasonable to make zero a valid signature for an empty list of pubkeys and messages. This case would get actually triggered when eg. somehow no one from the light client committee shows up to get included in a block.
We could add a separate rule banning empty attestations from phase 0 if desired; seems pointless to allow them.
We could add a separate rule banning empty attestations from phase 0 if desired; seems pointless to allow them.
But leaving them in does not break anything, and introducing breaking changes at this point should imo only be done to fix serious issues.
Preventing 0 attestations barely affects the scope of behavior and attack surface.
A rational actor would find more valuable attestations anyway, and not want to pack the block overly full, increasing the size of the block and time to compute.
As for the argument of “packing” the block with crap, the proposer could easily just put duplicate attestations (of high participation and thus more computational load) or just put a bunch of single bit attestations.
I am not concerned about the spam as much, but about it being a weird edge case that already has clients to disagree now, and likely in the future if we leave it as-is. Being more explicit about validity conditions would be good, especially because the BLS spec itself has its own requirements (the >= 1 case). And many libraries, such as the two Go version, simply crash or give undefined answers on empty pubkey list inputs. Having clients improvise to interpret the spec is what caused the current potential chain split.
But leaving them in does not break anything, and introducing breaking changes at this point should imo only be done to fix serious issues.
That's fair. Especially with compression, an empty attestation is close to zero overhead anyway. So I'd be happy with specifying that BLS aggregate signatures with 0 pubkeys are in general allowed.
Giving some feedback as per @protolambda's request :)
it seems to me reasonable to make zero a valid signature for an empty list of pubkeys and messages
This also seems reasonable to me. I assume the intention is that _only_ 0x00..00 is valid for 0 pubkeys? I prefer this otherwise we have some weird form of signature malleability.
I am not concerned about the spam
Same for me.
Being more explicit about validity conditions would be good
Agreed.
Being more explicit about validity conditions would be good, especially because the BLS spec itself has its own requirements (the >= 1 case).
This would make sense and I would agree with this, being explicit here is a definite plus. I would be ok with verifying that a signature was 0xc00... if no bits were set in the accompanying bitfield.
There are 2 layers where invalid signatures are handled:
Currently due to testing, fuzzing (and invalid signatures in deposits?) we have a wrapper to handle valid signatures at a low-level (i.e. point on the G2 curve) and invalid signatures but that we accept because of protocol or test/fuzzing requirements. We may add in the future another case with lazy loaded signatures (i.e. we stored it in a database on shutdown and reload it in the future).
type
BlsValueType* = enum
Real
OpaqueBlob
BlsValue*[N: static int, T] = object
# N = 48 for public keys and N = 96 for Signature
case kind*: BlsValueType
of Real:
blsValue*: T
of OpaqueBlob:
blob*: array[N, byte]
In the low-level layer, it is very easy to check if a signature is valid or not: it's not a point on the curve and the underlying library will not parse it. For example 0x0000... (full-zero) is invalid while 0xc000... is valid and a point at infinity. This is derived from elliptic curve maths and so does not need to be specified.
At the high-level layer, there is a huge potential for divergence and as @nisdas said, being explicit is important as are test cases.
In particular what is the zero signature:
0xc0000x0000Have we reached a conclusion yet?
From my point of view I think we have two options which I prefer option 1:
Request a change to the BLS specs such that if n = 0 i.e. PublicKeys = [] then [] aggregates to the identity. As it makes aggregation simpler i.e.
aggregate_public_key = 0
for key in public_keys:
aggregate_public_key += key
Rather than specifically handling the n = 0 case.
If this were the case we would simply have
[] and [0xc00...] (i.e. PublicKey at point 0)Leave BLS spec as is and handle in the Ethereum layer the case where AggregateSignature = 0x00 and publickeys = [] specifically to return true before interacting with the BLS layer.
I am opposed to Option 1 as it requires changes in the BLS spec (big or not, it's friction), and it doesn't solve for the real issue: clients have to get this right and not rely on libraries to work-around. If one library accepts 0 pubkeys, the other doesn't, and yet another is somewhere in between, then clients have to track those edge cases to stay in consensus. And properly implement work arounds.
Option 2 is better, as it does not require anything from the BLS spec, and is super straigtforward to implement and test:
0 participants is invalid case, unless signature is 0x000...> 0 participants is valid only if the signature is valid for those pubkeys.And better (IMHO) would be an Option 3, to remove one branch of decision making:
0 participants is an invalid case, no exceptions> 0 participants is valid only if the signature is valid for those pubkeys.I am opposed to Option 1 as it requires changes in the BLS spec (big or not, it's friction), and it doesn't solve for the real issue: clients have to get this right and not rely on libraries to work-around. If one library accepts 0 pubkeys, the other doesn't, and yet another is somewhere in between, then clients have to track those edge cases to stay in consensus. And properly implement work arounds.
From this option all logic would now be in the BLS layer and none in the Ethereum layer which is appealing. However, as you mentioned you would like public keys [] to return false which this will not.
I like this option mostly as I think the BLS spec is not implementation friendly in its current form and I would like to change it anyway.
And better (IMHO) would be an Option 3, to remove one branch of decision making:
0participants is an invalid case, no exceptions
> 0participants is valid only if the signature is valid for those pubkeys.
I do see the appeal of option 3 over option 2 in avoiding changes to BLS spec. And in that AggregateSignature 0x00.. will always be false and as the BLS spec is PublicKeys [] will also always be false.
From this option all logic would now be in the BLS layer and none in the Ethereum layer which is appealing. However, as you mentioned you would like public keys [] to return false which this will not.
I like this option mostly as I think the BLS spec is not implementation friendly in its current form and I would like to change it anyway.
Do you anticipate that it would be easy enough to get this included in the spec ? I would prefer to have everything under our control if possible,and the best way to do that would be to handle this in the application layer.
I would prefer option 3 as it is mostly straightforward to implement and I do not see any valid case for having an attestation with 0 participants. Its effectively just junk in the block that client's will have to process and verify.
Do you anticipate that it would be easy enough to get this included in the spec ? I would prefer to have everything under our control if possible,and the best way to do that would be to handle this in the application layer.
I'm confident at this stage if we wished to change the BLS spec it would get done. Though we are looking to freeze phase 0 and BLS spec is a bottleneck so maybe it's best to go with option 3 which as you say will also give us more control.
Additionally, if we go with option 3 and handle this case without interacting with BLS then it will be irrelevant if BLS makes any changes in this regard.
Ok, I'm making a PR for option 3 (empty set of participants is invalid as attestation to go onchain). If we need it to behave differently for some unforeseen phase 1+ case, we can simply start allowing the type of attestation we need, with backwards compatibility in regards to attestations already on-chain. Also, we indeed avoid being affected by BLS changes and library issues in any way, which is great.
What about the issue brought up by vitalik early in the thread? We know of cases where empty signatures will have to be included on chain.
In such a case, I suppose we have to be explicit -- e.g. if light_sig == 0x00: return at the top of process light sigs function.
EDIT: The fix in #1780 is specific to this one issue. If we go Option 3, we'll need to address each type of potentially empty aggregate specifically. This is probably fine
Copy-paste the discussion on #1799 https://github.com/ethereum/eth2.0-specs/pull/1799/files#r424421674
@mratsim:
Do we define the empty signature somewhere? Is that 0xc000...
@protolambda:
The python here would mean all zeroes (every ssz type is all zeroes by default). But agree that this is way to vague and can be confused with an empty signature.
@hwwhww:
I remember we hadEMPTY_SIGNATUREconstant in 2019. I would be happy to add it back.
@mratsim:
We can have,NO_SIGNATURE(0x0000...) andEMPTY_SIGNATURE(0xc000...) for clarity
@kirk-baird:
I've been treatingEMPTY_SIGNATUREas (0x0000...) as I believe that's what we called in it the spec in 2019. I'm happy to do a re-name and make
NO_SIGNATURE(0x0000...) (always verifies false)EMPTY_SIGNATURE(0xc000...)
However, we have to be careful here as
AggregatePublicKeys([])will always verify false according to the BLS Standard.
So
EMPTY_SIGNATUREwill only verify true against a list ofEMPTY_PUBLIC_KEYS. We must have1..*EMPTY_PUBLIC_KEYSverifying true. (Note: or some other combination of public keys that add to 0 mod r).
@mratsim:
OrEMPTY_SIGNATURE(0x0000) andZERO_SIGNATURE(0xc000) I'm not married to any name.
BeaconBlockBody.light_client_signature, when len(signer_pubkeys) == 0python
if len(signer_pubkeys) == 0:
assert block_body.light_client_signature == BLSSignature()
return
else:
assert bls.FastAggregateVerify(signer_pubkeys, signing_root, signature=block_body.light_client_signature)
ShardTransition.proposer_signature_aggregate, when len(shard_blocks) == 0assert bls.AggregateVerify(pubkeys, signing_roots, signature=transition.proposer_signature_aggregate)Optional 3 is good for attestation aggregation, however, for phase 1 cases, IMO I'm inclined to apply option 2 for these cases.
AggregateVerify(PKs, messages, signature)/FastAggregateVerify(PKs, message, signature) is invalid when PKs = [] according to the BLS standard, so we cannot set the non-signature to 0xc0... and make it pass the verification.def optional_fast_aggregate_verify(pubkeys, message, signature) -> bool:
if len(pubkeys) == 0:
return signature == NO_SIGNATURE
else:
return bls.FastAggregateVerify(pubkeys, message, signature)
def optional_aggregate_verify(pubkeys, messages, signature) -> bool:
if len(pubkeys) == 0:
return signature == NO_SIGNATURE
else:
return bls.AggregateVerify(pubkeys, messages, signature)
So it would be crystal clear that PKs == [] is possible. 🙂
Yes, being explicit like that is great, and avoids weird edge cases between implementations in the BLS spec, as readers have many different assumptions about 0 pubkeys, and the BLS spec does not cover it. With phase0 I prefer option 3, as it is easier to open it back up, than restrict it (since existing signatures still would need to be valid). For phase 1 that approach looks good.
@protolambda I think it's okay to close this issue?
@hwwhww yes, this was resolved. There are still surprises with the infinity-point pubkey/signature for implementers though, especially when changing to new BLS libraries (BLST). @paulhauner was looking into this. If you've any edge cases or things that should be documented/tested, please make a new issue and we'll look into it.
Most helpful comment
Copy-paste the discussion on #1799 https://github.com/ethereum/eth2.0-specs/pull/1799/files#r424421674
So far, we have two non-signature cases in phase 1:
BeaconBlockBody.light_client_signature, whenlen(signer_pubkeys) == 0python if len(signer_pubkeys) == 0: assert block_body.light_client_signature == BLSSignature() return else: assert bls.FastAggregateVerify(signer_pubkeys, signing_root, signature=block_body.light_client_signature)ShardTransition.proposer_signature_aggregate, whenlen(shard_blocks) == 0assert bls.AggregateVerify(pubkeys, signing_roots, signature=transition.proposer_signature_aggregate)My 2 cents
Optional 3 is good for attestation aggregation, however, for phase 1 cases, IMO I'm inclined to apply option 2 for these cases.
AggregateVerify(PKs, messages, signature)/FastAggregateVerify(PKs, message, signature)is invalid whenPKs = []according to the BLS standard, so we cannot set the non-signature to 0xc0... and make it pass the verification.So it would be crystal clear that
PKs == []is possible. 🙂