For initial introduction of options in GossipSub, please have a look at: https://github.com/libp2p/go-libp2p-pubsub/pull/359
The TLDR:
What now?
optional in protobuf definition)cc @jrhea @AgeManning @raulk
In case of the golang implementation, this would look like:
WithMessageSignaturePolicy(LaxNoSign) option, instead of WithStrictSignatureVerification(false), WithMessageSigning(false)pubsub.signID is not nil), but Signature is not (message signing is not enabled, like now)WithMessageSignaturePolicy(StrictNoSign), and WithNoAuthor().This can be done without changing consensus itself, but still affects the network enough to take extra care, enough time to update, and iterate on the plan before committing to it. It's just an unused option in golang pubsub for now, after some initial feedback.
Clients can help by:
Tracking table:
| Q | Lighthouse | Prysm | Teku | Nimbus | Lodestar |
|---|------------|-------|------|--------|----------|
| Is Signature required to be present? | N | N | N | N | N |
| Is Signature verified, if present? | N | Y | N | N | Y |
| Is Seqno required to be present? | N | N | N | N | N |
| Is From required to be present? | N | N | N | N | N |
| Do you set the From field? | Y | Y | Y | Y | Y |
| Do you randomize/zero the From field? | N | N | N | N | N |
| Do you set the Seqno field? | Y | Y | Y | Y | Y |
| Do you randomize the Seqno field?* | Y** | N | N | N | Y |
| Do you modify From/Seqno data upon propagation? | N | N | N | N | N |
*: Prysm inits with nano-second time, Jvm inits randomly. They increment by 1 (causes various minor privacy issues). (answered Yes is more like a No in that regard)
**: Lighthouse zeroes From/Seqno when authoring a message
Here's Nimbus (nim-libp2p) current behavior:
@dryajov, sorry, maybe I should have added more questions. When you say "optional, disabled for eth2" for the signature, does that mean you reject messages with a signature, or just ignore the signature?
It means that we don't require the signature to be present and if it is present we do not verify it for eth2. However, nim-libp2p allows to configure this, so one can enable it if desired.
js-libp2p-gossipsub+lodestar behavior:
Teku:
Based on the answers, it looks like none of the implementations modify seqno on propagation; however, I have seen it happen on Altona. Strange...

@jrhea thanks for looking into the data! Those may be from the validators that got slashed, and were publishing double attestations. There's a chance you produce the same vote (not slashable, by luck) when running duplicate nodes, resulting in two messages with same contents, but different seqno.
Can you show the distribution of seqno for those messages? We can narrow down which clients are affected based on the seqno range (prysm uses initial nanosecond time, teku inits with random and keeps incrementing, others always random)
Most of ours looks correct. On altona we set a 0 multihash/PeerId as the author rather than randomizing it. We dont modify the sequence number when propagating.
Those may be from the validators that got slashed, and were publishing double attestations. There's a chance you produce the same vote (not slashable, by luck) when running duplicate nodes, resulting in two messages with same contents, but different seqno.
This dataset only contains blocks

Can you show the distribution of seqno for those messages? We can narrow down which clients are affected based on the seqno range (prysm uses initial nanosecond time, teku inits with random and keeps incrementing, others always random)
Sure can...I had something like that in the works, but didn't get a chance to circle back to it today. Will take a look tonight.
This dataset only contains blocks
Whoops, my bad. The same can still happen with double block proposals though.
Will take a look
Awesome, thanks
Creating a compelling visual to provide the distribution of seqnos by client was non-trival (or i am just too tired). Instead, I have a visual of the distribution of message id's with multiple seqnos by client. btw, I have issues staying connected with Teku 😶 that's why it is not in the data.
To orient you, this is what the data looks like in a table after some massaging:

Here is the same data visualized with a heatmap:

At first, it looks like Lighthouse is the main culprit, but let's see what fraction of total messages were received by each client:

The distribution isn't exactly uniform so I normalize it and attempt the heatmap visualization again:

This normalized view of the data doesn't provide overwhelming evidence of anything, but it does indicate that Nimbus might be the culprit (I wouldn't exactly bet my life on it though). One more interesting fact, there isn't a single message-id with multiple sequence numbers that wasn't delivered by Nimbus at least one time. Remember the table I posted at the beginning of this comment? None of the rows have a 0 in the Nimbus column 🤷♂️.
Otoh, these modified message-ids with modified seqno's only make up ~ 3% of the messages received from nimbus so I don't know how compelling this really is. That being said, it is all i have right now and maybe @dryajov wouldn't mind double checking the nimbus logic to see if it possibly is modifying seqno. In the meantime, I will be in a straight jacket in a padded room.
@jrhea, any way we can check the timestamp of the message?
We only recently (~2w ago) added proper message id generation, before it used to be the default seqno+peerid, so we might have been forwarding/generating duplicate messages. There was also another (~1w ago) recent change, which reverted our seqno to a simple counter that restarts on each run, before it used to be a randomly generated number.
Sorry for the vague dates and I'll try to get to the bottom of this in the morning, but I suspect that any or both of this changes might have had something to do with it. Thanks for the heads up 👍
@jrhea, any way we can check the timestamp of the message?
Sure thing! Here are the min and max timestamps for nimbus messages that I collected
min: Saturday, July 18, 2020 11:52:42.804 AM
max: Monday, July 20, 2020 10:58:31.662 AM
Not sure if that lines up or if the changes made it to Altona nodes. Lmk if you need more info about the nodes.
Forgot to update lighthouse's side:
@AgeManning Thanks, updated the table
Teku is now the only one in the table still requiring one of the fields: the seqno is still required to be present. @Nashatyrev is this still the case? Can you check if the table matches Teku?
Once Teku allows the seqno to be missing, all clients do not require the fields anymore, and we can plan a grace period: we give users time to update their nodes, and then start leaving out the fields of the messages after the period is over. During this grace period we can do some testing on a devnet (maybe reuse https://github.com/protolambda/fafafa) to see if the new behavior works as expected.
@protolambda
Update from Teku:
Teku:
@Nashatyrev is there an option to omit the seqno yet? It looks like it is still setting it (a new one is generated if you ask it to not have a seqno): https://github.com/libp2p/jvm-libp2p/blob/c0cfb97e68f83f363063c15ca7614df95aa4ce55/src/main/kotlin/io/libp2p/pubsub/PubsubApiImpl.kt#L46
In the near future messages with seqno will be invalid, so it is important to coordinate the changes and current behavior of clients here. With testing soon hopefully as well, working on that.
Edit: also, upon propagation, these fields should be omitted (completely, i.e. exclude the optional protobuf key-value completely, not just an empty value)
Hey @protolambda, no Teku doesn't have this option yet. To make a PR I'd like to clarify a couple of questions below:
In the near future messages with seqno will be invalid
Does the same apply to From and Signature ?
Edit: also, upon propagation, these fields should be omitted
I see 2 options with this regard:
From, Sig or SeqNo field is not nil then the message assumed invalid (and correspondingly not propagated) From, Sig and SeqNo fields of inbound message and then propagate it. (message is not assumed invalid) I thought we were going to stick to the option (1). Am I missing something?
@Nashatyrev
Does the same apply to From and Signature ?
Yes, the intention is to omit all unused (w.r.t. eth2) optional (w.r.t. gossipsub) fields.
I see 2 options with this regard:
Intention is indeed option (1). We could do option (2) without breaking anything, but if nobody adds these fields to begin with, that is much cleaner (no extra data that is known to be unused for good reasons).
What I meant with "upon propagation" is that even if you didn't create the message, and are propagating it, that the fields should not be added to the message (assuming the message was valid, and didn't contain the fields to begin with).
Also, to clarify just in case, there is a difference between an empty value, and a nil value: the gossip message fields are optional and we should be able to completely omit the key-value pair. In go it's a difference between []byte (empty value, present key) and nil (fully omitted) as value. I'm not sure how it works in Java, maybe a similar pattern with null or Optional works.
One point of concern though is that if fields are unrecognized (i.e. a gossipsub implementation using a protobuf definition that does not include these optional fields), and if the fields are still passed along, then we have an issue when it comes to peer-scoring: a peer may unknowingly propagate messages with unasked for fields. This may be the case with Nimbus.
Also, now I wonder what happens if other unknown indices are used as keys for protobuf entries, and happen to be propagated (I think in Go this would happen through the XXX_unrecognized field that holds the unrecognized data). These could function like "trackers" attached to messages, and with some malicious intent, help identify the peer relations between external nodes.
I'll raise this during the networking call, some thought and questions from others definitely helps (thanks @Nashatyrev). And I am personally not a fan of protobuf for protocols like this.
@protolambda just merged the Teku PR above.
Teku now doesn't set From, Sig and SeqNo fields.
However it still doesn't deny messages with fields and propagating them 'as is'. This is a temporary solution and we will change Teku to drop such messages as soon as all clients update. Patching inbound messages (nullify their fields) for further propagation would require some extra effort on Libp2p and Teku side so we'd better wait for other clients to update.
Opened a libp2p specs PR to standardize the policies around these fields and signature checking: https://github.com/libp2p/specs/pull/294
This was implemented in all clients, as option in libp2p, and included in the libp2p pubsub spec. Closing this old issue.