Eth2.0-specs: GossipSub: disabling unused From/Seqno/Signature

Created on 23 Jul 2020  ·  23Comments  ·  Source: ethereum/eth2.0-specs

For initial introduction of options in GossipSub, please have a look at: https://github.com/libp2p/go-libp2p-pubsub/pull/359

The TLDR:

  • fields in the Message protobuf in gossipsub are optional
  • some clients use it, some randomize it, some verify it
  • it's a privacy issue
  • with enough effort, you can cause a little peer-scoring chaos in the network by crafting bad messages

What now?

  • Not part of the planned spec release. No immediate breaking changes.
  • Need to coordinate to disable verification of From/Seqno/Signature/Key (we don't use it, message-ID is content-based, it only uses Data), and accept it if the data is omitted from any message (already optional in protobuf definition)
  • Proposal: Give it a week or two for every Medalla node to update into no-verification mode, where these fields can be nil
  • Then later move into strict mode, where messages that use any of the fields are not accepted anymore.
  • Result: avoid privacy and verification issues of fields that are not used in Eth2.

cc @jrhea @AgeManning @raulk

libp2p networking privacy

All 23 comments

In case of the golang implementation, this would look like:

  • Start using the WithMessageSignaturePolicy(LaxNoSign) option, instead of WithStrictSignatureVerification(false), WithMessageSigning(false)

    • From and Seqno will still be put in messages (pubsub.signID is not nil), but Signature is not (message signing is not enabled, like now)

  • A week or two later, when the network is ready: use WithMessageSignaturePolicy(StrictNoSign), and WithNoAuthor().

    • From, Seqno, Signature will all be nil (i.e. optional protobuf field is omitted).

    • Messages of clients that did not update, and are still producing Seqno and From, will be dropped.

This can be done without changing consensus itself, but still affects the network enough to take extra care, enough time to update, and iterate on the plan before committing to it. It's just an unused option in golang pubsub for now, after some initial feedback.

Clients can help by:

  • share their current behavior (quick yes/no answers):

    • Is Signature required to be present?

    • Is Signature verified, if present?

    • Is Seqno required to be present?

    • Is From required to be present?

    • Do you set the From field?

    • Do you randomize/zero the From field?

    • Do you set the Seqno field?

    • Do you randomize the Seqno field?

    • Do you modify From/Seqno data upon propagation?

  • share your thoughts about updating

Tracking table:

| Q | Lighthouse | Prysm | Teku | Nimbus | Lodestar |
|---|------------|-------|------|--------|----------|
| Is Signature required to be present? | N | N | N | N | N |
| Is Signature verified, if present? | N | Y | N | N | Y |
| Is Seqno required to be present? | N | N | N | N | N |
| Is From required to be present? | N | N | N | N | N |
| Do you set the From field? | Y | Y | Y | Y | Y |
| Do you randomize/zero the From field? | N | N | N | N | N |
| Do you set the Seqno field? | Y | Y | Y | Y | Y |
| Do you randomize the Seqno field?* | Y** | N | N | N | Y |
| Do you modify From/Seqno data upon propagation? | N | N | N | N | N |

*: Prysm inits with nano-second time, Jvm inits randomly. They increment by 1 (causes various minor privacy issues). (answered Yes is more like a No in that regard)
**: Lighthouse zeroes From/Seqno when authoring a message

Here's Nimbus (nim-libp2p) current behavior:

  • Is a signature required to be present? - No

    • optional, disabled for eth2

  • Are signature verified, if present? - No

    • optional, disabled for eth2

  • Is Seqno required to be present? - No
  • Is From required to be present? - No
  • Do you set the From field? - Yes
  • Do you randomize/zero the From field? - No
  • Do you set the Seqno field? - Yes
  • Do you randomize the Seqno field? - No
  • Do you modify From/Seqno data upon propagation? - No

@dryajov, sorry, maybe I should have added more questions. When you say "optional, disabled for eth2" for the signature, does that mean you reject messages with a signature, or just ignore the signature?

It means that we don't require the signature to be present and if it is present we do not verify it for eth2. However, nim-libp2p allows to configure this, so one can enable it if desired.

js-libp2p-gossipsub+lodestar behavior:

  • Is a signature required to be present? - N
  • Are signature verified, if present? - Y
  • Is Seqno required to be present? - N
  • Is From required to be present? - N
  • Do you set the From field? - Y
  • Do you randomize/zero the From field? - N
  • Do you set the Seqno field? - Y
  • Do you randomize the Seqno field? - Y
  • Do you modify From/Seqno data upon propagation? - N

Teku:

  • Is a signature required to be present? - N
  • Are signature verified, if present? - N
  • Is Seqno required to be present? - Y
  • Is From required to be present? - N
  • Do you set the From field? - Y
  • Do you randomize/zero the From field? - N
  • Do you set the Seqno field? - Y
  • Do you randomize the Seqno field? - Y*
  • Do you modify From/Seqno data upon propagation? - N
    (*) Randomize on startup, then increment

Based on the answers, it looks like none of the implementations modify seqno on propagation; however, I have seen it happen on Altona. Strange...

image

@jrhea thanks for looking into the data! Those may be from the validators that got slashed, and were publishing double attestations. There's a chance you produce the same vote (not slashable, by luck) when running duplicate nodes, resulting in two messages with same contents, but different seqno.

Can you show the distribution of seqno for those messages? We can narrow down which clients are affected based on the seqno range (prysm uses initial nanosecond time, teku inits with random and keeps incrementing, others always random)

Most of ours looks correct. On altona we set a 0 multihash/PeerId as the author rather than randomizing it. We dont modify the sequence number when propagating.

Those may be from the validators that got slashed, and were publishing double attestations. There's a chance you produce the same vote (not slashable, by luck) when running duplicate nodes, resulting in two messages with same contents, but different seqno.

This dataset only contains blocks

image

Can you show the distribution of seqno for those messages? We can narrow down which clients are affected based on the seqno range (prysm uses initial nanosecond time, teku inits with random and keeps incrementing, others always random)

Sure can...I had something like that in the works, but didn't get a chance to circle back to it today. Will take a look tonight.

This dataset only contains blocks

Whoops, my bad. The same can still happen with double block proposals though.

Will take a look

Awesome, thanks

Creating a compelling visual to provide the distribution of seqnos by client was non-trival (or i am just too tired). Instead, I have a visual of the distribution of message id's with multiple seqnos by client. btw, I have issues staying connected with Teku 😶 that's why it is not in the data.

To orient you, this is what the data looks like in a table after some massaging:

image

Here is the same data visualized with a heatmap:

image

At first, it looks like Lighthouse is the main culprit, but let's see what fraction of total messages were received by each client:

image

The distribution isn't exactly uniform so I normalize it and attempt the heatmap visualization again:

image

This normalized view of the data doesn't provide overwhelming evidence of anything, but it does indicate that Nimbus might be the culprit (I wouldn't exactly bet my life on it though). One more interesting fact, there isn't a single message-id with multiple sequence numbers that wasn't delivered by Nimbus at least one time. Remember the table I posted at the beginning of this comment? None of the rows have a 0 in the Nimbus column 🤷‍♂️.

Otoh, these modified message-ids with modified seqno's only make up ~ 3% of the messages received from nimbus so I don't know how compelling this really is. That being said, it is all i have right now and maybe @dryajov wouldn't mind double checking the nimbus logic to see if it possibly is modifying seqno. In the meantime, I will be in a straight jacket in a padded room.

@jrhea, any way we can check the timestamp of the message?

We only recently (~2w ago) added proper message id generation, before it used to be the default seqno+peerid, so we might have been forwarding/generating duplicate messages. There was also another (~1w ago) recent change, which reverted our seqno to a simple counter that restarts on each run, before it used to be a randomly generated number.

Sorry for the vague dates and I'll try to get to the bottom of this in the morning, but I suspect that any or both of this changes might have had something to do with it. Thanks for the heads up 👍

@jrhea, any way we can check the timestamp of the message?

Sure thing! Here are the min and max timestamps for nimbus messages that I collected

min: Saturday, July 18, 2020 11:52:42.804 AM
max: Monday, July 20, 2020 10:58:31.662 AM

Not sure if that lines up or if the changes made it to Altona nodes. Lmk if you need more info about the nodes.

Forgot to update lighthouse's side:

  • Is Signature required to be present?: N (at some point we can reject messages that have signatures)
  • Is Signature verified, if present?: N
  • Is Seqno required to be present?: N (at some point we can reject messages that have seq no)
  • Is From required to be present?: N (at some point we can reject messages that have from )
  • Do you set the From field?: Y (we use an identity peer_id - but will remove this as other clients appear to support no from field)
  • Do you randomize/zero the From field?: N
  • Do you set the Seqno field?: Y (we use a random number, but will remove this, once supported by Teku)
  • Do you randomize the Seqno field?: Y
  • Do you modify From/Seqno data upon propagation?: N

@AgeManning Thanks, updated the table

Teku is now the only one in the table still requiring one of the fields: the seqno is still required to be present. @Nashatyrev is this still the case? Can you check if the table matches Teku?

Once Teku allows the seqno to be missing, all clients do not require the fields anymore, and we can plan a grace period: we give users time to update their nodes, and then start leaving out the fields of the messages after the period is over. During this grace period we can do some testing on a devnet (maybe reuse https://github.com/protolambda/fafafa) to see if the new behavior works as expected.

@protolambda
Update from Teku:

Teku:

  • Is Seqno required to be present? - N

@Nashatyrev is there an option to omit the seqno yet? It looks like it is still setting it (a new one is generated if you ask it to not have a seqno): https://github.com/libp2p/jvm-libp2p/blob/c0cfb97e68f83f363063c15ca7614df95aa4ce55/src/main/kotlin/io/libp2p/pubsub/PubsubApiImpl.kt#L46

In the near future messages with seqno will be invalid, so it is important to coordinate the changes and current behavior of clients here. With testing soon hopefully as well, working on that.

Edit: also, upon propagation, these fields should be omitted (completely, i.e. exclude the optional protobuf key-value completely, not just an empty value)

Hey @protolambda, no Teku doesn't have this option yet. To make a PR I'd like to clarify a couple of questions below:

In the near future messages with seqno will be invalid

Does the same apply to From and Signature ?

Edit: also, upon propagation, these fields should be omitted

I see 2 options with this regard:

  1. If From, Sig or SeqNo field is not nil then the message assumed invalid (and correspondingly not propagated)
  2. Nullify From, Sig and SeqNo fields of inbound message and then propagate it. (message is not assumed invalid)

I thought we were going to stick to the option (1). Am I missing something?

@Nashatyrev

Does the same apply to From and Signature ?

Yes, the intention is to omit all unused (w.r.t. eth2) optional (w.r.t. gossipsub) fields.

I see 2 options with this regard:

Intention is indeed option (1). We could do option (2) without breaking anything, but if nobody adds these fields to begin with, that is much cleaner (no extra data that is known to be unused for good reasons).

What I meant with "upon propagation" is that even if you didn't create the message, and are propagating it, that the fields should not be added to the message (assuming the message was valid, and didn't contain the fields to begin with).

Also, to clarify just in case, there is a difference between an empty value, and a nil value: the gossip message fields are optional and we should be able to completely omit the key-value pair. In go it's a difference between []byte (empty value, present key) and nil (fully omitted) as value. I'm not sure how it works in Java, maybe a similar pattern with null or Optional works.


One point of concern though is that if fields are unrecognized (i.e. a gossipsub implementation using a protobuf definition that does not include these optional fields), and if the fields are still passed along, then we have an issue when it comes to peer-scoring: a peer may unknowingly propagate messages with unasked for fields. This may be the case with Nimbus.

Also, now I wonder what happens if other unknown indices are used as keys for protobuf entries, and happen to be propagated (I think in Go this would happen through the XXX_unrecognized field that holds the unrecognized data). These could function like "trackers" attached to messages, and with some malicious intent, help identify the peer relations between external nodes.

I'll raise this during the networking call, some thought and questions from others definitely helps (thanks @Nashatyrev). And I am personally not a fan of protobuf for protocols like this.

@protolambda just merged the Teku PR above.
Teku now doesn't set From, Sig and SeqNo fields.
However it still doesn't deny messages with fields and propagating them 'as is'. This is a temporary solution and we will change Teku to drop such messages as soon as all clients update. Patching inbound messages (nullify their fields) for further propagation would require some extra effort on Libp2p and Teku side so we'd better wait for other clients to update.

Opened a libp2p specs PR to standardize the policies around these fields and signature checking: https://github.com/libp2p/specs/pull/294

This was implemented in all clients, as option in libp2p, and included in the libp2p pubsub spec. Closing this old issue.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

Nashatyrev picture Nashatyrev  ·  4Comments

protolambda picture protolambda  ·  4Comments

paulhauner picture paulhauner  ·  4Comments

hwwhww picture hwwhww  ·  3Comments

benjaminion picture benjaminion  ·  3Comments