See https://github.com/tendermint/tendermint/issues/5796 and https://github.com/tendermint/tendermint/issues/3920#issuecomment-748963728
Large messages (or many messages) produced by other reactors and scheduled to be sent can temporarily block the consensus reactor from making progress.
Either the libs/flow library does not perform as expected, or the p2p/conn/connection scheduling logic is invalid.
No halting. Tendermint's top priority is exchanging votes and making blocks with a few transactions + evidence (if any). Consensus reactor messages should have top priority, while others (e.g. mempool gossip, evidence) should have a lower priority.
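To make the expected behavior concrete, here is a minimal sketch of strict-priority dispatch across reactor channels. It is not the actual p2p/conn/connection code: the `Channel` type, its fields, and `NextMessage` are hypothetical names chosen for illustration, and the sketch assumes large messages have already been split into small packets so that a single mempool message cannot occupy the stream for long.

```go
package main

// Channel is a hypothetical outbound queue for one reactor channel.
// It is an illustration only, not a type from the Tendermint code base.
type Channel struct {
	Name     string
	Priority int      // higher value is served first
	Queue    [][]byte // pending packets, oldest first
}

// NextMessage pops the oldest packet from the non-empty channel with the
// highest priority. It returns false when every channel is empty.
func NextMessage(chans []*Channel) (string, []byte, bool) {
	var best *Channel
	for _, c := range chans {
		if len(c.Queue) == 0 {
			continue // nothing to send on this channel
		}
		if best == nil || c.Priority > best.Priority {
			best = c
		}
	}
	if best == nil {
		return "", nil, false
	}
	msg := best.Queue[0]
	best.Queue = best.Queue[1:]
	return best.Name, msg, true
}
```

With a consensus channel at priority 10 and a mempool channel at priority 1, a queued vote is always sent before any queued transaction. The trade-off is that strict priority can starve low-priority channels under sustained consensus traffic, so a weighted policy may be preferable in practice.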
This is basically #2888. The new P2P stack will have separate queues per reactor channel.
How's that? The issue here is incorrect dispatching (sending) while multiplexing over a single TCP stream, while #2888 is about an individual Reactor#Receive call blocking the receipt of further messages => sending != receiving. But you're right that if we adopt QUIC (independent streams), this issue will go away.
Ah, got it -- you're right, different issue. The new P2P stack will handle this as well, by having separate outbound queues per peer with some scheduling policy, and dropping messages if a peer can't keep up, to avoid blocking reactors.
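As a rough illustration of the "drop instead of block" idea, the sketch below shows a bounded per-peer outbound queue whose send never blocks the calling reactor. The names `PeerQueue`, `NewPeerQueue`, `TrySend`, and `Next` are hypothetical and are not taken from the new P2P stack; the scheduling policy itself is left out.

```go
package main

// PeerQueue is a hypothetical bounded outbound queue for a single peer.
// It is an illustration only, not the design of the new P2P stack.
type PeerQueue struct {
	msgs chan []byte
	// Dropped counts rejected messages. It is not synchronized, so this
	// sketch assumes a single goroutine calls TrySend.
	Dropped int
}

// NewPeerQueue creates a queue that buffers at most capacity messages.
func NewPeerQueue(capacity int) *PeerQueue {
	return &PeerQueue{msgs: make(chan []byte, capacity)}
}

// TrySend enqueues msg without blocking. If the peer has fallen behind and
// the buffer is full, the message is dropped and counted instead.
func (q *PeerQueue) TrySend(msg []byte) bool {
	select {
	case q.msgs <- msg:
		return true
	default:
		q.Dropped++
		return false
	}
}

// Next returns the oldest queued message, or false if the queue is empty.
func (q *PeerQueue) Next() ([]byte, bool) {
	select {
	case m := <-q.msgs:
		return m, true
	default:
		return nil, false
	}
}
```

A reactor calling `TrySend` on a slow peer's full queue gets `false` back immediately rather than stalling, so one slow peer cannot hold up sends to the others.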