Tendermint: Transactions are stuck in mempool and not getting propagated to other nodes

Created on 6 Jul 2018 · 24 comments · Source: tendermint/tendermint

BUG REPORT

Tendermint version
0.22.0-fa05b039

ABCI app
Custom Internal

Environment:

  • Ubuntu 64-bit
  • 4 Validator Nodes
  • Tx size ~ 80KB

What happened:
After a non-deterministic period of time, transactions that were successfully validated by CheckTx are neither gossiped to other nodes nor removed from the mempool, and the node can no longer be used.
What you expected to happen:
After a tx is validated by CheckTx, it should be gossiped to the other nodes and the proposer node should call BeginBlock.
How to reproduce it:
We can reproduce this consistently with the setup described above, though the time until it happens is not deterministic.

Config:
create_empty_blocks = false
p2p params as described in "Running in production"

/dump_consensus_state output for consensus bugs
consensus node 1:
https://gist.github.com/yuomii/5acbd97d2935b7d1dd61d98e2ce5adb8
consensus node 2:
https://gist.github.com/yuomii/ab11b2914753080b39cd3923b07f5d6d
consensus node 3:
https://gist.github.com/yuomii/23f74831f9031b213186c00d83fd35eb
consensus node 4:
https://gist.github.com/yuomii/215862af5e899092987b41852e13018c

Logs:
Full Logs (log level debug):
node 1:
https://gist.github.com/yuomii/918ed5271a909613b37649f1b8d152f9
node 2:
https://gist.github.com/yuomii/d80a783cdb088039e1cc78689917c697
node 3:
https://gist.github.com/yuomii/96b0e482d697d46e27648d3e73c36ae8
node 4:
https://gist.github.com/yuomii/2082b674c996386499144c819785a164

Seems to be related to #1875

EDIT:
dump_consensus_state was called ~2h later. After ~2h we noticed that blocks were being built again.

mempool bug

Most helpful comment

Merged to develop. Will be shipped with 0.22.5 or next breaking release.

All 24 comments

We can reproduce this consistently with the setup described above, though the time until it happens is not deterministic.

Can you clarify with some details around how many txs you are sending and to which nodes and at what rate?

Also it seems some of the logs are out of order? Height and time flow backwards at points. Is this raw output from Tendermint?

This is produced by a single test case running in an infinite loop. The test uses different payloads: three very small ones and two big ones of around (EDIT:) ~80KB. The rate is around 1 tx per 2 seconds. I can't tell you exactly which payload is sent to which node, because our application checks whether the node we are currently broadcasting to is in syncing state; if it is, we switch the node we broadcast to on the fly.

This is raw output from Tendermint.

I replaced (find/replace) the real node IPs in the log files because of NDA and such. If that's a problem for you, we need to find a way to get the log files to you over a private channel.

I rechecked the logs and picked the biggest tx. The big ones are actually up to ~80KB.
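To make the load pattern described above concrete, here is a minimal sketch of such a broadcast loop. This is an illustration only, not our internal test case: it assumes the standard Tendermint RPC URI endpoint /broadcast_tx_sync?tx=0x<hex>, uses placeholder node addresses and random payloads, and the real test additionally inspects each node's syncing state before broadcasting; very large payloads may need the JSON-RPC POST interface instead of the URI interface.

// Minimal sketch of the load pattern described above (placeholders only,
// not the actual internal test case).
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"time"
)

func main() {
	// placeholder node addresses; the real test switches nodes when one is syncing
	nodes := []string{"http://127.0.0.1:26657", "http://127.0.0.1:26660"}
	// three small and two larger payload sizes; the real large payloads are ~80KB
	sizes := []int{200, 500, 1024, 8 * 1024, 16 * 1024}

	for i := 0; ; i++ {
		node := nodes[i%len(nodes)]
		payload := make([]byte, sizes[i%len(sizes)])
		rand.Read(payload)

		// broadcast_tx_sync via the RPC URI interface (tx as 0x-prefixed hex)
		url := fmt.Sprintf("%s/broadcast_tx_sync?tx=0x%s", node, hex.EncodeToString(payload))
		resp, err := http.Get(url)
		if err != nil {
			fmt.Println("broadcast failed:", err)
		} else {
			resp.Body.Close()
		}
		time.Sleep(2 * time.Second) // ~1 tx every 2 seconds, as in the report
	}
}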

Any updates on this?

Currently retesting with lower tx size:

Proposer stuck in step 2
Other nodes stuck in step 1

4 Nodes.
4 Validators.

Debug logs will follow. I had commented on https://github.com/tendermint/tendermint/issues/1961 but am deleting the comments from there, as we don't see the "insufficient voting power" message.

When we restart the stuck proposer, the chain progresses and processes the unconfirmed txs, until it gets stuck again at a proposal.

Currently staring at

// IncrementAccum increments accum of each validator and updates the
// proposer. Panics if validator set is empty.
func (valSet *ValidatorSet) IncrementAccum(times int) {
    // Add VotingPower * times to each validator and order into heap.
    validatorsHeap := cmn.NewHeap()
    for _, val := range valSet.Validators {
        // check for overflow both multiplication and sum
        val.Accum = safeAddClip(val.Accum, safeMulClip(val.VotingPower, int64(times)))
        validatorsHeap.PushComparable(val, accumComparable{val})
    }

    // Decrement the validator with most accum times times
    for i := 0; i < times; i++ {
        mostest := validatorsHeap.Peek().(*Validator)
        // mind underflow
        mostest.Accum = safeSubClip(mostest.Accum, valSet.TotalVotingPower())

        if i == times-1 {
            valSet.Proposer = mostest
        } else {
            validatorsHeap.Update(mostest, accumComparable{mostest})
        }
    }
}

in validator_set.go (0.22.4 tag), but according to melekes, negative accum is no issue (while we have no clue what it actually means).
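To see why negative accum values are expected, here is a hedged, simplified illustration of the rotation (not the real ValidatorSet code) for four validators with voting power 10 each: every round, every validator gains its voting power, and whoever has the highest accum proposes and is charged the total power (40). The intermediate state of three validators at -10 and one at +30 matches the dump_consensus_state excerpt further down.

// Simplified illustration of accum rotation, not the actual Tendermint code.
package main

import "fmt"

type val struct {
	name  string
	accum int64
}

func main() {
	vals := []val{{"A", 0}, {"B", 0}, {"C", 0}, {"D", 0}}
	const power, total int64 = 10, 40 // equal voting power, 4 validators

	for round := 1; round <= 4; round++ {
		// every validator gains its voting power
		for i := range vals {
			vals[i].accum += power
		}
		// the validator with the highest accum proposes this round...
		p := 0
		for i := range vals {
			if vals[i].accum > vals[p].accum {
				p = i
			}
		}
		// ...and is charged the total voting power, which is why
		// negative accum values are perfectly normal
		vals[p].accum -= total
		fmt.Printf("round %d: proposer=%s accums=%v\n", round, vals[p].name, vals)
	}
}

After four rounds every validator has proposed exactly once and all accums return to zero, which is the round-robin behaviour the proposer-selection spec describes.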

@melekes as discussed, here the text from riot and the logfiles

  • Redacted most of our internal transactions
  • cut everything until first appearance of height=100
  • stuck at 102->103
  • contains dump_consensus of all 4 nodes
  • contains genesis and config.toml of all 4 nodes
  • all 4 nodes are validators

stuck.zip

Note: here's a selection from the wall of text on Riot.

srmo
what is the meaning of "accum" on a Validator? What does it mean, if it is negative?
Asking in the light of https://github.com/tendermint/tendermint/issues/1961 and looking at validator_set.go mostest.Accum = safeSubClip(mostest.Accum, valSet.TotalVotingPower())
so for example a validator with accum 20 and totalVotingPower 40 gets -20 as accum.

melekes (@melekes:matrix.org)
it's used for proposer selection
https://github.com/tendermint/tendermint/blob/develop/docs/spec/reactors/consensus/proposer-selection.md
negative is fine

srmo
Hm... still not really understanding how accum plays into this, especially when seeing multiple validators with negative accum. Currently just desperate to understand why the proposer gets stuck, given that after restarting the proposer the chain is able to progress.

Zaki
The validator with the max accum in each round is the proposer.

srmo
Hm, this seems to be wrong, as often we find that a proposer with negative accum is chosen

@Zaki
While at least one other validator has a positive accum

srmo

        "validators": [
          {
            "address": "230B52B0BE5A7F0A6F5AAD59876FA500729AAB2D",
            "pub_key": {
              "type": "tendermint/PubKeyEd25519",
              "value": "fggKTpLEA7aK9Nt0EGTwLj5R7AvVNZOWfadkWP1tMQo="
            },
            "voting_power": "10",
            "accum": "-10"
          },
          {
            "address": "723E9664EB47538941DA80C1AADC7786FC48B392",
            "pub_key": {
              "type": "tendermint/PubKeyEd25519",
              "value": "TvxexDbck6A3Ds2dLxuI2XlCigBUJuTZQDaSkElX37c="
            },
            "voting_power": "10",
            "accum": "-10"
          },
          {
            "address": "A5A7F904112FFEDC145FBFE4EE99132DB07A62A0",
            "pub_key": {
              "type": "tendermint/PubKeyEd25519",
              "value": "QWecDY9BABcdmOXbAL3nXJECabxMdwrSaLcGmYcPkiQ="
            },
            "voting_power": "10",
            "accum": "-10"
          },
          {
            "address": "EBD14A46C34874F9B7824168ECEB49806460627E",
            "pub_key": {
              "type": "tendermint/PubKeyEd25519",
              "value": "MeWPgzs6xN4JI9/dmWPWjpmfK4TcagYMwCyPdtOISEg="
            },
            "voting_power": "10",
            "accum": "30"
          }
        ],
        "proposer": {
          "address": "A5A7F904112FFEDC145FBFE4EE99132DB07A62A0",
          "pub_key": {
            "type": "tendermint/PubKeyEd25519",
            "value": "QWecDY9BABcdmOXbAL3nXJECabxMdwrSaLcGmYcPkiQ="
          },
          "voting_power": "10",
          "accum": "-10"
        }
      },

this is a state where we now have unconfirmed_txs piling up and the proposer sitting there
restarting the proposer makes the chain go round again, until the same happens for a different proposer
[...]

melekes (@melekes:matrix.org)
[...]
not sure that I follow. srmo do you suspect we have a bug in our proposer selection algorithm?

srmo
I'm just throwing out things that look strange to me. Mostly because I lack a deeper understanding of what's going on.
Might be totally misguided, especially after reading "The validator max accum in each round is the proposer", which seems to be wrong according to the dump_consensus. We just see that it is stuck in the propose round (?), but that means a proposer has been elected. Sometimes it works when we restart the proposing node (so probably a different proposer is elected).
unconfirmed_txs are processed.

[... REMOVED: a note about restarting the affected proposer doesn't help, as this might have been an oversight. Will be retested]

For what it's worth, all with empty_blocks=false.
I'll try and see if our app is able to handle empty block creation and retest with that too ... -.-

Additionally, we introduced compression for our transactions, so they are no longer 10k+ (or even up to 80k) but a more reasonable 4k-8k, with possible spikes at 10k.

Any news on this?

I was helping @xla investigate this yesterday.
We ran the following things on the docker localnet-start default testnet, except with create_empty_blocks = false, on both master and develop, and observed the same behavior.

There were 3 separate bugs observed; only the 2nd bug is relevant to this issue.

  1. On doing tm-bench -v -T 10 192.167.10.2:26657, more than 10 blocks were created, implying that something is off with how it's waiting to build blocks.
  2. Running
tm-bench -v -r 1000 -T 40 -s 80000 192.167.10.3:26657
<verbose output snipped>
D[07-19|17:20:30.634] 54 txs at block height 11                    
D[07-19|17:20:30.634] 15 txs at block height 10                    
D[07-19|17:20:30.634] 2 txs at block height 9                      
Stats          Avg       StdDev     Max     Total     
Txs/sec        2         9          54      71        
Blocks/sec     0.075     0.263      1       3 

caused the chain to halt. Each node would produce a ton of the following errors:

`node1    | I[07-19|17:11:22.361] Internal msg queue is full. Using a go-routine module=consensus` 

After a bout of those, there would then be a ton of these errors:

node2    | I[07-19|17:24:48.336] Received a block part when we're not expecting any module=consensus height=12 round=15 index=741 peer=b61582b8614b481743681f0d93d4dabefe529c5a

with the index number increasing between each message.

After the received block part messages ended, the round number would increment and the process would repeat. The number of unconfirmed transactions did not decrease. Flushing the mempool on the proposer (see the sketch after this list) made blocks begin to be produced again.

  3. Ending the cluster (via a single CTRL+C) and restarting it would cause no blocks to be produced, due to a round regression error on each node. (Implies we're not writing the round correctly somewhere; it was starting from round 0.)
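One way to flush the proposer's mempool, as mentioned in point 2, is via the node's RPC. A minimal sketch, assuming the node exposes the unsafe RPC routes (unsafe_flush_mempool is only registered when those routes are enabled) and using the node address from the tm-bench run above:

// Hedged sketch: flush a node's mempool over RPC (requires unsafe RPC routes).
package main

import (
	"fmt"
	"net/http"
)

func main() {
	// node address is a placeholder taken from the tm-bench invocation above
	resp, err := http.Get("http://192.167.10.3:26657/unsafe_flush_mempool")
	if err != nil {
		fmt.Println("flush failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("flush status:", resp.Status)
}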

We were able to reproduce this with a one-node setup. I'm currently investigating.
Usually, if you have a steady flow of transactions you'll see
COMMIT
RECHECK
INDEXED
ADDED (good transaction)

One case where we got stuck was when the good tx was added to the mempool before the commit.
Not sure if this is just misleading but I'm currently having a look at the logic surrounding the txsAvailable channel + notifiedTxsAvailable boolean and when they are set/read.

I'll see that I get debug logs to you for this single-node setup.
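For readers following along, here is a stripped-down sketch of the pattern under discussion. It is an illustration of the suspected ordering problem, not the actual Tendermint mempool code: a good tx triggers a one-shot notification on the txsAvailable channel, and the post-commit update re-arms it; if the tx arrives while the commit for the block currently being executed has not yet bumped the mempool height, the notification carries a height that consensus is about to move past.

// Illustration of the txsAvailable/notifiedTxsAvailable ordering being
// discussed above; not the actual Tendermint mempool code.
package main

import (
	"fmt"
	"sync"
	"time"
)

// stripped-down stand-in for the mempool fields under discussion
type mempool struct {
	mtx                  sync.Mutex
	height               int64
	notifiedTxsAvailable bool
	txsAvailable         chan int64 // consensus reads heights from here
}

// addTx mimics CheckTx accepting a transaction and notifying consensus once.
func (m *mempool) addTx() {
	m.mtx.Lock()
	defer m.mtx.Unlock()
	if !m.notifiedTxsAvailable {
		m.notifiedTxsAvailable = true
		// the height captured here is the interesting part: if the commit
		// for the block currently being executed has not yet updated
		// m.height, the notification carries a height consensus is about
		// to move past
		select {
		case m.txsAvailable <- m.height + 1:
		default:
		}
	}
}

// update mimics the post-commit update: bump the height and re-arm the notification.
func (m *mempool) update(newHeight int64) {
	m.mtx.Lock()
	defer m.mtx.Unlock()
	m.height = newHeight
	m.notifiedTxsAvailable = false
}

func main() {
	m := &mempool{height: 156, txsAvailable: make(chan int64, 1)}
	m.addTx()     // tx arrives while block 157 is being committed (as in the log below)
	m.update(157) // commit finishes afterwards
	select {
	case h := <-m.txsAvailable:
		// prints 157, although consensus is already entering height 158
		fmt.Println("consensus notified for height", h)
	case <-time.After(time.Second):
		fmt.Println("no notification")
	}
}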

Oh yeah. Look at that:
I've added some logging around the channel
Here while it's working:

I[07-20|17:06:08.949] Executed block                               module=state height=155 validTxs=1 invalidTxs=0
I[07-20|17:06:08.950] Committed state                              module=state height=155 txs=1 appHash=E11101D0ED035BE392DD8AEB89FF320F8ADC0B8E
I[07-20|17:06:08.950] Recheck txs                                  module=mempool numtxs=0 height=155
I[07-20|17:06:08.951] Indexed block                                module=txindex height=155
I[07-20|17:06:09.043] WSJSONRPC                                    module=rpc-server protocol=websocket remote=127.0.0.1:65141 method=broadcast_tx_sync
I[07-20|17:06:09.043] Added good transaction                       module=mempool tx=6AD2D37DC1D130DC4F8B2FDA921841067D287F2A res="&{CheckTx:fee:<value:0 > }" total=1
I[07-20|17:06:09.043] writing txsAvailable with height             module=mempool height=156
I[07-20|17:06:09.043] tx available channel for height              module=consensus height=156
I[07-20|17:06:09.043] enterPropose(156/0). Current: 156/0/RoundStepNewHeight module=consensus height=156 round=0

You can clearly see that the last height was finally committed, so the added tx was notified with the next height.

Now when it all stopped:

I[07-20|17:06:10.065] Executed block                               module=state height=157 validTxs=0 invalidTxs=0
I[07-20|17:06:10.065] Added good transaction                       module=mempool tx=99216F6CD1E273D3A18AA74D4360C383920FBA2F res="&{CheckTx:fee:<value:0 > }" total=1
I[07-20|17:06:10.065] writing txsAvailable with height             module=mempool height=157
I[07-20|17:06:10.065] WSJSONRPC                                    module=rpc-server protocol=websocket remote=127.0.0.1:65141 method=broadcast_tx_sync
I[07-20|17:06:10.066] Committed state                              module=state height=157 txs=0 appHash=7EB63D58748B07C861B67D87D384830F9BF6B971
I[07-20|17:06:10.066] Recheck txs                                  module=mempool numtxs=1 height=157
I[07-20|17:06:10.067] Done rechecking txs                          module=mempool 
I[07-20|17:06:10.067] Indexed block                                module=txindex height=157
I[07-20|17:06:10.067] tx available channel for height              module=consensus height=157
I[07-20|17:06:11.067] Timed out                                    module=consensus dur=995.766892ms height=158 round=0 step=RoundStepNewHeight
I[07-20|17:06:11.067] enterNewRound(158/0). Current: 158/0/RoundStepNewHeight module=consensus height=158 round=0
I[07-20|17:06:15.959] Ensure peers                                 module=p2p numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[07-20|17:06:15.959] No addresses to dial nor connected peers. Falling back to seeds module=p2p 
I[07-20|17:06:16.221] WSJSONRPC                                    module=rpc-server protocol=websocket remote=127.0.0.1:65141 method=status

Here you see that it was notified with the height that was currently in the process of being committed.

Strangely, re-check TX doesn't seem to have any logic to trigger a new proposal.
This is the one-node test case.
Still need to grab debug logs.

Here:
https://gist.github.com/srmo/80ad1dc716e6c0273fe2c25b7e026088

Happened again, with the same scenario.
You will find
tx available channel for height - printed when case height := <-cs.mempool.TxsAvailable(): was triggered
and writing txsAvailable with height - printed when case mem.txsAvailable <- mem.height + 1: was triggered
in the log; both lines were added by me for debugging purposes.

This was all done at commitish c64a3c74
Will retest with develop

Sidenote: wasn't aware, but in debug we see: D[07-20|17:12:46.288] enterPropose(229/0): Invalid args. Current step: 230/0/RoundStepNewHeight module=consensus height=229 round=0

PS: the same is happening on the latest develop.

I'm currently validating https://github.com/tendermint/tendermint/pull/2021 against our 4-validator-node production test environment and it seems this is now "stable".
Would love it if you guys could help push this PR ^^

OK, I've run our 4-validator-node setup with the fixed version (backported locally to 0.22.4) and it is as stable as can be. We are going to run harder stress tests, but it looks like we can close this issue shortly.
Thanks for the merge!

Awesome

Merged to develop. Will be shipped with 0.22.5 or next breaking release.

Would be great if we get this into 0.22.5 - if not, we can stay on my fork, but we're also looking forward to your other changes/fixes.

Wow, thanks a lot for digging down and fixing this.

My pleasure :)

This issue can be closed. Looks like it's fixed so far.

