BUG REPORT
Tendermint version
0.22.0-fa05b039
ABCI app
Custom Internal
Environment:
What happened:
After a non-deterministic period of time, transactions that were successfully validated by CheckTx are no longer gossiped or removed from the mempool, and the node can't be used anymore.
What you expected to happen:
After a tx is validated by CheckTx, it should be gossiped to other nodes and the proposer node should call BeginBlock to include it in a block.
How to reproduce it:
We can reproduce this consistently with the setup described above, though the time it takes to occur varies.
Config:
empty blocks=false
p2p-params as described in "running in production"
/dump_consensus_state output for consensus bugs
consensus node 1:
https://gist.github.com/yuomii/5acbd97d2935b7d1dd61d98e2ce5adb8
consensus node 2:
https://gist.github.com/yuomii/ab11b2914753080b39cd3923b07f5d6d
consensus node 3:
https://gist.github.com/yuomii/23f74831f9031b213186c00d83fd35eb
consensus node 4:
https://gist.github.com/yuomii/215862af5e899092987b41852e13018c
Logs:
Full Logs (log level debug):
node 1:
https://gist.github.com/yuomii/918ed5271a909613b37649f1b8d152f9
node 2:
https://gist.github.com/yuomii/d80a783cdb088039e1cc78689917c697
node 3:
https://gist.github.com/yuomii/96b0e482d697d46e27648d3e73c36ae8
node 4:
https://gist.github.com/yuomii/2082b674c996386499144c819785a164
Seems to be related to #1875
EDIT:
dump_consensus_state was called 2h later. After ~2h we noticed that blocks were being built again.
We can reproduce this consistently (the time it takes varies) with the setup described above.
Can you clarify with some details around how many txs you are sending and to which nodes and at what rate?
Also it seems some of the logs are out of order? Height and time flow backwards at points. Is this raw output from Tendermint?
This is produced by a single test case setup in an infinite loop. The test has different payloads: three very small ones and two big ones of EDIT: ~80KB. The rate is around 1 tx per 2 seconds. I can't tell you exactly which payload is sent to which node, because our application checks whether the node we are currently broadcasting to is in syncing state; if that's the case, we switch the node we broadcast to on the fly.
This is raw output from Tendermint.
I replaced (find/replace) the real node IPs in the log files because of NDA constraints. If that's a problem for you, we need to find a way to get the log files to you over a private channel.
I rechecked the logs and picked the biggest tx. The big ones are actually up to ~80KB.
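For context, here is a rough sketch of the broadcast pattern described above (check whether the target node is still syncing before sending, and switch nodes on the fly). The /status and /broadcast_tx_sync routes are standard Tendermint RPC endpoints; the response field names (sync_info.catching_up) and all other names below are illustrative assumptions about a client, not code taken from this issue:

package client

import (
    "encoding/hex"
    "encoding/json"
    "fmt"
    "net/http"
)

// statusResp holds the part of the /status response we care about; the exact
// field names may differ slightly between Tendermint versions.
type statusResp struct {
    Result struct {
        SyncInfo struct {
            CatchingUp bool `json:"catching_up"`
        } `json:"sync_info"`
    } `json:"result"`
}

// isSyncing asks the node's /status endpoint whether it is still catching up.
func isSyncing(rpcAddr string) (bool, error) {
    resp, err := http.Get(rpcAddr + "/status")
    if err != nil {
        return false, err
    }
    defer resp.Body.Close()
    var s statusResp
    if err := json.NewDecoder(resp.Body).Decode(&s); err != nil {
        return false, err
    }
    return s.Result.SyncInfo.CatchingUp, nil
}

// broadcast sends the tx via broadcast_tx_sync to the first node that is not
// in syncing state, switching to the next node otherwise.
func broadcast(nodes []string, tx []byte) error {
    for _, n := range nodes {
        if syncing, err := isSyncing(n); err != nil || syncing {
            continue // node unreachable or still syncing: try the next one
        }
        _, err := http.Get(fmt.Sprintf("%s/broadcast_tx_sync?tx=0x%s", n, hex.EncodeToString(tx)))
        return err
    }
    return fmt.Errorf("no non-syncing node available")
}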
any updates on this?
Currently retesting with lower tx size:
Proposer stuck in step 2
Other nodes stuck in step 1
4 Nodes.
4 Validators.
Debug logs will follow. I had commented on https://github.com/tendermint/tendermint/issues/1961 but am deleting the comments from there, as we don't see the "insufficient voting power" message.
When we restart the stuck proposer, the chain progresses and processes the unconfirmed txs, until it gets stuck again at proposal.
Currently staring at
// IncrementAccum increments accum of each validator and updates the
// proposer. Panics if validator set is empty.
func (valSet *ValidatorSet) IncrementAccum(times int) {
    // Add VotingPower * times to each validator and order into heap.
    validatorsHeap := cmn.NewHeap()
    for _, val := range valSet.Validators {
        // check for overflow both multiplication and sum
        val.Accum = safeAddClip(val.Accum, safeMulClip(val.VotingPower, int64(times)))
        validatorsHeap.PushComparable(val, accumComparable{val})
    }
    // Decrement the validator with most accum times times
    for i := 0; i < times; i++ {
        mostest := validatorsHeap.Peek().(*Validator)
        // mind underflow
        mostest.Accum = safeSubClip(mostest.Accum, valSet.TotalVotingPower())
        if i == times-1 {
            valSet.Proposer = mostest
        } else {
            validatorsHeap.Update(mostest, accumComparable{mostest})
        }
    }
}
in validator_set.go (0.22.4 tag), but according to melekes, negative accum is not an issue (though we have no clue what it actually represents).
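A minimal worked example of that arithmetic, assuming four validators with voting power 10 each (total 40), matching the dump_consensus_state excerpt further down. The helper below is a simplified stand-in for IncrementAccum(1), not the real ValidatorSet code:

package main

import "fmt"

// incrementAccum mimics one IncrementAccum(1) step for validators with equal
// voting power: add the voting power to every accum, pick the validator with
// the highest accum as proposer, then subtract the total voting power from it.
func incrementAccum(accums []int64, power, total int64) int {
    for i := range accums {
        accums[i] += power
    }
    proposer := 0
    for i, a := range accums {
        if a > accums[proposer] {
            proposer = i
        }
    }
    accums[proposer] -= total
    return proposer
}

func main() {
    accums := []int64{0, 0, 0, 0} // four validators, power 10 each, total 40
    for round := 1; round <= 4; round++ {
        p := incrementAccum(accums, 10, 40)
        fmt.Printf("round %d: proposer=%d accums=%v\n", round, p, accums)
    }
    // Output:
    // round 1: proposer=0 accums=[-30 10 10 10]
    // round 2: proposer=1 accums=[-20 -20 20 20]
    // round 3: proposer=2 accums=[-10 -10 -10 30]  <- same shape as the dump below
    // round 4: proposer=3 accums=[0 0 0 0]
}

So with equal voting power the proposer simply rotates, and the freshly selected proposer routinely ends up with a negative accum; a negative accum on the proposer is therefore not an anomaly by itself.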
@melekes as discussed, here the text from riot and the logfiles
Note: here's a selection from the textwall of riot
srmo
what is the meaning of "accum" on a Validator? What does it mean, if it is negative?
Asking in the light of https://github.com/tendermint/tendermint/issues/1961 and looking at validator_set.go mostest.Accum = safeSubClip(mostest.Accum, valSet.TotalVotingPower())
so for example a validator with accum 20 and totalVotingPower 40 gets -20 as accum.
melekes (@melekes:matrix.org)
it's used for proposer selection
https://github.com/tendermint/tendermint/blob/develop/docs/spec/reactors/consensus/proposer-selection.md
negative is fine
srmo
Hm... still not really understanding how accum plays into this, especially when seeing multiple validators with negative accum. Currently just desperate to understand why the proposer gets stuck, although after restarting the proposer the chain is able to progress.
Zaki
The validator max accum in each round is the proposer
srmo
Hm, this seems to be wrong, as often we find that a proposer with negative accum is chosen
@Zaki
While at least one other validator has a positive accum
srmo
"validators": [
{
"address": "230B52B0BE5A7F0A6F5AAD59876FA500729AAB2D",
"pub_key": {
"type": "tendermint/PubKeyEd25519",
"value": "fggKTpLEA7aK9Nt0EGTwLj5R7AvVNZOWfadkWP1tMQo="
},
"voting_power": "10",
"accum": "-10"
},
{
"address": "723E9664EB47538941DA80C1AADC7786FC48B392",
"pub_key": {
"type": "tendermint/PubKeyEd25519",
"value": "TvxexDbck6A3Ds2dLxuI2XlCigBUJuTZQDaSkElX37c="
},
"voting_power": "10",
"accum": "-10"
},
{
"address": "A5A7F904112FFEDC145FBFE4EE99132DB07A62A0",
"pub_key": {
"type": "tendermint/PubKeyEd25519",
"value": "QWecDY9BABcdmOXbAL3nXJECabxMdwrSaLcGmYcPkiQ="
},
"voting_power": "10",
"accum": "-10"
},
{
"address": "EBD14A46C34874F9B7824168ECEB49806460627E",
"pub_key": {
"type": "tendermint/PubKeyEd25519",
"value": "MeWPgzs6xN4JI9/dmWPWjpmfK4TcagYMwCyPdtOISEg="
},
"voting_power": "10",
"accum": "30"
}
],
"proposer": {
"address": "A5A7F904112FFEDC145FBFE4EE99132DB07A62A0",
"pub_key": {
"type": "tendermint/PubKeyEd25519",
"value": "QWecDY9BABcdmOXbAL3nXJECabxMdwrSaLcGmYcPkiQ="
},
"voting_power": "10",
"accum": "-10"
}
},
This is a state where we now have unconfirmed_txs piling up and the proposer just sitting there.
Restarting the proposer makes the chain go round again, until the same happens for a different proposer.
[...]
melekes (@melekes:matrix.org)
[...]
not sure that I follow. srmo do you suspect we have a bug in our proposer selection algorithm?
srmo
I'm just throwing out things that look strange to me. Mostly because I lack a deeper understanding of what's going on.
Might be totally misguided, especially after reading "The validator max accum in each round is the proposer", which seems to be wrong according to the dump_consensus output. We just see that it is stuck in the proposal round (?), but that means a proposer has been elected. Sometimes it works when we restart the proposing node (so probably a different proposer is elected) and the unconfirmed_txs are processed.
[... REMOVED: a note about restarting the affected proposer doesn't help, as this might have been an oversight. Will be retested]
For what it's worth, all with empty_blocks=false.
I'll try and see if our app is able to handle empty block creation and retest with that too ... -.-
Additionally, we introduced compression for our transactions, so they are no longer 10k+ (or even up to 80k) but a more reasonable 4k-8k with possible spikes at 10k.
Any news on this?
I was helping @xla investigate this yesterday.
We ran the following things on the docker localnet-start default testnet, except with create_empty_blocks = false, on both master and develop, and observed the same behavior.
There were 3 separate bugs observed; only the 2nd bug is relevant to this issue.
With tm-bench -v -T 10 192.167.10.2:26657, more than 10 blocks were created, implying that something is off with how it waits to build blocks.
Running tm-bench -v -r 1000 -T 40 -s 80000 192.167.10.3:26657
<verbose output snipped>
D[07-19|17:20:30.634] 54 txs at block height 11
D[07-19|17:20:30.634] 15 txs at block height 10
D[07-19|17:20:30.634] 2 txs at block height 9
Stats       Avg     StdDev  Max   Total
Txs/sec     2       9       54    71
Blocks/sec  0.075   0.263   1     3
caused the chain to halt. Each node would produce a ton of the following errors:
`node1 | I[07-19|17:11:22.361] Internal msg queue is full. Using a go-routine module=consensus`
After a bout of those, there would then be a ton of these errors:
node2 | I[07-19|17:24:48.336] Received a block part when we're not expecting any module=consensus height=12 round=15 index=741 peer=b61582b8614b481743681f0d93d4dabefe529c5a
with the index number increasing between each message.
After the received block part messages ended, the round number would increment and the process would repeat. The number of unconfirmed transactions did not decrease. Flushing the mempool on the proposer made blocks begin to be produced again.
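For reference, the mempool flush mentioned above can be triggered through the unsafe_flush_mempool RPC route, which is only exposed when the node has unsafe RPC commands enabled in its config. A minimal sketch, assuming a node listening on 127.0.0.1:26657 with unsafe routes enabled:

package main

import (
    "fmt"
    "net/http"
)

func main() {
    // unsafe_flush_mempool clears all txs from the node's mempool; it only
    // works on nodes that have unsafe RPC routes enabled.
    resp, err := http.Get("http://127.0.0.1:26657/unsafe_flush_mempool")
    if err != nil {
        fmt.Println("flush failed:", err)
        return
    }
    defer resp.Body.Close()
    fmt.Println("flush status:", resp.Status)
}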
We were able to reproduce this with a one-node setup. I'm currently investigating.
Usually, if you have a steady flow of transactions you'll see
COMMIT
RECHECK
INDEXED
ADDED (good transaction)
One case where we got stuck was when the good tx was added to the mempool before the commit.
Not sure if this is just misleading but I'm currently having a look at the logic surrounding the txsAvailable channel + notifiedTxsAvailable boolean and when they are set/read.
I'll make sure to get debug logs to you for this single-node setup.
Oh yeah. Look at that:
I've added some logging around the channel
Here while it's working:
I[07-20|17:06:08.949] Executed block module=state height=155 validTxs=1 invalidTxs=0
I[07-20|17:06:08.950] Committed state module=state height=155 txs=1 appHash=E11101D0ED035BE392DD8AEB89FF320F8ADC0B8E
I[07-20|17:06:08.950] Recheck txs module=mempool numtxs=0 height=155
I[07-20|17:06:08.951] Indexed block module=txindex height=155
I[07-20|17:06:09.043] WSJSONRPC module=rpc-server protocol=websocket remote=127.0.0.1:65141 method=broadcast_tx_sync
I[07-20|17:06:09.043] Added good transaction module=mempool tx=6AD2D37DC1D130DC4F8B2FDA921841067D287F2A res="&{CheckTx:fee:<value:0 > }" total=1
I[07-20|17:06:09.043] writing txsAvailable with height module=mempool height=156
I[07-20|17:06:09.043] tx available channel for height module=consensus height=156
I[07-20|17:06:09.043] enterPropose(156/0). Current: 156/0/RoundStepNewHeight module=consensus height=156 round=0
You can clearly see that the last height was finally committed, so the added tx was notified with the next height.
Now when it all stopped:
I[07-20|17:06:10.065] Executed block module=state height=157 validTxs=0 invalidTxs=0
I[07-20|17:06:10.065] Added good transaction module=mempool tx=99216F6CD1E273D3A18AA74D4360C383920FBA2F res="&{CheckTx:fee:<value:0 > }" total=1
I[07-20|17:06:10.065] writing txsAvailable with height module=mempool height=157
I[07-20|17:06:10.065] WSJSONRPC module=rpc-server protocol=websocket remote=127.0.0.1:65141 method=broadcast_tx_sync
I[07-20|17:06:10.066] Committed state module=state height=157 txs=0 appHash=7EB63D58748B07C861B67D87D384830F9BF6B971
I[07-20|17:06:10.066] Recheck txs module=mempool numtxs=1 height=157
I[07-20|17:06:10.067] Done rechecking txs module=mempool
I[07-20|17:06:10.067] Indexed block module=txindex height=157
I[07-20|17:06:10.067] tx available channel for height module=consensus height=157
I[07-20|17:06:11.067] Timed out module=consensus dur=995.766892ms height=158 round=0 step=RoundStepNewHeight
I[07-20|17:06:11.067] enterNewRound(158/0). Current: 158/0/RoundStepNewHeight module=consensus height=158 round=0
I[07-20|17:06:15.959] Ensure peers module=p2p numOutPeers=0 numInPeers=0 numDialing=0 numToDial=10
I[07-20|17:06:15.959] No addresses to dial nor connected peers. Falling back to seeds module=p2p
I[07-20|17:06:16.221] WSJSONRPC module=rpc-server protocol=websocket remote=127.0.0.1:65141 method=status
Here you see that it was notified with the height that was currently in the process of being committed.
Strangely, re-check TX doesn't seem to have any logic to trigger a new proposal.
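To make the suspected race more concrete, here is a simplified, hypothetical model of the notification path. The channel send case mem.txsAvailable <- mem.height + 1 and the consensus-side read case height := <-cs.mempool.TxsAvailable(): are the real hooks quoted a few comments below; the types, fields, and control flow here are illustrative assumptions, not the actual Tendermint code:

package mempoolrace

// The suspected race, modeled minimally: if a tx is added (and the
// notification sent) while the block for mem.height+1 is still being
// committed, the channel ends up holding the height that is currently being
// committed. Consensus then moves on to the next height, never receives a
// notification for it, and with create_empty_blocks=false nothing else
// triggers a new proposal.

type mempool struct {
    height               int64      // last height the mempool was updated for
    notifiedTxsAvailable bool       // set after the first notification per height
    txsAvailable         chan int64 // read by the consensus state machine
}

// addTx models "Added good transaction": a tx passed CheckTx, so notify the
// consensus reactor that txs are available for what we believe is the next height.
func (mem *mempool) addTx() {
    if !mem.notifiedTxsAvailable {
        select {
        case mem.txsAvailable <- mem.height + 1: // may already be the committing height
        default:
        }
        mem.notifiedTxsAvailable = true
    }
}

// update models the mempool update that runs while a block is committed: the
// height is advanced and the notification flag reset. The recheck of leftover
// txs that follows does not send a fresh notification for the new height,
// which matches the "re-check TX doesn't trigger a new proposal" observation.
func (mem *mempool) update(committedHeight int64) {
    mem.height = committedHeight
    mem.notifiedTxsAvailable = false
}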
This is the one-node test case.
Still need to grab debug logs.
Here:
https://gist.github.com/srmo/80ad1dc716e6c0273fe2c25b7e026088
Happened again, with the same scenario.
You will find
tx available channel for height - printed when case height := <-cs.mempool.TxsAvailable(): was triggered
and writing txsAvailable with height - printed when case mem.txsAvailable <- mem.height + 1: was triggered
in the log; both log lines were added by me for debugging purposes.
This was all done at commitish c64a3c74
Will retest with develop
Sidenote: wasn't aware, but in debug we see: D[07-20|17:12:46.288] enterPropose(229/0): Invalid args. Current step: 230/0/RoundStepNewHeight module=consensus height=229 round=0
PS: same happening in latest develop
I'm currently validating https://github.com/tendermint/tendermint/pull/2021 against our 4-validator production test environment, and it seems this is now "stable".
Would love it if you guys could help push this PR forward ^^
Ok, I've run our 4-validator setup with the fixed version (backported locally to 0.22.4) and it is as stable as can be. We are going to run harder stress tests, but it looks like we can close this issue shortly.
Thanks for the merge!
Awesome
Merged to develop. Will be shipped with 0.22.5 or next breaking release.
Would be great if we get this in 0.22.5 - if not, we can stay on my fork but also looking forward to your other changes/fixes.
Wow, thanks a lot for digging down and fixing this.
My pleasure :)
This issue can be closed. Looks like it's fixed so far.