Stopping parity sometimes results in the hang described here: https://github.com/paritytech/parity-ethereum/issues/9101#issuecomment-454746413
At other times, even with shutdown tracing turned on, the process exits immediately and nothing about a shutdown is logged at all.
So we have three outcomes when stopping parity: a normal clean shutdown, the hang described in the linked issue, and an immediate exit with no shutdown logging.
How can we debug this further? Here is an example of the hang with shutdown tracing enabled:
2019-02-15 12:25:03 Verifier #1 INFO import Imported #7223288 0xe31c…3dcf (84 txs, 7.99 Mgas, 675 ms, 14.60 KiB)
2019-02-15 12:25:08 IO Worker #3 INFO import 5/ 5 peers 18 MiB chain 115 MiB db 0 bytes queue 43 KiB sync RPC: 0 conn, 0 req/s, 3244 µs
2019-02-15 12:25:23 Verifier #1 INFO import Imported #7223289 0x70d4…8380 (103 txs, 7.99 Mgas, 1599 ms, 16.89 KiB)
2019-02-15 12:25:38 IO Worker #2 INFO import 5/ 5 peers 18 MiB chain 115 MiB db 0 bytes queue 43 KiB sync RPC: 0 conn, 0 req/s, 3244 µs
2019-02-15 12:25:57 main INFO parity_ethereum::run Finishing work, please wait...
2019-02-15 12:25:57 main TRACE shutdown [IoService] Closing...
2019-02-15 12:25:57 TRACE shutdown [IoWorker] Closing...
2019-02-15 12:25:57 TRACE shutdown [IoWorker] Closed
2019-02-15 12:25:57 TRACE shutdown [IoWorker] Closing...
2019-02-15 12:25:57 TRACE shutdown [IoWorker] Closed
2019-02-15 12:25:57 TRACE shutdown [IoWorker] Closing...
2019-02-15 12:25:57 TRACE shutdown [IoWorker] Closed
2019-02-15 12:25:57 TRACE shutdown [IoWorker] Closing...
2019-02-15 12:25:57 TRACE shutdown [IoWorker] Closed
2019-02-15 12:25:57 main TRACE shutdown [IoService] Closed.
2019-02-15 12:26:57 main WARN parity_ethereum::run Shutdown is taking longer than expected.
2019-02-15 12:30:57 main WARN parity_ethereum::run Shutdown timeout reached, exiting uncleanly.
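For reference, the two WARN lines above fire at +60 s and +5 min after "Finishing work, please wait...". A minimal sketch of that kind of watchdog thread (hypothetical names and durations, not Parity's actual code) could look like this:

```rust
use std::process;
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical sketch of a shutdown watchdog matching the WARN lines above:
// complain after `warn_after`, then force-exit after a further `kill_after`.
// `done` is signalled by the main thread when the clean shutdown path finishes.
fn spawn_shutdown_watchdog(
    done: mpsc::Receiver<()>,
    warn_after: Duration,
    kill_after: Duration,
) -> thread::JoinHandle<()> {
    thread::spawn(move || {
        if done.recv_timeout(warn_after).is_ok() {
            return; // shutdown completed in time
        }
        eprintln!("WARN Shutdown is taking longer than expected.");
        if done.recv_timeout(kill_after).is_ok() {
            return; // slow, but it did finish
        }
        eprintln!("WARN Shutdown timeout reached, exiting uncleanly.");
        process::exit(102);
    })
}
```

In real use the shutdown path would send on the channel (or keep the sender alive until done) so the watchdog returns instead of force-exiting; dropping the sender early makes `recv_timeout` return `Err(Disconnected)` immediately, which this sketch treats the same as a timeout.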
People are still seeing this issue regularly in recent versions; unclean shutdowns are leading to many more reports of db corruption, so bumping priority here 😥
I can confirm this happens regularly on at least 4 parity archive instances we run with the following config:
--auto-update=none
--base-path=/paritydb
--mode=active
--tracing=on
--pruning=archive
--db-compaction=ssd
--scale-verifiers
--num-verifiers=6
--jsonrpc-server-threads=5
--jsonrpc-threads=5
--cache-size=22000
--min-peers=100
--max-peers=1000
--jsonrpc-hosts=all
--jsonrpc-interface=all
--ws-interface=all
--tx-queue-mem-limit=2048
--tx-queue-size=2000000
This is probably fixed by https://github.com/paritytech/parity-ethereum/pull/10689
There is probably one more of these bugs to root out. I've seen this happen with --chain kovan when the snapshotting service is running. With some extra logging added it looks like this:
2019-06-12 14:04:10 main TRACE shutdown [IoService] Closed.
2019-06-12 14:04:10 main TRACE shutdown ClientService dropped
2019-06-12 14:04:10 main TRACE shutdown RPC dropped
2019-06-12 14:04:10 main TRACE shutdown KeepAlive dropped
2019-06-12 14:04:10 main TRACE shutdown Informant shut down
2019-06-12 14:04:10 main TRACE shutdown Informant dropped
2019-06-12 14:04:10 main TRACE shutdown Client dropped
2019-06-12 14:04:10 main TRACE shutdown Waiting for refs to Client to shutdown, strong_count=19, weak_count=Some(13)
2019-06-12 14:04:10 jsonrpc-eventloop-1 TRACE shutdown [IoService] Closing...
2019-06-12 14:04:10 TRACE shutdown [IoWorker] Closing...
2019-06-12 14:04:10 TRACE shutdown [IoWorker] Closed
2019-06-12 14:04:10 TRACE shutdown [IoWorker] Closing...
2019-06-12 14:04:10 TRACE shutdown [IoWorker] Closed
2019-06-12 14:04:10 TRACE shutdown [IoWorker] Closing...
2019-06-12 14:04:10 TRACE shutdown [IoWorker] Closed
2019-06-12 14:04:10 TRACE shutdown [IoWorker] Closing...
2019-06-12 14:04:10 TRACE shutdown [IoWorker] Closed
2019-06-12 14:04:10 jsonrpc-eventloop-1 TRACE shutdown [IoService] Closed.
2019-06-12 14:04:11 main TRACE shutdown Waiting for client to drop, strong_count=2, weak_count=Some(5)
2019-06-12 14:04:12 main TRACE shutdown Waiting for client to drop, strong_count=2, weak_count=Some(5)
…
2019-06-12 14:05:10 main WARN parity_ethereum::run Shutdown is taking longer than expected.
…
2019-06-12 14:05:11 main TRACE shutdown Waiting for client to drop, strong_count=2, weak_count=Some(5)
2019-06-12 14:05:12 main TRACE shutdown Waiting for client to drop, strong_count=2, weak_count=Some(5)
2019-06-12 14:05:13 main TRACE shutdown Waiting for client to drop, strong_count=2, weak_count=Some(5)
…
2019-06-12 14:09:10 main WARN parity_ethereum::run Shutdown timeout reached, exiting uncleanly.
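The "Waiting for client to drop, strong_count=2" lines suggest shutdown polls the `Arc<Client>` reference count until every other holder releases its clone; if some thread (a snapshot worker, an RPC handler) never drops its clone, the count never reaches zero and the timeout fires. A minimal sketch of that polling pattern (hypothetical names, not Parity's actual code):

```rust
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

struct Client; // stand-in for parity's Client type

// Hypothetical sketch of the loop behind the "Waiting for client to drop"
// trace lines: downgrade our handle, release our own strong reference,
// then poll until every other strong reference is gone.
fn wait_for_drop(client: Arc<Client>, timeout: Duration) -> bool {
    let weak = Arc::downgrade(&client);
    drop(client); // release our own strong reference
    let start = Instant::now();
    while weak.upgrade().is_some() {
        if start.elapsed() > timeout {
            return false; // "Shutdown timeout reached, exiting uncleanly."
        }
        println!(
            "Waiting for client to drop, strong_count={}, weak_count={}",
            weak.strong_count(),
            weak.weak_count()
        );
        thread::sleep(Duration::from_millis(100));
    }
    true
}
```

A thread that parks while still owning an `Arc<Client>` clone is enough to pin `strong_count` at 2 forever, exactly as in the trace above.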
The problem is still present in the current stable release. When will it be merged?
@zet-tech It should be resolved in 2.4.8 and 2.5.3; the fix was merged into those releases. If you still have issues with shutdowns, the source of the problem may be different.
It is present in 2.4.8, and the issue is for sure related to RPC. When I bind only to localhost and there are no RPC calls, restarts work correctly. But when I bind parity to a remote IP and it gets requests from our other software (even one second of traffic is enough, which means 5-10 requests, only eth_getWork and eth_getBlockByNumber), a restart is not possible and the process has to be killed. This results in DB corruption much more often than it should (even once per 10 restarts), and we were forced to move to Geth in production due to this problem.
One thing worth noting about any of these shutdown problems is that different bugs can cause the same symptom. We recently fixed one instance where shutdown would fail while the node was taking a snapshot.
RPC usage causing deadlock during shutdown is quite possibly a distinct bug.
@zet-tech That sounds really bad. I have tried to reproduce the shutdown problem after RPC usage on the latest master and could not see a problem. I'd need your assistance to debug this further.
Thanks!
Can anyone confirm this issue has been fixed ?
As mentioned above, there are possibly several other causes with the same symptom. We have fixed a few but there might be others. FWIW we have not experienced, or afaik received reports of, shutdown issues for several months now.
So, after my server automatically restarted this weekend, I can confirm that the parity server restarts normally with pm2. No error.
Thx
I just installed 2.6.4. Problem still occurs, exactly as before.
Answering previous questions:
1. outdated
2. eth.txt (I removed the IP address because it is a public IP.)
3. By binding only to localhost, I meant that if there are no RPC calls to parity, the error does not occur. But even one RPC call means parity cannot be shut down.
2.5.10 solves our restart problem.
Sorry, forgot to mention that I didn't observe ungraceful shutdowns with v2.6.5, now running v2.6.6.
@zet-tech Don't forget to upgrade to at least v2.5.11 before Istanbul fork at the weekend: https://github.com/paritytech/parity-ethereum/releases/tag/v2.5.11
Already updated, but the bug was fixed in 2.5.10.