Parity-ethereum: parity ethereum client doesn't always shut down gracefully

Created on 15 Feb 2019  ·  17 comments  ·  Source: openethereum/parity-ethereum

  • Parity Ethereum version: 2.2.10-stable
  • Operating system: Linux
  • Installation: binary
  • Fully synchronized: yes
  • Network: ethereum
  • Restarted: yes

Sometimes stopping parity results in the issue described here: https://github.com/paritytech/parity-ethereum/issues/9101#issuecomment-454746413

Sometimes, when stopping parity, the process exits immediately and nothing about a shutdown is logged at all, even with shutdown tracing turned on.

So we have three possible outcomes when stopping parity:

  • clean shutdown (nothing being logged at all, almost instant)
  • clean shutdown (shutdown being logged, taking 1-10 seconds)
  • unclean shutdown (_Shutdown is taking longer than expected / Shutdown timeout reached, exiting uncleanly_)

How can this be debugged further?

F2-bug 🐞 M4-core ⛓ P5-sometimesoon 🌲

All 17 comments

2019-02-15 12:25:03  Verifier #1 INFO import  Imported #7223288 0xe31c…3dcf (84 txs, 7.99 Mgas, 675 ms, 14.60 KiB)
2019-02-15 12:25:08  IO Worker #3 INFO import     5/ 5 peers     18 MiB chain  115 MiB db  0 bytes queue   43 KiB sync  RPC:  0 conn,    0 req/s, 3244 µs
2019-02-15 12:25:23  Verifier #1 INFO import  Imported #7223289 0x70d4…8380 (103 txs, 7.99 Mgas, 1599 ms, 16.89 KiB)
2019-02-15 12:25:38  IO Worker #2 INFO import     5/ 5 peers     18 MiB chain  115 MiB db  0 bytes queue   43 KiB sync  RPC:  0 conn,    0 req/s, 3244 µs
2019-02-15 12:25:57  main INFO parity_ethereum::run  Finishing work, please wait...
2019-02-15 12:25:57  main TRACE shutdown  [IoService] Closing...
2019-02-15 12:25:57   TRACE shutdown  [IoWorker] Closing...
2019-02-15 12:25:57   TRACE shutdown  [IoWorker] Closed
2019-02-15 12:25:57   TRACE shutdown  [IoWorker] Closing...
2019-02-15 12:25:57   TRACE shutdown  [IoWorker] Closed
2019-02-15 12:25:57   TRACE shutdown  [IoWorker] Closing...
2019-02-15 12:25:57   TRACE shutdown  [IoWorker] Closed
2019-02-15 12:25:57   TRACE shutdown  [IoWorker] Closing...
2019-02-15 12:25:57   TRACE shutdown  [IoWorker] Closed
2019-02-15 12:25:57  main TRACE shutdown  [IoService] Closed.
2019-02-15 12:26:57  main WARN parity_ethereum::run  Shutdown is taking longer than expected.
2019-02-15 12:30:57  main WARN parity_ethereum::run  Shutdown timeout reached, exiting uncleanly.

People are still seeing this issue regularly in recent versions - unclean shutdowns are leading to many more reports of db corruption so bumping priority here 😥

I can confirm this happens regularly on at least 4 parity archive instances that we run with the following config:

--auto-update=none
--base-path=/paritydb
--mode=active
--tracing=on
--pruning=archive
--db-compaction=ssd
--scale-verifiers
--num-verifiers=6
--jsonrpc-server-threads=5
--jsonrpc-threads=5
--cache-size=22000
--min-peers=100
--max-peers=1000
--jsonrpc-hosts=all
--jsonrpc-interface=all
--ws-interface=all
--tx-queue-mem-limit=2048
--tx-queue-size=2000000

There is probably one more of these bugs to root out. I've seen this happen with --chain kovan when the snapshotting service is running. With some extra logging added it looks like this:

2019-06-12 14:04:10  main TRACE shutdown  [IoService] Closed.
2019-06-12 14:04:10  main TRACE shutdown  ClientService dropped
2019-06-12 14:04:10  main TRACE shutdown  RPC dropped
2019-06-12 14:04:10  main TRACE shutdown  KeepAlive dropped
2019-06-12 14:04:10  main TRACE shutdown  Informant shut down
2019-06-12 14:04:10  main TRACE shutdown  Informant dropped
2019-06-12 14:04:10  main TRACE shutdown  Client dropped
2019-06-12 14:04:10  main TRACE shutdown  Waiting for refs to Client to shutdown, strong_count=19, weak_count=Some(13)
2019-06-12 14:04:10  jsonrpc-eventloop-1 TRACE shutdown  [IoService] Closing...
2019-06-12 14:04:10   TRACE shutdown  [IoWorker] Closing...
2019-06-12 14:04:10   TRACE shutdown  [IoWorker] Closed
2019-06-12 14:04:10   TRACE shutdown  [IoWorker] Closing...
2019-06-12 14:04:10   TRACE shutdown  [IoWorker] Closed
2019-06-12 14:04:10   TRACE shutdown  [IoWorker] Closing...
2019-06-12 14:04:10   TRACE shutdown  [IoWorker] Closed
2019-06-12 14:04:10   TRACE shutdown  [IoWorker] Closing...
2019-06-12 14:04:10   TRACE shutdown  [IoWorker] Closed
2019-06-12 14:04:10  jsonrpc-eventloop-1 TRACE shutdown  [IoService] Closed.
2019-06-12 14:04:11  main TRACE shutdown  Waiting for client to drop, strong_count=2, weak_count=Some(5)
2019-06-12 14:04:12  main TRACE shutdown  Waiting for client to drop, strong_count=2, weak_count=Some(5)
…
2019-06-12 14:05:10  main WARN parity_ethereum::run  Shutdown is taking longer than expected.
…
2019-06-12 14:05:11  main TRACE shutdown  Waiting for client to drop, strong_count=2, weak_count=Some(5)
2019-06-12 14:05:12  main TRACE shutdown  Waiting for client to drop, strong_count=2, weak_count=Some(5)
2019-06-12 14:05:13  main TRACE shutdown  Waiting for client to drop, strong_count=2, weak_count=Some(5)
…
2019-06-12 14:09:10  main WARN parity_ethereum::run  Shutdown timeout reached, exiting uncleanly.
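
For readers following the trace above: the timestamps suggest a loop that polls about once per second, warns after roughly 60 seconds, and gives up after roughly 5 minutes while some strong reference to the client is still alive. Below is a minimal sketch of that pattern, not the actual Parity code. The names `wait_for_drop`, `Shutdown` and `simulate` are hypothetical, and the durations are parameters so the behaviour can be tried out quickly.

```rust
use std::sync::{Arc, Weak};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical result type: did every strong reference go away in time?
#[derive(Debug, PartialEq)]
enum Shutdown {
    Clean,
    Unclean,
}

/// Polls `weak` until no strong reference is left.
/// Warns once after `warn_after` and gives up after `max_wait`.
fn wait_for_drop<T>(
    weak: Weak<T>,
    poll: Duration,
    warn_after: Duration,
    max_wait: Duration,
) -> Shutdown {
    let start = Instant::now();
    let mut warned = false;
    while start.elapsed() < max_wait {
        if weak.strong_count() == 0 {
            return Shutdown::Clean;
        }
        if !warned && start.elapsed() >= warn_after {
            eprintln!("Shutdown is taking longer than expected.");
            warned = true;
        }
        eprintln!("Waiting for client to drop, strong_count={}", weak.strong_count());
        thread::sleep(poll);
    }
    eprintln!("Shutdown timeout reached, exiting uncleanly.");
    Shutdown::Unclean
}

/// Simulates a worker thread that keeps a strong reference for `hold_for`
/// after the main handle has already been dropped.
fn simulate(hold_for: Duration, max_wait: Duration) -> Shutdown {
    let client = Arc::new(String::from("client"));
    let weak = Arc::downgrade(&client);
    let held = Arc::clone(&client);
    let worker = thread::spawn(move || {
        thread::sleep(hold_for);
        drop(held); // the lingering reference finally goes away
    });
    drop(client);
    let outcome = wait_for_drop(weak, Duration::from_millis(5), max_wait / 2, max_wait);
    worker.join().unwrap();
    outcome
}
```

With values matching the log (1 s poll, 60 s warning, 300 s limit), any component that never releases its reference would produce exactly the "exiting uncleanly" symptom, regardless of which component it is.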

The problem is still present in the current stable release. When will the fix be merged?

@zet-tech It should be resolved in 2.4.8 and 2.5.3; the fix was merged into those releases. If you still have issues with shutdowns, the source of the problem may be different.

It is present in 2.4.8 and the issue is for sure related to RPC. When I bind only to localhost and there are no RPC calls, restarts work correctly. But when I bind parity to a remote IP and it gets requests from our other software (even one second is enough, which means 5-10 requests, only eth_getWork and eth_getBlockByNumber), a restart is not possible and the process gets killed. This results in DB corruption much more often than it should (as often as once per 10 restarts), and we were forced to move to Geth in production because of this problem.
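
To help others try to reproduce the scenario described above, here is a rough sketch that builds the two JSON-RPC calls mentioned (eth_getWork and eth_getBlockByNumber) and sends them over plain HTTP using only the Rust standard library. The helper names, the request ids and the address passed to `send` are assumptions for illustration; substitute whatever interface and port your node's JSON-RPC server actually listens on.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Builds a JSON-RPC 2.0 request body. `params` must already be a JSON array.
fn json_rpc_body(id: u32, method: &str, params: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"{}","params":{}}}"#,
        id, method, params
    )
}

/// Wraps a body in a minimal HTTP/1.1 POST request.
fn http_post(host: &str, body: &str) -> String {
    format!(
        "POST / HTTP/1.1\r\nHost: {}\r\nContent-Type: application/json\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        host,
        body.len(),
        body
    )
}

/// The two calls named in the report, as ready-to-send HTTP requests.
fn repro_requests(host: &str) -> Vec<String> {
    vec![
        http_post(host, &json_rpc_body(1, "eth_getWork", "[]")),
        http_post(host, &json_rpc_body(2, "eth_getBlockByNumber", r#"["latest",false]"#)),
    ]
}

/// Sends one request to `addr` (an assumed "ip:port" of the node's RPC
/// endpoint) and returns the raw HTTP response.
fn send(addr: &str, request: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(request.as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

Sending the output of `repro_requests` a handful of times through `send` and then stopping the node should approximate the "5-10 requests, then restart" sequence from the report.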

One thing worth noting about any of these shutdown problems is that different bugs can cause the same symptom. We recently fixed one instance where shutdown would fail while the node was taking a snapshot.
RPC usage causing deadlock during shutdown is quite possibly a distinct bug.

@zet-tech That sounds really bad. I have tried to reproduce the shutdown problem after RPC on the latest master and could not see a problem. I'd need your assistance to debug this further.

  • if you have the possibility to try your setup with a master build that'd be great
  • can you share your configuration toml file with us so I can replicate your setup more closely?
  • I'm not sure what you mean by "bind only to localhost"/"when I bind parity to remote IP"; can you elaborate?

Thanks!

Can anyone confirm this issue has been fixed?

Can anyone confirm this issue has been fixed?

As mentioned above, there are possibly several other causes with the same symptom. We have fixed a few but there might be others. FWIW we have not experienced or, afaik, had reports of shutdown issues for several months now.

So, after my server automatically restarted this weekend, I can confirm that the parity server restarts normally with pm2. No errors.
Thx

I just installed 2.6.4. Problem still occurs, exactly as before.

Answering previous questions:
1). outdated
2).
eth.txt

I removed the IP address because it is a public IP.

3). By "bind only to localhost", I meant that if there are no RPC calls to parity, the error does not occur. But even one RPC call is enough to make parity unable to shut down.

2.5.10 solves our restart problem.

Sorry, forgot to mention that I didn't observe ungraceful shutdowns with v2.6.5, now running v2.6.6.

@zet-tech Don't forget to upgrade to at least v2.5.11 before Istanbul fork at the weekend: https://github.com/paritytech/parity-ethereum/releases/tag/v2.5.11

Already updated, but the bug was fixed in 2.5.10.
