Parity-ethereum: Client doesn't shutdown gracefully on Ctrl-C

Created on 24 Oct 2018 · 12 comments · Source: openethereum/parity-ethereum

  • Parity Ethereum version: master fe84718b5
  • Operating system: Linux
  • Installation: built from source
  • Fully synchronized: yes, but shouldn't be relevant
  • Network: kovan, but shouldn't be relevant
  • Restarted: yes

Works fine with b8da38f4e, so it was introduced with #9657:

^C2018-10-24 15:45:38  main WARN parity_ethereum::run  Shutdown is taking longer than expected.
2018-10-24 15:49:38  main WARN parity_ethereum::run  Shutdown timeout reached, exiting uncleanly.
Labels: F2-bug 🐞, M4-core ⛓, P2-asap 🌊

All 12 comments

A 15-minute shutdown:

2018-10-26 16:33:08  Imported #6589794 0x146d…b8c1 (59 txs, 3.10 Mgas, 1181 ms, 13.09 KiB) + another 1 block(s) containing 26 tx(s)
2018-10-26 16:33:22  Imported #6589795 0x0a86…59e8 (86 txs, 7.96 Mgas, 5502 ms, 14.35 KiB)
2018-10-26 16:33:48  Imported #6589798 0xf339…b808 (29 txs, 7.98 Mgas, 6286 ms, 5.22 KiB) + another 2 block(s) containing 102 tx(s)
2018-10-26 16:34:02  Imported #6589799 0xabd6…391a (5 txs, 7.95 Mgas, 130 ms, 13.99 KiB)
2018-10-26 16:34:12     8/25 peers   216 MiB chain 27 MiB db 0 bytes queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:34:12     8/25 peers   216 MiB chain 27 MiB db 0 bytes queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:36:01  Syncing #6589803 0x9613…e93a     0.04 blk/s    0.8 tx/s    0 Mgas/s      0+    8 Qed  #6589811   19/25 peers   217 MiB chain 27 MiB db 371 KiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:36:12  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+    9 Qed  #6589812   21/25 peers   217 MiB chain 27 MiB db 392 KiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
^C2018-10-26 16:36:13  Finishing work, please wait...
2018-10-26 16:36:39  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   11 Qed  #6589813   15/25 peers   217 MiB chain 27 MiB db 535 KiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:37:55  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   17 Qed  #6589819   19/25 peers   219 MiB chain 27 MiB db 858 KiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:40:55  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   31 Qed  #6589831   24/25 peers   225 MiB chain 27 MiB db 1 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:42:42  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   36 Qed  #6589836   26/50 peers   226 MiB chain 27 MiB db 2 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
^C
2018-10-26 16:44:05  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   46 Qed  #6589845   25/25 peers   227 MiB chain 27 MiB db 2 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:45:12  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   51 Qed  #6589850   25/25 peers   229 MiB chain 27 MiB db 3 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:46:28  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   60 Qed  #6589857   25/25 peers   230 MiB chain 27 MiB db 3 MiB queue 252 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:47:41  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   71 Qed  #6589865   25/25 peers   231 MiB chain 27 MiB db 4 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:48:30  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   73 Qed  #6589867   24/25 peers   231 MiB chain 27 MiB db 4 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:49:20  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   79 Qed  #6589873   25/25 peers   232 MiB chain 27 MiB db 4 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
^C^C^C
2018-10-26 16:49:53  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   83 Qed  #6589875   24/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:17  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   86 Qed  #6589878   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:28  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   86 Qed  #6589878   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:34  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   86 Qed  #6589878   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:41  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   87 Qed  #6589879   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:47  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   88 Qed  #6589879   26/50 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:50:54  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   88 Qed  #6589879   26/50 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:00  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   90 Qed  #6589880   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:06  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   90 Qed  #6589880   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:12  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   92 Qed  #6589882   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:19  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   93 Qed  #6589883   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:25  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   93 Qed  #6589883   25/25 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:32  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   93 Qed  #6589883   26/50 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
2018-10-26 16:51:38  Syncing #6589803 0x9613…e93a     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+   93 Qed  #6589883   26/50 peers   232 MiB chain 27 MiB db 5 MiB queue 251 KiB sync  RPC:  0 conn,    0 req/s,    0 µs
<shutdown>

This appears to be an Arc<Client> that isn't being dropped somewhere. #9657 must have uncovered a pre-existing circular reference whose handles were probably still dropped with the old reactor but aren't with the new Runtime. I'll look more deeply into this as soon as I can.
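For anyone less familiar with the failure mode, here is a minimal, hypothetical sketch of how such a cycle keeps an Arc alive past the last external handle (made-up types, not the actual parity-ethereum structs; the usual fix is a Weak back-reference):

```rust
use std::sync::{Arc, Mutex, Weak};

// Hypothetical stand-ins, not the real parity-ethereum types.
struct Client {
    handler: Mutex<Option<Arc<Handler>>>,
}

struct Handler {
    // A strong Arc<Client> here would complete a cycle: neither refcount
    // could ever reach zero. Weak breaks the cycle.
    client: Weak<Client>,
}

fn main() {
    let client = Arc::new(Client { handler: Mutex::new(None) });
    let handler = Arc::new(Handler { client: Arc::downgrade(&client) });
    *client.handler.lock().unwrap() = Some(handler);

    // With the Weak back-reference, dropping the last external handle frees
    // both structs. With Arc<Client> instead, both would be leaked and any
    // shutdown path waiting on the Client's Drop would hang.
    drop(client);
}
```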

Bumping the priority here. This makes our GitLab pipelines fail:

2018-11-02 01:40:38 UTC Starting Parity-Ethereum/v2.3.0-unstable-3a6e04b-20181101/x86_64-linux-gnu/rustc1.30.0
2018-11-02 01:40:38 UTC Keys path /root/.local/share/io.parity.ethereum/keys/ethereum
2018-11-02 01:40:38 UTC DB path /root/.local/share/io.parity.ethereum/chains/ethereum/db/906a34e69aec8c0d
2018-11-02 01:40:38 UTC State DB configuration: fast
2018-11-02 01:40:38 UTC Operating mode: active
2018-11-02 01:40:39 UTC Configured for Foundation using Ethash engine
2018-11-02 01:40:43 UTC Updated conversion rate to Ξ1 = US$200.12 (23795246 wei/gas)
2018-11-02 01:40:48 UTC Finishing work, please wait...
2018-11-02 01:40:48 UTC Syncing       #0 0xd4e5…8fa3     0.00 blk/s    0.0 tx/s    0 Mgas/s      0+    0 Qed        #0    4/25 peers      8 KiB chain    3 MiB db  0 bytes queue   11 KiB sync  RPC:  0 conn,    0 req/s,  203 µs
2018-11-02 01:40:51 UTC Public node URL: enode://3f0efae32a2999cbdfeff9aeab52db616cd8432d22a9181d6ef7221c8ae405a06ee892d7ad8eb6bc716a635890ef2ec9e2e5746d6d1281f2b51b518c82441d72@172.17.0.3:30303
2018-11-02 01:41:48 UTC Shutdown is taking longer than expected.
2018-11-02 01:45:48 UTC Shutdown timeout reached, exiting uncleanly.
pure virtual method called
terminate called without an active exception
RPC output: {"jsonrpc":"2.0","result":{"hash":"0x3a6e04ba150d4bc2badb707cdc0efdfc5159aef6","track":"null","version":{"major":2,"minor":3,"patch":0}},"id":1}Aborted (core dumped)
ERROR: Job failed: exit code 1

https://gitlab.parity.io/parity/parity-ethereum/-/jobs/107223

Can you look at this, @c0gent?

I'll work on it today

Somewhere, somehow, an Arc<Client> isn't being dropped as a result of the change from the old tokio-core reactor + futures-cpupool CpuPool to the new tokio Runtime (an all-in-one reactor + pool).
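One plausible mechanism (illustrative only, not a confirmed diagnosis): any long-lived task spawned onto the Runtime that captures an Arc<Client> keeps the strong count above zero until the runtime itself is torn down, whereas the old setup dropped its executors earlier. A dependency-free sketch, using a plain thread in place of a tokio task:

```rust
use std::sync::Arc;
use std::thread;
use std::time::Duration;

struct Client; // stand-in for the real ethcore Client

fn main() {
    let client = Arc::new(Client);

    // Stand-in for a background task that captured a clone of the client.
    // While it lives, the strong count can never reach zero, so a shutdown
    // sequence that waits for the Client to be dropped will block.
    let held = Arc::clone(&client);
    let worker = thread::spawn(move || {
        let _client = held;
        thread::sleep(Duration::from_secs(3600));
    });

    drop(client); // our handle is gone, but the worker's clone keeps it alive
    let _ = worker.join(); // blocks "forever", mirroring the hung shutdown
}
```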

I've spent a fair bit of time trying to figure out where and why this Arc<Client> isn't being dropped the same way it was before. I'm coming up with no clear answers.

There must be someone else with better intuition than I have about where client references might be hiding (behind trait objects, for example). Perhaps that person could help shine some light on this.

I'll keep investigating regardless...

This has happened once before and it took ages to track down :(
I have no intuition or advice to offer, only some words of encouragement and thanks for even trying.

Hopefully someone else (whoever fixed it last time) might be able to show up with some better advice.

Maybe run memcheck on the binary? http://valgrind.org/docs/manual/mc-manual.html#mc-manual.leaks

This will only help if there's a reference cycle somewhere, not if it's being held on the stack.

Sure, but I found yesterday that we have 24 strong references to the Client, just by naively printing the count from the Arc. Thus, it never gets dropped!
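The naive check is just Arc::strong_count; a minimal sketch (the Client here is a stand-in, not the real ethcore type):

```rust
use std::sync::Arc;

struct Client; // stand-in for the real ethcore Client

fn main() {
    let client = Arc::new(Client);
    let _a = Arc::clone(&client);
    let _b = Arc::clone(&client);

    // Prints 3 here (the original plus two clones). At shutdown time the
    // same call against the real Client reported 24 strong references.
    println!("strong refs: {}", Arc::strong_count(&client));
}
```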

I found the same; I guess I should have pointed it out here earlier. I am currently trying to debug this with a tracking-Arc implementation; if someone wants to chip in, let me know.

For those interested, here's a draft: https://github.com/mbr/snarc-rs
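The rough idea behind a tracking Arc, for context (this is a from-scratch sketch, not snarc's actual API): wrap Arc in a newtype that logs every clone and drop together with the current strong count, so a leaked handle can be traced back to where it was created.

```rust
use std::ops::Deref;
use std::sync::Arc;

/// Minimal tracking wrapper: every clone and drop is logged together with
/// the resulting strong count, so leaked handles show up in the log.
struct TrackedArc<T> {
    inner: Arc<T>,
    label: &'static str,
}

impl<T> TrackedArc<T> {
    fn new(value: T, label: &'static str) -> Self {
        TrackedArc { inner: Arc::new(value), label }
    }
}

impl<T> Clone for TrackedArc<T> {
    fn clone(&self) -> Self {
        let cloned = TrackedArc { inner: self.inner.clone(), label: self.label };
        eprintln!("[{}] clone, strong = {}", self.label, Arc::strong_count(&self.inner));
        cloned
    }
}

impl<T> Drop for TrackedArc<T> {
    fn drop(&mut self) {
        // strong_count still includes this handle, so subtract one to show
        // the count after the drop completes.
        eprintln!("[{}] drop, strong = {}", self.label, Arc::strong_count(&self.inner) - 1);
    }
}

impl<T> Deref for TrackedArc<T> {
    type Target = T;
    fn deref(&self) -> &T {
        &self.inner
    }
}

fn main() {
    let client = TrackedArc::new((), "client");
    let clone = client.clone(); // logs: [client] clone, strong = 2
    drop(clone);                // logs: [client] drop, strong = 1
    drop(client);               // logs: [client] drop, strong = 0
}
```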

I'm in the process of upgrading my 2.2.3-beta test node to 2.2.5-beta, and upon stopping 2.2.3-beta it gets stuck with the same message:

2018-12-17 17:44:26  Verifier #1 INFO import  Imported #6904209 0x5f0c…05c8 (126 txs, 7.99 Mgas, 682 ms, 26.00 KiB)
2018-12-17 17:44:30  IO Worker #0 INFO import    11/15 peers    140 MiB chain  401 MiB db  0 bytes queue  250 KiB sync  RPC:  0 conn,    0 req/s,   18 µs
2018-12-17 17:44:44  main INFO parity_ethereum::run  Finishing work, please wait...
2018-12-17 17:45:44  main WARN parity_ethereum::run  Shutdown is taking longer than expected.
2018-12-17 17:49:44  main WARN parity_ethereum::run  Shutdown timeout reached, exiting uncleanly.

Running as a daemon here, logging to a logfile.

Starting 2.2.5-beta afterwards didn't log anything unusual.
