Parity-ethereum: Block import fails while syncing

Created on 16 Oct 2016  ·  13 Comments  ·  Source: openethereum/parity-ethereum

During a fresh sync from scratch (i.e. delete the 906a34e69aec8c0d directory before starting), the latest release crashes as follows.

Notes,

  • This has happened multiple times, but with different block numbers below 2 million
  • I've done many syncs with previous versions which sailed past this point
  • I had not halted Parity, nor was there any OOM error during this run; it was a straight run from scratch and was not resource constrained.

Parity version (image pulled from https://hub.docker.com/r/ethcore/parity/builds/):

# docker run ethcore/parity --version
Parity
  version Parity/v1.3.8-beta-e0778fc-20161015/x86_64-linux-gnu/rustc1.12.0

Command line:

/build/parity/target/release/parity --db-compaction hdd --cache-size-db 512 --pruning fast --pruning-history 512 --no-dapps --no-ipc

Console output:

2016-10-16 00:54:57 UTC Syncing #1834040 1a3a…e292    122 blk/s  535 tx/s  23 Mgas/s    1246+29946 Qed   #1865252    1/22/25 peers     28 MiB db    6 MiB chain   25 MiB queue    9 MiB sync
2016-10-16 00:55:02 UTC Syncing #1834497 3809…eaa7     89 blk/s  652 tx/s  18 Mgas/s     328+30416 Qed   #1865252    0/24/25 peers     31 MiB db    6 MiB chain   24 MiB queue    9 MiB sync
2016-10-16 00:55:05 UTC Block import failed for #1834765 (4129…2bbd)
Error: Trie(IncompleteDatabase(ae9043ce5f6478392a06ca613ab9fe32936ef3c46a9d86d7b2b34715c3119512))
thread 'IO Worker #2' panicked at 'DB flush failed.: "Corruption: block checksum mismatch"', ../src/libcore/result.rs:788
stack backtrace:
thread 'IO Worker #1' panicked at 'DB flush failed.: "Corruption: block checksum mismatch"', ../src/libcore/result.rs:788
   1:     0x7f1de0da08c9 - <unknown>
   2:     0x7f1de0da859c - <unknown>
   3:     0x7f1de0da7469 - <unknown>
   4:     0x7f1de0da7b58 - <unknown>
   5:     0x7f1de0da79b2 - <unknown>
   6:     0x7f1de0da7920 - <unknown>
   7:     0x7f1de0da78a1 - <unknown>
   8:     0x7f1de0de485f - <unknown>
   9:     0x7f1de08a1708 - <unknown>
  10:     0x7f1de094d1ed - <unknown>
  11:     0x7f1de084a1e5 - <unknown>
  12:     0x7f1de0db0086 - <unknown>
  13:     0x7f1de08b320e - <unknown>
  14:     0x7f1de0da6102 - <unknown>
  15:     0x7f1ddf4c6183 - start_thread
  16:     0x7f1ddfee237c - clone
  17:                0x0 - <unknown>
stack backtrace:
   1:     0x7f1de0da08c9 - <unknown>
   2:     0x7f1de0da859c - <unknown>
   3:     0x7f1de0da7469 - <unknown>
   4:     0x7f1de0da7b58 - <unknown>
   5:     0x7f1de0da79b2 - <unknown>
   6:     0x7f1de0da7920 - <unknown>
   7:     0x7f1de0da78a1 - <unknown>
   8:     0x7f1de0de485f - <unknown>
   9:     0x7f1de08a1708 - <unknown>
  10:     0x7f1de094d1ed - <unknown>
  11:     0x7f1de084a1e5 - <unknown>
  12:     0x7f1de0db0086 - <unknown>
  13:     0x7f1de08b320e - <unknown>
  14:     0x7f1de0da6102 - <unknown>
  15:     0x7f1ddf4c6183 - start_thread
  16:     0x7f1ddfee237c - clone
  17:                0x0 - <unknown>
thread 'IO Worker #3' panicked at 'Expected in-chain blocks.', ../src/libcore/option.rs:700
stack backtrace:
   1:     0x7f1de0da08c9 - <unknown>
   2:     0x7f1de0da859c - <unknown>
   3:     0x7f1de0da7469 - <unknown>
   4:     0x7f1de0da7b58 - <unknown>
   5:     0x7f1de0da79b2 - <unknown>
   6:     0x7f1de0da7920 - <unknown>
   7:     0x7f1de0da78a1 - <unknown>
   8:     0x7f1de0de485f - <unknown>
   9:     0x7f1de0de48d5 - <unknown>
  10:     0x7f1de099c85e - <unknown>
  11:     0x7f1de08ce73f - <unknown>
  12:     0x7f1de099c64b - <unknown>
  13:     0x7f1de094ce63 - <unknown>
  14:     0x7f1de084a1e5 - <unknown>
  15:     0x7f1de0db0086 - <unknown>
  16:     0x7f1de08b320e - <unknown>
  17:     0x7f1de0da6102 - <unknown>
  18:     0x7f1ddf4c6183 - start_thread
  19:     0x7f1ddfee237c - clone
  20:                0x0 - <unknown>
2016-10-16 00:55:05 UTC Finishing work, please wait...

Software environment:

  • Docker version 1.11.2, build 3a84010

Hardware environment:

  • QNAP NAS TS253A
  • 8 GB RAM, 7 GB allocated to the Docker container; approx. 2.5 GB in use at this point, with 5 GB free on the machine.
  • 2.8 TB of available disk space (spinning HDD, not SSD).
Labels: F2-bug 🐞 M4-core ⛓

All 13 comments

Most probably a duplicate of #2603, but thank you so much for such detailed information about the environment.

Most probably duplicate of #2603

I don't think so (which is why I filed it):

  • different error message
  • happened during a clean, uninterrupted sync from scratch with no OOM stoppage or any other external reason for the DB to have become corrupted.

I've seen this one several times now on clean syncs - same error message. I've also seen #2603 a few times, but that one only after stoppages due to OOM.

These point to it being different for me.

I'm hitting this error too. In the meantime, my disk usage grows rapidly until it reaches 100%, at which point it crashes like this. After the crash, the space is freed again.

Just happened again. Exactly the same error, but different block number. As above, a clean sync from scratch with no stoppages or resource constraints. Fairly certain this differs from #2603.

2016-10-17 15:51:25 UTC Block import failed for #1548592 (0a49…3f49)
Error: Trie(IncompleteDatabase(268cd1dafb99fdacfd620648a917e36d1ec106353eb41e10dd6ac83fda498f69))
thread 'IO Worker #2' panicked at 'DB flush failed.: "Corruption: block checksum mismatch"', ../src/libcore/result.rs:788

Fair enough - reopening to give it proper investigation.

@benjaminion @zhumingu 1.3.9 includes a fix that might help with this. Please let us know if you still see this issue with the new version.

@tomusdrw Can you tell me what I should do about this issue?

Should I wait for the fix? Please advise, so that I at least have an idea of what my next step should be...

@kenzaka07 The issue should be fixed in the upcoming 1.3.10-stable and 1.4.0-beta releases (see #2603). These will be released in the next few days. In the meantime, you might want to try compiling from the beta or master branches of the repository.
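
For anyone wanting to try that before the release, here is a rough sketch of building from source, assuming a standard Rust toolchain (rustup and cargo) is installed; the repository URL shown is the project's current location and may have differed at the time:

# clone the repository and check out the suggested branch
git clone https://github.com/openethereum/parity-ethereum.git
cd parity-ethereum
git checkout beta            # or master, as suggested above
# build an optimised binary; it ends up in target/release/parity
cargo build --release
./target/release/parity --version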

Still getting these crashes on 1.4.1-beta while trying a fresh sync on classic.

2016-11-09 22:06:31  Parity
  version Parity/v1.4.1-beta-a7ddb9e-20161109/x86_64-linux-gnu/rustc1.12.1

Ubuntu 14.04, 16 GB RAM, 1 TB HDD.

2016-11-09 22:06:14  Syncing #1946277 a1cf…09fa    134 blk/s  462 tx/s  20 Mgas/s     751+ 8109 Qed   #1955355    9/ 9/25 peers     21 MiB db    6 MiB chain   41 MiB queue    5 MiB sync
2016-11-09 22:06:24  Syncing #1946834 9c85…3d92     56 blk/s  215 tx/s   9 Mgas/s       0+ 9899 Qed   #1956733    0/10/25 peers     21 MiB db    7 MiB chain   51 MiB queue    5 MiB sync
thread 'IO Worker #0' panicked at 'DB flush failed.: "Corruption: block checksum mismatch"', ../src/libcore/result.rs:788
stack backtrace:
thread 'IO Worker #3' panicked at 'DB flush failed.: "Corruption: block checksum mismatch"', ../src/libcore/result.rs:788
thread 'IO Worker #1' panicked at 'DB flush failed.: "Corruption: block checksum mismatch"', ../src/libcore/result.rs:788
thread 'IO Worker #2' panicked at 'DB flush failed.: "Corruption: block checksum mismatch"', ../src/libcore/result.rs:788
   1:     0x7fd99aef6819 - <unknown>
   2:     0x7fd99aeff28c - <unknown>
   3:     0x7fd99aefe159 - <unknown>
   4:     0x7fd99aefe848 - <unknown>
   5:     0x7fd99aefe6a2 - <unknown>
   6:     0x7fd99aefe610 - <unknown>
   7:     0x7fd99aefe591 - <unknown>
   8:     0x7fd99af3bc8f - <unknown>
   9:     0x7fd99a91ac28 - <unknown>
  10:     0x7fd99aa394d4 - <unknown>
  11:     0x7fd99a8f97b1 - <unknown>
  12:     0x7fd99af06d76 - <unknown>
  13:     0x7fd99a92e1a9 - <unknown>
  14:     0x7fd99aefcdf2 - <unknown>
  15:     0x7fd999377183 - start_thread
  16:     0x7fd999d9337c - clone
  17:                0x0 - <unknown>
stack backtrace:

What happened with this issue? I'm still seeing this error!

Reset your database with parity db kill and start a fresh sync if you experience this issue. It will take a couple of minutes to fetch the latest chain tip again, but you won't see this again.
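
For reference, a minimal sketch of that reset, assuming the default data directory and reusing the flags from the original report; add a --chain flag only if you were syncing a non-default chain such as classic:

# delete the chain database for the configured chain
parity db kill
# restart and resync from scratch (flags taken from the report above)
parity --db-compaction hdd --cache-size-db 512 --pruning fast --pruning-history 512 --no-dapps --no-ipc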

I did that a few times and it's still happening to me.

For me it turned out to be faulty memory. Memtest86+ showed some errors. Since replacing the memory I haven't had an error in 8 months of continuous running. Worth a check.
