Jormungandr: Tokio Thread Panicked: Validated Block Must Be Unique

Created on 30 Jan 2020  路  8Comments  路  Source: input-output-hk/jormungandr

Describe the bug
Shortly after bootstrapping I received:
thread 'tokio-runtime-worker-1' panicked at 'validated block must be unique', src/libcore/option.rs:1185:5
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace.
thread 'main' panicked at 'internal error: entered unreachable code', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.1.22/src/runtime/threadpool/mod.rs:296:26

Mandatory Information

  1. jcli --full-version output;
    jcli 0.8.8 (master-4687999, release, linux [x86_64]) - [rustc 1.40.0 (73528e339 2019-12-16)]
  1. jormungandr --full-version output;
    jormungandr 0.8.8 (master-4687999, release, linux [x86_64]) - [rustc 1.40.0 (73528e339 2019-12-16)]

To Reproduce
Steps to reproduce the behavior:

  1. Upgrade to the above version of Jormungandr
  2. Bootstrap
  3. See error

Expected behavior
Don't Panic!

Additional context
Running CentOS 7

bug Priority - High subsys-storage

Most helpful comment

Another case that this may happen is when our tip (main chain) was on a fork when the node was stopped. Hence the last block we accepted to extend/switch our main chain may not be known to the node we will bootstrap from, causing a possibl scenario that upon start after connecting to the bootstrapping node, our node may start bootstrapping from epoch 0, the second block (genesis being the first and is retrieved from storage), but since that block is already in storage, it falls in the situation @mzabaluev explained.
The logs @mark-stopka provided show this case.

All 8 comments

Jan 30 03:05:09 redacted jormungandr[27793]: Jan 30 03:05:09.799 WARN initial bootstrap failed, error: PullRequestFailed { source: Error { code: Unknown, source: Status { code: Unknown, message: "connection error: broken pipe" } } }, peer_addr: 52.8.15.52:3000, task: bootstrap
Jan 30 03:05:09 redacted jormungandr[27793]: Jan 30 03:05:09.969 WARN unable to reach peer for initial bootstrap, reason: HTTP/2.0 connection error, peer_addr: 13.114.196.228:3000, task: bootstrap
Jan 30 03:05:10 redacted jormungandr[27793]: thread 'tokio-runtime-worker-2' panicked at 'validated block must be unique', src/libcore/option.rs:1185:5
Jan 30 03:05:10 redacted jormungandr[27793]: stack backtrace:
Jan 30 03:05:10 redacted jormungandr[27793]: 0: backtrace::backtrace::libunwind::trace
Jan 30 03:05:10 redacted jormungandr[27793]: at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
Jan 30 03:05:10 redacted jormungandr[27793]: 1: backtrace::backtrace::trace_unsynchronized
Jan 30 03:05:10 redacted jormungandr[27793]: at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
Jan 30 03:05:10 redacted jormungandr[27793]: 2: std::sys_common::backtrace::_print_fmt
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/sys_common/backtrace.rs:77
Jan 30 03:05:10 redacted jormungandr[27793]: 3: ::fmt
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/sys_common/backtrace.rs:61
Jan 30 03:05:10 redacted jormungandr[27793]: 4: core::fmt::write
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libcore/fmt/mod.rs:1028
Jan 30 03:05:10 redacted jormungandr[27793]: 5: std::io::Write::write_fmt
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/io/mod.rs:1412
Jan 30 03:05:10 redacted jormungandr[27793]: 6: std::sys_common::backtrace::_print
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/sys_common/backtrace.rs:65
Jan 30 03:05:10 redacted jormungandr[27793]: 7: std::sys_common::backtrace::print
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/sys_common/backtrace.rs:50
Jan 30 03:05:10 redacted jormungandr[27793]: 8: std::panicking::default_hook::{{closure}}
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/panicking.rs:188
Jan 30 03:05:10 redacted jormungandr[27793]: 9: std::panicking::default_hook
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/panicking.rs:205
Jan 30 03:05:10 redacted jormungandr[27793]: 10: std::panicking::rust_panic_with_hook
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/panicking.rs:464
Jan 30 03:05:10 redacted jormungandr[27793]: 11: std::panicking::continue_panic_fmt
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/panicking.rs:373
Jan 30 03:05:10 redacted jormungandr[27793]: 12: rust_begin_unwind
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/panicking.rs:302
Jan 30 03:05:10 redacted jormungandr[27793]: 13: core::panicking::panic_fmt
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libcore/panicking.rs:139
Jan 30 03:05:10 redacted jormungandr[27793]: 14: core::option::expect_failed
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libcore/option.rs:1185
Jan 30 03:05:10 redacted jormungandr[27793]: 15: as futures::future::Future>::poll
Jan 30 03:05:10 redacted jormungandr[27793]: 16: as futures::future::Future>::poll
Jan 30 03:05:10 redacted jormungandr[27793]: 17: as futures::future::Future>::poll
Jan 30 03:05:10 redacted jormungandr[27793]: 18: as futures::future::Future>::poll
Jan 30 03:05:10 redacted jormungandr[27793]: 19: as futures::future::Future>::poll
Jan 30 03:05:10 redacted jormungandr[27793]: 20: futures::task_impl::std::set
Jan 30 03:05:10 redacted jormungandr[27793]: 21: tokio_threadpool::worker::Worker::run_task
Jan 30 03:05:10 redacted jormungandr[27793]: 22: tokio_threadpool::worker::Worker::run
Jan 30 03:05:10 redacted jormungandr[27793]: 23: tokio_reactor::with_default
Jan 30 03:05:10 redacted jormungandr[27793]: 24: tokio::runtime::threadpool::builder::Builder::build::{{closure}}
Jan 30 03:05:10 redacted jormungandr[27793]: note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.
Jan 30 03:05:10 redacted jormungandr[27793]: thread 'main' panicked at 'internal error: entered unreachable code', /cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.1.22/src/runtime/threadpool/mod.rs:296:26
Jan 30 03:05:10 redacted jormungandr[27793]: stack backtrace:
Jan 30 03:05:10 redacted jormungandr[27793]: 0: backtrace::backtrace::libunwind::trace
Jan 30 03:05:10 redacted jormungandr[27793]: at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/libunwind.rs:88
Jan 30 03:05:10 redacted jormungandr[27793]: 1: backtrace::backtrace::trace_unsynchronized
Jan 30 03:05:10 redacted jormungandr[27793]: at ./cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.40/src/backtrace/mod.rs:66
Jan 30 03:05:10 redacted jormungandr[27793]: 2: std::sys_common::backtrace::_print_fmt
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/sys_common/backtrace.rs:77
Jan 30 03:05:10 redacted jormungandr[27793]: 3: ::fmt
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/sys_common/backtrace.rs:61
Jan 30 03:05:10 redacted jormungandr[27793]: 4: core::fmt::write
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libcore/fmt/mod.rs:1028
Jan 30 03:05:10 redacted jormungandr[27793]: 5: std::io::Write::write_fmt
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/io/mod.rs:1412
Jan 30 03:05:10 redacted jormungandr[27793]: 6: std::sys_common::backtrace::_print
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/sys_common/backtrace.rs:65
Jan 30 03:05:10 redacted jormungandr[27793]: 7: std::sys_common::backtrace::print
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/sys_common/backtrace.rs:50
Jan 30 03:05:10 redacted jormungandr[27793]: 8: std::panicking::default_hook::{{closure}}
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/panicking.rs:188
Jan 30 03:05:10 redacted jormungandr[27793]: 9: std::panicking::default_hook
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/panicking.rs:205
Jan 30 03:05:10 redacted jormungandr[27793]: 10: std::panicking::rust_panic_with_hook
Jan 30 03:05:10 redacted jormungandr[27793]: at src/libstd/panicking.rs:464
Jan 30 03:05:10 redacted jormungandr[27793]: 11: std::panicking::begin_panic
Jan 30 03:05:10 redacted jormungandr[27793]: 12: as futures::future::Future>::poll
Jan 30 03:05:10 redacted jormungandr[27793]: 13: futures::task_impl::Spawn::poll_future_notify
Jan 30 03:05:10 redacted jormungandr[27793]: 14: std::thread::local::LocalKey::with
Jan 30 03:05:10 redacted jormungandr[27793]: 15: jormungandr::main
Jan 30 03:05:10 redacted jormungandr[27793]: 16: std::rt::lang_start::{{closure}}
Jan 30 03:05:10 redacted jormungandr[27793]: 17: main
Jan 30 03:05:10 redacted jormungandr[27793]: note: Some details are omitted, run with RUST_BACKTRACE=full for a verbose backtrace.

Same issue, CentOS 8, the issue occurred after I had to restart due to #1678 / #1626...

Can you post jormungandr log with log level debug?

Sure;

Jan 30 09:37:00.425 INFO loading from storage, currently at bab353f1-00023669-47.16679 ..., task: bootstrap
Jan 30 09:37:01.090 INFO Loaded from storage tip is : 6581e6d1-0002384b-47.22462, task: bootstrap
Jan 30 09:37:01.090 INFO connecting to bootstrap peer 13.56.0.226:3000, peer_addr: 13.56.0.226:3000, task: bootstrap
Jan 30 09:37:01.234 WARN unable to reach peer for initial bootstrap, reason: HTTP/2.0 connection error, peer_addr: 13.56.0.226:3000, task: bootstrap
Jan 30 09:37:01.234 INFO connecting to bootstrap peer 18.184.35.137:3000, peer_addr: 18.184.35.137:3000, task: bootstrap
Jan 30 09:37:01.251 WARN unable to reach peer for initial bootstrap, reason: HTTP/2.0 connection error, peer_addr: 18.184.35.137:3000, task: bootstrap
Jan 30 09:37:01.251 INFO connecting to bootstrap peer 3.125.183.71:3000, peer_addr: 3.125.183.71:3000, task: bootstrap
Jan 30 09:37:01.269 DEBG pulling blocks starting from 6581e6d1b3e28030574753fbb75c6989e8fd5d15845654e6f009c1996695f93b, peer_addr: 3.125.183.71:3000, task: bootstrap
thread 'tokio-runtime-worker-0' panicked at 'validated block must be unique', src/libcore/option.rs:1185:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
thread 'main' panicked at 'internal error: entered unreachable code', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-0.1.22/src/runtime/threadpool/mod.rs:296:26
Jan 30 09:37:01.287 DEBG validated block, block_date: 0.530, hash: d79f2082ddd12ab881f7786f376ee1dc548774efa4823efe84e028fd7e72c63f, peer_addr: 3.125.183.71:3000, task: bootstrap

Root cause analysis (cc @rinor to confirm): the blockchain tip has not been advanced past some newer blocks that have been put to storage. This is not caught by bootstrap stream validation due to a temporary workaround, neither did it fail prior to recent changes in storage that made put_block return a BlockAlreadyExists error instead of Ok(()) in case the block is already present in storage.

A workaround is to delete the storage and bootstrap from scratch. I will fix this to make a new release.

Another case that this may happen is when our tip (main chain) was on a fork when the node was stopped. Hence the last block we accepted to extend/switch our main chain may not be known to the node we will bootstrap from, causing a possibl scenario that upon start after connecting to the bootstrapping node, our node may start bootstrapping from epoch 0, the second block (genesis being the first and is retrieved from storage), but since that block is already in storage, it falls in the situation @mzabaluev explained.
The logs @mark-stopka provided show this case.

A workaround is to delete the storage and bootstrap from scratch. I will fix this to make a new release.

I can confirm the workaround worked for me.

A workaround is to delete the storage and bootstrap from scratch. I will fix this to make a new release.

We can also confirm this worked for us well on Ubuntu 18.04.

Was this page helpful?
0 / 5 - 0 ratings