Installation details
Scylla version (or git commit hash): 4.2.0-0.20201025.94597e38e2 with build-id 9057ccfab8fb951c06b8feaf25ac05ac7e60765e
Cluster size: 6
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0071fbde9ff6b2951(eu-north-1)
Test-id: 50b1886d-0b18-4543-a4e1-02233810884b
Job: longevity-cdc-100gb-4h-test
Job link: https://jenkins.scylladb.com/job/scylla-4.2/job/longevity/job/longevity-cdc-100gb-4h-test/71/
All db log link: https://cloudius-jenkins-test.s3.amazonaws.com/50b1886d-0b18-4543-a4e1-02233810884b/20201025_161629/db-cluster-50b1886d.zip
The cluster has CDC enabled, HH enabled, and RBO disabled.
Scylla was stopped on node6 with the following command:
2020-10-25T12:47:03+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !NOTICE | sudo: centos : TTY=unknown ; PWD=/home/centos ; USER=root ; COMMAND=/bin/systemctl stop scylla-server.service
The Scylla stop process started:
2020-10-25T12:47:03+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | systemd: Stopped Run Scylla Housekeeping daily mode.
2020-10-25T12:47:03+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | systemd: Stopped Run Scylla fstrim daily.
2020-10-25T12:47:03+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | systemd: Stopped Run Scylla Housekeeping restart mode.
2020-10-25T12:47:03+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | systemd: Stopping Scylla JMX...
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !NOTICE | systemd: scylla-jmx.service: main process exited, code=exited, status=143/n/a
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | systemd: Stopped Scylla JMX.
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !NOTICE | systemd: Unit scylla-jmx.service entered failed state.
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !WARNING | systemd: scylla-jmx.service failed.
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 0] compaction_manager - Asked to stop
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 0] compaction_manager - Stopping 0 ongoing compactions
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | systemd: Stopping Scylla Server...
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 0] init - Signal received; shutting down
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 0] init - Shutting down redis service
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 0] init - Shutting down redis service was successful
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 0] init - Shutting down view builder
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 0] view - Stopping view builder
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 9] compaction_manager - Asked to stop
At this moment the following error occurred and a coredump was triggered:
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 12] view - Stopping view builder
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 7] compaction_manager - compaction info: Compaction for cdc_test/test_table was stopped due to: shutdown: stopping
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 7] compaction_manager - Stopped
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 1] compaction_manager - Asked to stop
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 1] compaction_manager - Stopping 61 ongoing compactions
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 1] view - Stopping view builder
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 2] compaction - Compacted 12 sstables to []. 2MB to 0 bytes (~0% of original) in 2623ms = 0 bytes/s. ~1536 total partitions merged to 0.
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: [shard 2] compaction_manager - compaction info: Compaction for cdc_test/test_table was stopped due to: shutdown: stopping
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !ERR | scylla: [shard 2] sstable - Mutation stream ends with unclosed partition during write, at: 0x334961d#012 0x3349930#012 0x3349db9#012 0x2e5a8dc#012 0x1352029#012 0x13db98c#012 0x140b3a1#012 0x140b883#012 0x3156e6c#012 --------#012 N7seastar12continuationINS_8internal22promise_base_with_typeIJEEEZNS_5asyncIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEENS_6futureIJEEET_ENUl20flat_mutation_readerE_clESC_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayISB_E4typeEDpNSH_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSB_DpOSK_EUlvE0_ZZNSA_14then_impl_nrvoISX_SA_EET0_SU_ENKUlvE_clEvEUlRS3_RSX_ONS_12future_stateIJEEEE_JEEE#012 --------#012 N7seastar12continuationINS_8internal22promise_base_with_typeIJEEENS_6futureIJEE12finally_bodyIZNS_5asyncIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEES5_T_ENUl20flat_mutation_readerE_clESD_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayISC_E4typeEDpNSI_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSC_DpOSL_EUlvE1_Lb0EEEZZNS5_17then_wrapped_nrvoIS5_SZ_EENSG_ISC_E4typeEOT0_ENKUlvE_clEvEUlRS3_RSZ_ONS_12future_stateIJEEEE_JEEE
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: Aborting on shard 2.
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: Backtrace:
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x0000000002eee192
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x0000000002e92710
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x0000000002e929b5
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x0000000002e92a00
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x00007f4a7e9c3a8f
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: /opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000025894
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x0000000002e5a902
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x0000000001352029
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x00000000013db98c
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x000000000140b3a1
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x000000000140b883
2020-10-25T12:47:04+00:00 longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 !INFO | scylla: 0x0000000003156e6c
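Collecting the raw addresses from the log by hand is error-prone, so here is a minimal sketch of how the frames logged after `scylla: Backtrace:` could be extracted and turned into an `addr2line` command with the same `-Cpife` flags used in this report. The function names `extract_backtrace` and `addr2line_command` are hypothetical helpers, not part of any Scylla tooling, and the sketch assumes the syslog line layout shown above.

```python
import re
import shlex

# A frame is either a bare address ("0x0000000002eee192") or a
# library+offset pair ("/opt/.../libc.so.6+0x000000000003c9e4"),
# printed as the last field of a "scylla:" log line.
FRAME_RE = re.compile(r"scylla:\s+((?:/\S+\+)?0x[0-9a-fA-F]+)\s*$")


def extract_backtrace(log_lines):
    """Return the frames that follow the last 'scylla: Backtrace:' marker."""
    frames = []
    in_backtrace = False
    for line in log_lines:
        if line.rstrip().endswith("scylla: Backtrace:"):
            # Start collecting; a later marker replaces earlier frames.
            in_backtrace = True
            frames = []
            continue
        if in_backtrace:
            match = FRAME_RE.search(line)
            if match:
                frames.append(match.group(1))
            else:
                # First non-frame line ends the backtrace.
                in_backtrace = False
    return frames


def addr2line_command(debug_file, frames):
    """Build a shell-safe addr2line command line for the given frames."""
    return shlex.join(["addr2line", "-Cpife", debug_file, *frames])
```

Usage would look like `addr2line_command("/usr/lib/debug/opt/scylladb/libexec/scylla-4.2.0-0.20201025.94597e38e2.x86_64.debug", extract_backtrace(open("messages.log")))`, where `messages.log` is a placeholder for the node's system log.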
Decoded backtrace:
[centos@longevity-mv-si-4d-4-2-db-node-f216a353-2 ~]$ addr2line -Cpife /usr/lib/debug/opt/scylladb/libexec/scylla-4.2.0-0.20201025.94597e38e2.x86_64.debug 0x0000000002eee192 0x0000000002e92710 0x0000000002e929b5 0x0000000002e92a00 0x00007f4a7e9c3a8f /opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4 /opt/scylladb/libreloc/libc.so.6+0x0000000000025894 0x0000000002e5a902 0x0000000001352029 0x00000000013db98c 0x000000000140b3a1 0x000000000140b883 0x0000000003156e6c
void seastar::backtrace
(inlined by) print_with_backtrace at /jenkins/workspace/scylla-4.2/build/scylla/seastar/src/core/reactor.cc:751
seastar::print_with_backtrace(char const) at /usr/include/fmt/format.h:2188
sigabrt_action at /usr/include/fmt/format.h:2188
(inlined by) operator() at /jenkins/workspace/scylla-4.2/build/scylla/seastar/src/core/reactor.cc:3451
(inlined by) _FUN at /jenkins/workspace/scylla-4.2/build/scylla/seastar/src/core/reactor.cc:3447
?? ??:0
?? ??:0
?? ??:0
seastar::on_internal_error(seastar::logger&, std::basic_string_view
sstables::mc::writer::consume_end_of_stream() at /jenkins/workspace/scylla-4.2/build/scylla/./sstables/types.hh:258
sstables::compaction::finish_new_sstable(sstables::compaction_writer
(inlined by) sstables::regular_compaction::stop_sstable_writer(sstables::compaction_writer) at /jenkins/workspace/scylla-4.2/build/scylla/sstables/compaction.cc:847
sstables::compacting_sstable_writer::consume_end_of_stream() at /jenkins/workspace/scylla-4.2/build/scylla/sstables/compaction.cc:716
(inlined by) sstables::compacting_sstable_writer::consume_end_of_stream() at /jenkins/workspace/scylla-4.2/build/scylla/sstables/compaction.cc:714
(inlined by) auto compact_mutation_state<(emit_only_live_rows)0, (compact_for_sstables)1>::consume_end_of_stream
(inlined by) compact_mutation<(emit_only_live_rows)0, (compact_for_sstables)1, sstables::compacting_sstable_writer, noop_compacted_fragments_consumer>::consume_end_of_stream() at /jenkins/workspace/scylla-4.2/build/scylla/./mutation_compactor.hh:521
(inlined by) stable_flattened_mutations_consumer
(inlined by) auto flat_mutation_reader::impl::consume_in_thread
(inlined by) auto flat_mutation_reader::consume_in_thread
(inlined by) seastar::future<> sstables::compaction::setup
void std::__invoke_impl
(inlined by) std::__invoke_result
(inlined by) _ZSt12__apply_implIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIJEEET_ENUl20flat_mutation_readerE_clES8_EUlvE_St5tupleIJEEJEEDcOS7_OT0_St16integer_sequenceImJXspT1_EEE at /usr/include/c++/10/tuple:1723
(inlined by) _ZSt5applyIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIJEEET_ENUl20flat_mutation_readerE_clES8_EUlvE_St5tupleIJEEEDcOS7_OT0_ at /usr/include/c++/10/tuple:1734
(inlined by) seastar::future<> seastar::futurize
(inlined by) seastar::futurize
(inlined by) seastar::noncopyable_function
seastar::noncopyable_function
Coredump file:
2020-10-25 12:47:04.000: (CoreDumpEvent Severity.ERROR): node=Node longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6 [13.48.130.39 | 10.0.0.244] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.997.ba3a1abdc97f44d7981494d78ee941d8.1466.1603630024000000/core.scylla.997.ba3a1abdc97f44d7981494d78ee941d8.1466.1603630024000000.gz
backtrace= PID: 1466 (scylla)
UID: 997 (scylla)
GID: 1001 (scylla)
Signal: 6 (ABRT)
Timestamp: Sun 2020-10-25 12:47:04 UTC (3min 43s ago)
Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --experimental-features cdc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 1-7,9-15 --lock-memory=1
Executable: /opt/scylladb/libexec/scylla
Control Group: /
Boot ID: ba3a1abdc97f44d7981494d78ee941d8
Machine ID: 93f219319dd5bdb42d9f1c8f2e23d329
Hostname: longevity-cdc-100gb-4h-4-2-db-node-50b1886d-6
Coredump: /var/lib/systemd/coredump/core.scylla.997.ba3a1abdc97f44d7981494d78ee941d8.1466.1603630024000000
Message: Process 1466 (scylla) of user 997 dumped core.
Stack trace of thread 1468:
#0 0x00007f4a7df769e5 raise (libc.so.6)
#1 0x00007f4a7df5f94d abort (libc.so.6)
#2 0x0000000002e5a903 _ZN7seastar17on_internal_errorERNS_6loggerESt17basic_string_viewIcSt11char_traitsIcEE (scylla)
#3 0x000000000135202a _ZN8sstables2mc6writer21consume_end_of_streamEv (scylla)
#4 0x00000000013db98d _ZN8sstables10compaction18finish_new_sstableEPNS_17compaction_writerE (scylla)
#5 0x000000000140b3a2 _ZZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIJEEET_ENUl20flat_mutation_readerE_clES7_ENUlvE_clEv (scylla)
#6 0x000000000140b884 _ZSt13__invoke_implIvZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIJEEET_ENUl20flat_mutation_readerE_clES8_EUlvE_JEES7_St14__invoke_otherOT0_DpOT1_ (scylla)
#7 0x0000000003156e6d _ZNK7seastar20noncopyable_functionIFvvEEclEv (scylla)
Stack trace of thread 1483:
#0 0x00007f4a7e9c29ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f4a7e9b8432 start_thread (libpthread.so.0)
#5 0x00007f4a7e03b913 __clone (libc.so.6)
Stack trace of thread 1482:
#0 0x00007f4a7e9c29ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f4a7e9b8432 start_thread (libpthread.so.0)
#5 0x00007f4a7e03b913 __clone (libc.so.6)
Stack trace of thread 1469:
#0 0x0000000000e7689f _ZNKSt10_HashtableImSt4pairIKmN7seastar13lw_shared_ptrIN8sstables7sstableEEEESaIS7_ENSt8__detail10_Select1stESt8equal_toImESt4hashImENS9_18_Mod_range_hashingENS9_20_Default_ranged_hashENS9_20_Prime_rehash_policyENS9_17_Hashtable_traitsILb0ELb0ELb1EEEE15_M_bucket_beginEm (scylla)
#1 0x00000000014731a3 _ZN18compaction_manager14get_candidatesERK5table (scylla)
#2 0x000000000147ad33 _ZZZZN18compaction_manager6submitEP5tableENUlvE_clEvENUlvE_clEvENUlvE_clEv (scylla)
#3 0x000000000147c88d __invoke_impl<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> >, compaction_manager::submit(column_family*)::<lambda()> mutable::<lambda()> mutable::<lambda()>&> (scylla)
#4 0x0000000002e8fab8 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
#5 0x0000000002e8fe2f _ZN7seastar7reactor14run_some_tasksEv (scylla)
#6 0x0000000002ec6f0e _ZN7seastar7reactor14run_some_tasksEv (scylla)
#7 0x0000000002ed690b _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
#8 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#9 0x00007f4a7e9b8432 start_thread (libpthread.so.0)
#10 0x00007f4a7e03b913 __clone (libc.so.6)
Stack trace of thread 1485:
#0 0x00007f4a7e9c29ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f4a7e9b8432 start_thread (libpthread.so.0)
#5 0x00007f4a7e03b913 __clone (libc.so.6)
Stack trace of thread 1487:
#0 0x00007f4a7e9c29ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f4a7e9b8432 start_thread (libpthread.so.0)
#5 0x00007f4a7e03b913 __clone (libc.so.6)
Stack trace of thread 1484:
#0 0x00007f4a7e9c29ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f4a7e9b8432 start_thread (libpthread.so.0)
#5 0x00007f4a7e03b913 __clone (libc.so.6)
Stack trace of thread 1488:
#0 0x00007f4a7e9c29ac read (libpthread.so.0)
#1 0x00
download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.997.ba3a1abdc97f44d7981494d78ee941d8.1466.1603630024000000/core.scylla.997.ba3a1abdc97f44d7981494d78ee941d8.1466.1603630024000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.997.ba3a1abdc97f44d7981494d78ee941d8.1466.1603630024000000.gz
The same happened during the next nemesis, when Scylla was stopped on another node, node5. Unfortunately, the log on that node was truncated and only the coredump is available:
2020-10-25 13:46:22.000: (CoreDumpEvent Severity.ERROR): node=Node longevity-cdc-100gb-4h-4-2-db-node-50b1886d-5 [13.49.78.18 | 10.0.3.26] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.997.0efb7077a88d40f7b0f5037a0c39e1b3.49481.1603633582000000/core.scylla.997.0efb7077a88d40f7b0f5037a0c39e1b3.49481.1603633582000000.gz
backtrace= PID: 49481 (scylla)
UID: 997 (scylla)
GID: 1001 (scylla)
Signal: 6 (ABRT)
Timestamp: Sun 2020-10-25 13:46:22 UTC (2min 35s ago)
Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --experimental-features cdc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 1-7,9-15 --lock-memory=1
Executable: /opt/scylladb/libexec/scylla
Control Group: /
Boot ID: 0efb7077a88d40f7b0f5037a0c39e1b3
Machine ID: 93f219319dd5bdb42d9f1c8f2e23d329
Hostname: longevity-cdc-100gb-4h-4-2-db-node-50b1886d-5
Coredump: /var/lib/systemd/coredump/core.scylla.997.0efb7077a88d40f7b0f5037a0c39e1b3.49481.1603633582000000
Message: Process 49481 (scylla) of user 997 dumped core.
Stack trace of thread 49490:
#0 0x00007f88cf3059e5 raise (libc.so.6)
#1 0x00007f88cf2ee94d abort (libc.so.6)
#2 0x0000000002e5a903 _ZN7seastar17on_internal_errorERNS_6loggerESt17basic_string_viewIcSt11char_traitsIcEE (scylla)
#3 0x000000000135202a _ZN8sstables2mc6writer21consume_end_of_streamEv (scylla)
#4 0x00000000013db98d _ZN8sstables10compaction18finish_new_sstableEPNS_17compaction_writerE (scylla)
#5 0x000000000140b3a2 _ZZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIJEEET_ENUl20flat_mutation_readerE_clES7_ENUlvE_clEv (scylla)
#6 0x000000000140b884 _ZSt13__invoke_implIvZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIJEEET_ENUl20flat_mutation_readerE_clES8_EUlvE_JEES7_St14__invoke_otherOT0_DpOT1_ (scylla)
#7 0x0000000003156e6d _ZNK7seastar20noncopyable_functionIFvvEEclEv (scylla)
Stack trace of thread 49504:
#0 0x00007f88cfd519ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f88cfd47432 start_thread (libpthread.so.0)
#5 0x00007f88cf3ca913 __clone (libc.so.6)
Stack trace of thread 49496:
#0 0x00007f88cfd519ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f88cfd47432 start_thread (libpthread.so.0)
#5 0x00007f88cf3ca913 __clone (libc.so.6)
Stack trace of thread 49497:
#0 0x00007f88cfd519ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f88cfd47432 start_thread (libpthread.so.0)
#5 0x00007f88cf3ca913 __clone (libc.so.6)
Stack trace of thread 49498:
#0 0x00007f88cfd519ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f88cfd47432 start_thread (libpthread.so.0)
#5 0x00007f88cf3ca913 __clone (libc.so.6)
Stack trace of thread 49500:
#0 0x00007f88cfd519ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f88cfd47432 start_thread (libpthread.so.0)
#5 0x00007f88cf3ca913 __clone (libc.so.6)
Stack trace of thread 49482:
#0 0x0000000000e76996 _ZNKSt8__detail18_Mod_range_hashingclEmm (scylla)
#1 0x00000000014731a3 _ZN18compaction_manager14get_candidatesERK5table (scylla)
#2 0x000000000147ad33 _ZZZZN18compaction_manager6submitEP5tableENUlvE_clEvENUlvE_clEvENUlvE_clEv (scylla)
#3 0x000000000147c88d __invoke_impl<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> >, compaction_manager::submit(column_family*)::<lambda()> mutable::<lambda()> mutable::<lambda()>&> (scylla)
#4 0x0000000002e8fab8 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
#5 0x0000000002e8fe2f _ZN7seastar7reactor14run_some_tasksEv (scylla)
#6 0x0000000002ec6f0e _ZN7seastar7reactor14run_some_tasksEv (scylla)
#7 0x0000000002ed690b _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
#8 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#9 0x00007f88cfd47432 start_thread (libpthread.so.0)
#10 0x00007f88cf3ca913 __clone (libc.so.6)
Stack trace of thread 49495:
#0 0x00007f88cfd519ac read (libpthread.so.0)
#1 0x000000000311e7a7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311ea08 operator() (scylla)
#3 0x0000000002e5ab8e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f88cfd47432 start_thre
download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.997.0efb7077a88d40f7b0f5037a0c39e1b3.49481.1603633582000000/core.scylla.997.0efb7077a88d40f7b0f5037a0c39e1b3.49481.1603633582000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.997.0efb7077a88d40f7b0f5037a0c39e1b3.49481.1603633582000000.gz
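To check that the node5 and node6 coredumps really show the same crash, the symbol names of the aborting thread can be compared while ignoring the addresses, which differ between processes for shared libraries. The following is a rough sketch only: `first_thread_symbols` and `same_crash_signature` are hypothetical helper names, and the code assumes that the first `Stack trace of thread` section in each report belongs to the thread that raised SIGABRT, as it does in both reports above.

```python
import re

# "#3 0x000000000135202a <symbol> (<module>)"; the symbol may contain
# spaces (e.g. demangled lambdas), so it is matched lazily up to the
# trailing "(module)" field.
FRAME_RE = re.compile(r"^#\d+\s+0x[0-9a-fA-F]+\s+(.+?)\s+\((\S+)\)\s*$")


def first_thread_symbols(report_lines):
    """Return the symbol names of the first thread listed in a report."""
    symbols = []
    in_first_thread = False
    for raw in report_lines:
        line = raw.strip()
        if line.startswith("Stack trace of thread"):
            if in_first_thread:
                break  # the second thread starts here
            in_first_thread = True
            continue
        if in_first_thread:
            match = FRAME_RE.match(line)
            if match:
                symbols.append(match.group(1))
    return symbols


def same_crash_signature(report_a, report_b):
    """True if both reports have a non-empty, identical first-thread stack."""
    a = first_thread_symbols(report_a)
    b = first_thread_symbols(report_b)
    return bool(a) and a == b
```

Comparing symbols rather than addresses is a deliberate choice here: the `scylla` frames have identical addresses in both dumps, but the `libc.so.6` frames do not.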
Node6 log:
system.log.tar.gz
Node5 log:
node5.log.zip
The same nemeses, StopStartScylla and ValidateHintedHandoffShortDowntime (during which Scylla is stopped and started again within several minutes using systemctl stop/start commands), ran successfully in other jobs with the same Scylla version and did not trigger the coredump or the error.
A similar issue happened during the MultipleHardRebootNode nemesis in longevity-cdc-100gb-4h-test:
Scylla version: 666.development-0.20201025.c518c1de1 with build-id 9212462fe70187b59afec0322cc758291cf9058a
ami: ami-0998c746523893a6c
While the node was stopping, the error "Mutation stream ends with unclosed partition during write" was logged:
2020-10-26T02:06:00+00:00 longevity-cdc-100gb-4h-master-db-node-df4efebd-3 !INFO | scylla: [shard 0] gossip - Announcing shutdown
2020-10-26T02:06:00+00:00 longevity-cdc-100gb-4h-master-db-node-df4efebd-3 !INFO | scylla: [shard 0] storage_service - Node 10.0.2.144 state jump to normal
2020-10-26T02:06:07+00:00 longevity-cdc-100gb-4h-master-db-node-df4efebd-3 !INFO | scylla: [shard 0] compaction_manager - Stopped
2020-10-26T02:06:07+00:00 longevity-cdc-100gb-4h-master-db-node-df4efebd-3 !INFO | scylla: [shard 5] compaction_manager - Stopped
2020-10-26T02:06:07+00:00 longevity-cdc-100gb-4h-master-db-node-df4efebd-3 !INFO | scylla: [shard 8] compaction_manager - Stopped
2020-10-26T02:06:07+00:00 longevity-cdc-100gb-4h-master-db-node-df4efebd-3 !ERR | scylla: [shard 3] sstable - Mutation stream ends with unclosed partition during write, at: 0x2fb6e0d#012 0x2fb7060#012 0x2fb74c9#012 0x2b226ff#012 0x128089f#012 0x13141ec#012 0x1344121#012 0x13448f3#012 0x2df644c#012 --------#012 N7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_5asyncIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEENS_6futureIvEET_ENUl20flat_mutation_readerE_clESC_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayISB_E4typeEDpNSH_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSB_DpOSK_EUlvE0_ZNSA_14then_impl_nrvoISX_SA_EET0_SU_EUlOS3_RSX_ONS_12future_stateINS1_9monostateEEEE_vEE#012 --------#012 N7seastar12continuationINS_8internal22promise_base_with_typeIvEENS_6futureIvE12finally_bodyIZNS_5asyncIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEES5_T_ENUl20flat_mutation_readerE_clESD_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayISC_E4typeEDpNSI_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSC_DpOSL_EUlvE1_Lb0EEEZNS5_17then_wrapped_nrvoIS5_SZ_EENSG_ISC_E4typeEOT0_EUlOS3_RSZ_ONS_12future_stateINS1_9monostateEEEE_vEE
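The `at:` part of the error message above carries its own backtrace, with line breaks that appear to have been escaped by syslog as `#012` (octal 012 is the newline character). As a small sketch under that assumption, the message can be split back into the error text and the list of addresses that precede the first `--------` separator. The helper name `split_escaped_backtrace` is made up for illustration.

```python
import re

ADDR_RE = re.compile(r"^0x[0-9a-fA-F]+$")


def split_escaped_backtrace(message):
    """Split '<text>, at: <addr>#012 <addr>#012 --------#012 ...'.

    Returns (text, addresses). Only the addresses before the first
    '--------' separator are returned; the mangled continuation names
    that follow it are ignored.
    """
    text, _, tail = message.partition(", at: ")
    addresses = []
    for field in tail.split("#012"):
        field = field.strip()
        if field.startswith("--------"):
            break
        if ADDR_RE.match(field):
            addresses.append(field)
    return text, addresses
```

The returned addresses can then be passed to `addr2line` together with the matching debug binary, in the same way as the frames of the abort backtrace.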
The shard then aborted:
2020-10-26T02:06:07+00:00 longevity-cdc-100gb-4h-master-db-node-df4efebd-3 !INFO | scylla: Aborting on shard 3.
Backtrace:
0x0000000002b5a422
0x0000000002b5aab0
0x0000000002b5ad55
0x0000000002b5ada0
0x00007f0dd14fca8f
/opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4
/opt/scylladb/libreloc/libc.so.6+0x0000000000025894
0x0000000002b2271c
0x000000000128089f
0x00000000013141ec
0x0000000001344121
0x00000000013448f3
0x0000000002df644c
Decoded:
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at reactor.cc:?
seastar::print_with_backtrace(seastar::backtrace_buffer&) at reactor.cc:?
(inlined by) print_with_backtrace at ./build/release/seastar/./seastar/src/core/reactor.cc:752
seastar::print_with_backtrace(char const*) at reactor.cc:?
void seastar::install_oneshot_signal_handler<6, &seastar::sigabrt_action>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) at reactor.cc:?
(inlined by) operator() at ./build/release/seastar/./seastar/src/core/reactor.cc:3466
(inlined by) _FUN at ./build/release/seastar/./seastar/src/core/reactor.cc:3462
?? ??:0
?? ??:0
?? ??:0
seastar::on_internal_error(seastar::logger&, std::basic_string_view<char, std::char_traits<char> >) at ./build/release/seastar/./seastar/src/core/on_internal_error.cc:39
sstables::mc::writer::consume_end_of_stream() at writer.cc:?
sstables::regular_compaction::stop_sstable_writer(sstables::compaction_writer*) at compaction.cc:?
(inlined by) sstables::regular_compaction::stop_sstable_writer(sstables::compaction_writer*) at ./sstables/compaction.cc:918
sstables::compacting_sstable_writer::consume_end_of_stream() at ./sstables/compaction.cc:787
(inlined by) sstables::compacting_sstable_writer::consume_end_of_stream() at ./sstables/compaction.cc:785
(inlined by) auto compact_mutation_state<(emit_only_live_rows)0, (compact_for_sstables)1>::consume_end_of_stream<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer>(sstables::compacting_sstable_writer&, noop_compacted_fragments_consumer&) at ././mutation_compactor.hh:410
(inlined by) compact_mutation<(emit_only_live_rows)0, (compact_for_sstables)1, sstables::compacting_sstable_writer, noop_compacted_fragments_consumer>::consume_end_of_stream() at ././mutation_compactor.hh:521
(inlined by) stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >::consume_end_of_stream() at ././mutation_reader.hh:352
(inlined by) auto flat_mutation_reader::impl::consume_in_thread<stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >, flat_mutation_reader::filter>(stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >, flat_mutation_reader::filter, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ././flat_mutation_reader.hh:274
(inlined by) auto flat_mutation_reader::consume_in_thread<stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >, flat_mutation_reader::filter>(stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >, flat_mutation_reader::filter, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ././flat_mutation_reader.hh:383
(inlined by) seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}::operator()() at ./sstables/compaction.cc:612
void std::__invoke_impl<void, seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(std::__invoke_other, seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&) at /usr/include/c++/10/bits/invoke.h:60
(inlined by) std::__invoke_result<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>::type std::__invoke<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, (seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&)...) at /usr/include/c++/10/bits/invoke.h:95
(inlined by) decltype(auto) std::__apply_impl<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}, std::tuple<>>(seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, std::tuple<>&&, std::integer_sequence<unsigned long>) at /usr/include/c++/10/tuple:1723
(inlined by) decltype(auto) std::apply<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}, std::tuple<> >(seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, std::tuple<>&&) at /usr/include/c++/10/tuple:1734
(inlined by) seastar::future<void> seastar::futurize<void>::apply<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, std::tuple<>&&) at ././seastar/include/seastar/core/future.hh:2099
(inlined by) seastar::futurize<std::result_of<std::decay<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>::type ()>::type>::type seastar::async<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(seastar::thread_attributes, seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, (std::decay<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>&&)...)::{lambda()#1}::operator()() const at ././seastar/include/seastar/core/thread.hh:258
(inlined by) seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::futurize<std::result_of<std::decay<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>::type ()>::type>::type seastar::async<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(seastar::thread_attributes, seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, (std::decay<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>&&)...)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) at ././seastar/include/seastar/util/noncopyable_function.hh:116
seastar::noncopyable_function<void ()>::operator()() const at /usr/include/c++/10/bits/basic_string.h:323
(inlined by) seastar::thread_context::main() at ./build/release/seastar/./seastar/src/core/thread.cc:297
Coredump:
corefile_url=https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.0b911784713f4858874cf097403e5d56.68853.1603677967000000/core.scylla.996.0b911784713f4858874cf097403e5d56.68853.1603677967000000.gz
backtrace= PID: 68853 (scylla)
UID: 996 (scylla)
GID: 1001 (scylla)
Signal: 6 (ABRT)
Timestamp: Mon 2020-10-26 02:06:07 UTC (10min ago)
Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --experimental-features cdc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 1-7,9-15 --lock-memory=1
Executable: /opt/scylladb/libexec/scylla
Control Group: /
Boot ID: 0b911784713f4858874cf097403e5d56
Machine ID: ec239da046efe33c7665d4508a7d0a61
Hostname: longevity-cdc-100gb-4h-master-db-node-df4efebd-3
Coredump: /var/lib/systemd/coredump/core.scylla.996.0b911784713f4858874cf097403e5d56.68853.1603677967000000
Message: Process 68853 (scylla) of user 996 dumped core.
Stack trace of thread 68856:
#0 0x00007f0dd07a89e5 raise (libc.so.6)
#1 0x00007f0dd079194d abort (libc.so.6)
#2 0x0000000002b2271d _ZN7seastar17on_internal_errorERNS_6loggerESt17basic_string_viewIcSt11char_traitsIcEE (scylla)
#3 0x00000000012808a0 _ZN8sstables2mc6writer21consume_end_of_streamEv (scylla)
#4 0x00000000013141ed _ZN8sstables10compaction18finish_new_sstableEPNS_17compaction_writerE (scylla)
#5 0x0000000001344122 _ZZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIvEET_ENUl20flat_mutation_readerE_clES7_ENUlvE_clEv (scylla)
#6 0x00000000013448f4 _ZSt13__invoke_implIvZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIvEET_ENUl20flat_mutation_readerE_clES8_EUlvE_JEES7_St14__invoke_otherOT0_DpOT1_ (scylla)
#7 0x0000000002df644d _ZNK7seastar20noncopyable_functionIFvvEEclEv (scylla)
Stack trace of thread 68874:
#0 0x00007f0dd14fb9ac read (libpthread.so.0)
#1 0x0000000002dc8ee7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dc9158 operator() (scylla)
#3 0x0000000002b2278e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f0dd14f1432 start_thread (libpthread.so.0)
#5 0x00007f0dd086d913 __clone (libc.so.6)
Stack trace of thread 68871:
#0 0x00007f0dd14fb9ac read (libpthread.so.0)
#1 0x0000000002dc8ee7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dc9158 operator() (scylla)
#3 0x0000000002b2278e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f0dd14f1432 start_thread (libpthread.so.0)
#5 0x00007f0dd086d913 __clone (libc.so.6)
Stack trace of thread 68867:
#0 0x00007f0dd14fb9ac read (libpthread.so.0)
#1 0x0000000002dc8ee7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dc9158 operator() (scylla)
#3 0x0000000002b2278e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f0dd14f1432 start_thread (libpthread.so.0)
#5 0x00007f0dd086d913 __clone (libc.so.6)
Stack trace of thread 68868:
#0 0x00007f0dd14fb9ac read (libpthread.so.0)
#1 0x0000000002dc8ee7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dc9158 operator() (scylla)
#3 0x0000000002b2278e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f0dd14f1432 start_thread (libpthread.so.0)
#5 0x00007f0dd086d913 __clone (libc.so.6)
Stack trace of thread 68875:
#0 0x00007f0dd14fb9ac read (libpthread.so.0)
#1 0x0000000002dc8ee7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dc9158 operator() (scylla)
#3 0x0000000002b2278e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f0dd14f1432 start_thread (libpthread.so.0)
#5 0x00007f0dd086d913 __clone (libc.so.6)
Stack trace of thread 68876:
#0 0x00007f0dd14fb9ac read (libpthread.so.0)
#1 0x0000000002dc8ee7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dc9158 operator() (scylla)
#3 0x0000000002b2278e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f0dd14f1432 start_thread (libpthread.so.0)
#5 0x00007f0dd086d913 __clone (libc.so.6)
Stack trace of thread 68879:
#0 0x00007f0dd14fb9ac read (libpthread.so.0)
#1 0x0000000002dc8ee7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dc9158 operator() (scylla)
#3 0x0000000002b2278e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f0dd14f1432 start_thread (libpthread.so.0)
#5 0x00007f0dd086d913 __clone (libc.so.6)
Stack trace of thread 68870:
#0 0x00007f0dd14fb9ac read (libpthread.so.0)
#1 0x0000000002dc8ee7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dc9158 operator() (scylla)
#3 0x0000000002b2278e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f0dd14f1432 start_thread (libpthread.so.0)
#5 0x00007f0dd086d913 __clone (libc.so.6)
Stack trace of thread 68878:
#0 0x00007f0dd14fb9ac read (libpthread.so.0)
#1 0x0000000002dc8ee7 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dc9158 operator() (scyll
download_instructions:
gsutil cp gs://upload.scylladb.com/core.scylla.996.0b911784713f4858874cf097403e5d56.68853.1603677967000000/core.scylla.996.0b911784713f4858874cf097403e5d56.68853.1603677967000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.0b911784713f4858874cf097403e5d56.68853.1603677967000000.gz
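Most threads in the coredump listing above share the same idle stack (`read` inside `seastar::thread_pool::work`), which makes the aborting thread easy to miss. A minimal Python sketch, assuming the listing has been saved as plain text in the `Stack trace of thread N:` / `#i 0xADDR symbol (module)` layout shown above, groups thread ids by identical stacks. The function name `group_threads_by_stack` is hypothetical and not part of any Scylla tooling.

```python
import re
from collections import defaultdict

# Patterns match the coredumpctl-style layout seen in this report.
THREAD_RE = re.compile(r"^Stack trace of thread (\d+):")
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+\s+(\S+)\s+\((\S+)\)")


def group_threads_by_stack(text):
    """Map each distinct stack (tuple of symbol names) to the thread ids sharing it.

    Frames that were truncated in the listing (no closing parenthesis)
    do not match FRAME_RE and are skipped.
    """
    groups = defaultdict(list)
    tid, frames = None, []

    def flush():
        # Record the thread collected so far, if any.
        if tid is not None:
            groups[tuple(frames)].append(tid)

    for line in text.splitlines():
        line = line.strip()
        m = THREAD_RE.match(line)
        if m:
            flush()
            tid, frames = int(m.group(1)), []
            continue
        m = FRAME_RE.match(line)
        if m and tid is not None:
            frames.append(m.group(2))
    flush()
    return dict(groups)
```

Sorting the result by the number of threads per stack puts the single-thread stacks first, for example the one ending in `on_internal_error`.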
Jenkins job
Get cluster logs: https://cloudius-jenkins-test.s3.amazonaws.com/df4efebd-7be3-4a96-ad7e-1a7dda6b23c6/20201026_054321/db-cluster-df4efebd.zip
Test id: df4efebd-7be3-4a96-ad7e-1a7dda6b23c6
Scylla version : 4.2.0-0.20201026.94597e38e with build-id c34028ef570d1a9de6914d12c14c415e1cc4b90f
The issue reproduced in job longevity-cdc-100gb-4h-test during the ENOSPC nemesis. The nemesis allocates disk space until the disk is 90% full, which triggers "No space left on device" errors:
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !ERR | scylla: [shard 0] sstable - writer failed to close file: storage_io_error (Storage I/O error: 28: No space left on device)
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !ERR | scylla: [shard 0] table - failed to write sstable /var/lib/scylla/data/cdc_test/test_table_postimage_scylla_cdc_log-95d429b01d2f11eb95e1000000000003/mc-9590-big-Data.db: storage_io_error (Storage I/O error: 28: No space left on device)
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !WARNING | scylla: [shard 2] commitlog - Exception in segment reservation: storage_io_error (Storage I/O error: 28: No space left on device)
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !WARNING | scylla: [shard 0] commitlog - Exception in segment reservation: storage_io_error (Storage I/O error: 28: No space left on device)
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !WARNING | scylla: [shard 6] commitlog - Exception in segment reservation: storage_io_error (Storage I/O error: 28: No space left on device)
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !ERR | scylla: [shard 0] sstable - writer failed to close file: storage_io_error (Storage I/O error: 28: No space left on device)
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !ERR | scylla: [shard 0] table - failed to write sstable /var/lib/scylla/data/cdc_test/test_table_scylla_cdc_log-59b2f1a21d2f11ebbb27000000000008/mc-6496-big-Data.db: storage_io_error (Storage I/O error: 28: No space left on device)
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !ERR | scylla: [shard 0] sstable - writer failed to close file: storage_io_error (Storage I/O error: 28: No space left on device)
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !ERR | scylla: [shard 0] table - failed to write sstable /var/lib/scylla/data/cdc_test/test_table_postimage-95d402a01d2f11eb95e1000000000003/mc-4928-big-Data.db: storage_io_error (Storage I/O error: 28: No space left on device)
2020-11-02T18:44:46+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !ERR | scylla: [shard 0] sstable - writer failed to close file: storage_io_error (Storage I/O error: 28: No space left on device)
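For anyone trying to approximate the disk-pressure condition by hand, the arithmetic behind "allocate up to 90%" can be sketched as follows. This is only an illustration of the calculation, not the actual nemesis code (which is not shown in this report). The function names are hypothetical, and the data path would be whatever filesystem holds the Scylla data directory.

```python
import shutil


def bytes_to_reach(total, used, target_percent=90):
    """Return how many more bytes must be allocated so that `used`
    reaches `target_percent` of `total` (0 if already at or above it)."""
    target_used = total * target_percent // 100
    return max(0, target_used - used)


def bytes_to_fill_path(path, target_percent=90):
    """Same calculation, using the live usage of the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    return bytes_to_reach(usage.total, usage.used, target_percent)
```

For example, `bytes_to_fill_path("/var/lib/scylla")` would report how much filler data is needed on that filesystem. Ongoing writes and compactions then consume the remaining headroom, so the exact moment ENOSPC appears will vary.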
After that, Scylla was restarted:
2020-11-02T18:45:01+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: [shard 0] compaction - Compacted 2 sstables to [/var/lib/scylla/data/cdc_test/test_table_postimage-95d402a01d2f11eb95e1000000000003/mc-5124-big-Data.db:level=0, ]. 7MB to 4MB (~51% of original) in 1050ms = 3MB/s. ~59136 total partitions merged to 32136.
2020-11-02T18:45:01+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !NOTICE | sudo: centos : TTY=unknown ; PWD=/home/centos ; USER=root ; COMMAND=/bin/systemctl restart scylla-server.service
2020-11-02T18:45:01+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | sudo: pam_unix(sudo:session): session opened for user root by (uid=0)
2020-11-02T18:45:01+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | systemd: Stopped Run Scylla Housekeeping daily mode.
2020-11-02T18:45:01+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | systemd: Stopping Run Scylla Housekeeping daily mode.
2020-11-02T18:45:01+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | systemd: Stopped Run Scylla Housekeeping restart mode.
2020-11-02T18:45:01+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | systemd: Stopping Run Scylla Housekeeping restart mode.
2020-11-02T18:45:01+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | systemd: Stopping Scylla JMX...
While Scylla was stopping, a coredump was triggered:
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: [shard 2] compaction_manager - Asked to stop
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: Aborting on shard 1.
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: Backtrace:
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x0000000002eee112
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x0000000002e92690
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x0000000002e92935
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x0000000002e92980
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x00007f11cfddba8f
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: /opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000025894
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x0000000002e5a882
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x0000000001351fa9
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x00000000013db90c
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x000000000140b321
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x000000000140b803
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: 0x0000000003156dec
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: [shard 9] view - Stopping view builder
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: [shard 10] view - Stopping view builder
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: [shard 5] view - Stopping view builder
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !ERR | scylla: [shard 1] sstable - Mutation stream ends with unclosed partition during write, at: 0x334959d#012 0x33498b0#012 0x3349d39#012 0x2e5a85c#012 0x1351fa9#012 0x13db90c#012 0x140b321#012 0x140b803#012 0x3156dec#012 --------#012 N7seastar12continuationINS_8internal22promise_base_with_typeIJEEEZNS_5asyncIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEENS_6futureIJEEET_ENUl20flat_mutation_readerE_clESC_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayISB_E4typeEDpNSH_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSB_DpOSK_EUlvE0_ZZNSA_14then_impl_nrvoISX_SA_EET0_SU_ENKUlvE_clEvEUlRS3_RSX_ONS_12future_stateIJEEEE_JEEE#012 --------#012 N7seastar12continuationINS_8internal22promise_base_with_typeIJEEENS_6futureIJEE12finally_bodyIZNS_5asyncIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEES5_T_ENUl20flat_mutation_readerE_clESD_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayISC_E4typeEDpNSI_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSC_DpOSL_EUlvE1_Lb0EEEZZNS5_17then_wrapped_nrvoIS5_SZ_EENSG_ISC_E4typeEOT0_ENKUlvE_clEvEUlRS3_RSZ_ONS_12future_stateIJEEEE_JEEE
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: [shard 0] init - Shutting down view builder was successful
2020-11-02T18:45:02+00:00 longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 !INFO | scylla: [shard 0] init - Shutting down local storage
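The `!ERR` line above packs a multi-line backtrace into one syslog record, with `#012` standing in for each newline (octal 012 is the newline character; this escaping is most likely applied by the syslog daemon). A small Python sketch, with hypothetical helper names, restores the line breaks and pulls out the addresses that precede the first `--------` separator:

```python
import re


def decode_syslog_escapes(message):
    """Turn the '#012' escape sequences back into real newlines."""
    return message.replace("#012", "\n")


def leading_addresses(message):
    """Return the hex addresses listed after ' at: ' and before the
    first '--------' separator of an escaped backtrace message."""
    decoded = decode_syslog_escapes(message)
    head = decoded.split("--------", 1)[0]
    # Skip the timestamp/host prefix when the ' at: ' marker is present.
    _, sep, tail = head.partition(" at: ")
    if sep:
        head = tail
    return re.findall(r"\b0x[0-9a-fA-F]+\b", head)
```

The returned addresses can then be passed to a symbolizer together with the matching debug binary.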
Coredump:
2020-11-02 18:45:02.000: (CoreDumpEvent Severity.ERROR): node=Node longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4 [13.48.13.250 | 10.0.1.0] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.997.43255116f8c1461aa78706d486552d2c.5248.1604342702000000/core.scylla.997.43255116f8c1461aa78706d486552d2c.5248.1604342702000000.gz
backtrace= PID: 5248 (scylla)
UID: 997 (scylla)
GID: 1001 (scylla)
Signal: 6 (ABRT)
Timestamp: Mon 2020-11-02 18:45:02 UTC (4min 8s ago)
Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --experimental-features cdc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 1-7,9-15 --lock-memory=1
Executable: /opt/scylladb/libexec/scylla
Control Group: /
Boot ID: 43255116f8c1461aa78706d486552d2c
Machine ID: 93f219319dd5bdb42d9f1c8f2e23d329
Hostname: longevity-cdc-100gb-4h-4-2-db-node-28d508d2-4
Coredump: /var/lib/systemd/coredump/core.scylla.997.43255116f8c1461aa78706d486552d2c.5248.1604342702000000
Message: Process 5248 (scylla) of user 997 dumped core.
Stack trace of thread 5262:
#0 0x00007f11cf38e9e5 raise (libc.so.6)
#1 0x00007f11cf37794d abort (libc.so.6)
#2 0x0000000002e5a883 _ZN7seastar17on_internal_errorERNS_6loggerESt17basic_string_viewIcSt11char_traitsIcEE (scylla)
#3 0x0000000001351faa _ZN8sstables2mc6writer21consume_end_of_streamEv (scylla)
#4 0x00000000013db90d _ZN8sstables10compaction18finish_new_sstableEPNS_17compaction_writerE (scylla)
#5 0x000000000140b322 _ZZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIJEEET_ENUl20flat_mutation_readerE_clES7_ENUlvE_clEv (scylla)
#6 0x000000000140b804 _ZSt13__invoke_implIvZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEEN7seastar6futureIJEEET_ENUl20flat_mutation_readerE_clES8_EUlvE_JEES7_St14__invoke_otherOT0_DpOT1_ (scylla)
#7 0x0000000003156ded _ZNK7seastar20noncopyable_functionIFvvEEclEv (scylla)
Stack trace of thread 5287:
#0 0x00007f11cfdda9ac read (libpthread.so.0)
#1 0x000000000311e727 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311e988 operator() (scylla)
#3 0x0000000002e5ab0e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f11cfdd0432 start_thread (libpthread.so.0)
#5 0x00007f11cf453913 __clone (libc.so.6)
Stack trace of thread 5285:
#0 0x00007f11cfdda9ac read (libpthread.so.0)
#1 0x000000000311e727 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311e988 operator() (scylla)
#3 0x0000000002e5ab0e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f11cfdd0432 start_thread (libpthread.so.0)
#5 0x00007f11cf453913 __clone (libc.so.6)
Stack trace of thread 5289:
#0 0x00007f11cfdda9ac read (libpthread.so.0)
#1 0x000000000311e727 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311e988 operator() (scylla)
#3 0x0000000002e5ab0e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f11cfdd0432 start_thread (libpthread.so.0)
#5 0x00007f11cf453913 __clone (libc.so.6)
Stack trace of thread 5276:
#0 0x00007f11cfdda9ac read (libpthread.so.0)
#1 0x000000000311e727 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311e988 operator() (scylla)
#3 0x0000000002e5ab0e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f11cfdd0432 start_thread (libpthread.so.0)
#5 0x00007f11cf453913 __clone (libc.so.6)
Stack trace of thread 5279:
#0 0x00007f11cfdda9ac read (libpthread.so.0)
#1 0x000000000311e727 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311e988 operator() (scylla)
#3 0x0000000002e5ab0e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f11cfdd0432 start_thread (libpthread.so.0)
#5 0x00007f11cf453913 __clone (libc.so.6)
Stack trace of thread 5283:
#0 0x00007f11cfdda9ac read (libpthread.so.0)
#1 0x000000000311e727 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311e988 operator() (scylla)
#3 0x0000000002e5ab0e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f11cfdd0432 start_thread (libpthread.so.0)
#5 0x00007f11cf453913 __clone (libc.so.6)
Stack trace of thread 5277:
#0 0x00007f11cfdda9ac read (libpthread.so.0)
#1 0x000000000311e727 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x000000000311e988 operator() (scylla)
#3 0x0000000002e5ab0e _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007f11cfdd0432 start_thread (libpthread.so.0)
#5 0x00007f11cf453913 __clone (libc.so.6)
Stack trace of thread 5273:
#0 0x00007f11cfdda90f __write (libpthread.so.0)
#1 0x0000000002e9962e _ZN7seastar9file_desc5writeEPKvm (scylla)
#2 0x0000000002e996e8 operator() (scylla)
#3 0x0000000002e8fa38 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
#4 0x0000000002e8fdaf _ZN7seastar7reactor14run_some_tasksEv (scylla)
#5 0x0000000002ec6e8e _ZN7seastar7reactor14run_some_tasksEv (scylla)
#6 0x0000000002ed688b _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
#7 0x0000000002e5ab0e _ZNKSt8functionIFvvEEclE
download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.997.43255116f8c1461aa78706d486552d2c.5248.1604342702000000/core.scylla.997.43255116f8c1461aa78706d486552d2c.5248.1604342702000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.997.43255116f8c1461aa78706d486552d2c.5248.1604342702000000.gz
Db log:
system.log.tar.gz
Probably the same as https://github.com/scylladb/scylla/issues/7411.
Seems similar to https://github.com/scylladb/scylla/issues/7050
Supposed to be fixed by: https://github.com/scylladb/scylla/issues/7411
@aleksbykov, we may have a reproducer for this with Gemini in 4.3.
@ShlomiBalalis please point us to the job that reproduces it, so @aleksbykov can run it with the same seed number on master to see whether it's solved.
Installation details
Scylla version (or git commit hash): 4.3.rc1-0.20201110.a8e372bf9 with build-id f868ccb821d87eb55b943ad771349ac58df19ad6
Cluster size: 3
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0d14a169a13ba0868 (eu-west-1)
Scenarios: gemini-3h-cdc-write
The error occurred when gemini tried to execute an insert query:
< t:2020-11-12 14:40:51,775 f:gemini_thread.py l:121 c:sdcm.gemini_thread p:ERROR > {"L":"INFO","T":"2020-11-12T14:40:50.485Z","M":"failed to apply mutation","attempts":5,"error":"[cluster = test, query = 'INSERT INTO ks1.table1 (pk0,pk1,pk2,ck0,ck1,col0,col1,col2,col3,col4,col5,col6,col7) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?) ']: Operation timed out for ks1.table1 - received only 1 responses from 2 CL=QUORUM.","errorVerbose":"Operation timed out for ks1.table1 - received only 1 responses from 2 CL=QUORUM.\n[cluster = test, query = 'INSERT INTO ks1.table1 (pk0,pk1,pk2,ck0,ck1,col0,col1,col2,col3,col4,col5,col6,col7) VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?) ']\ngithub.com/scylladb/gemini/store.(*cqlStore).doMutate\n\t/home/penberg/go/src/github.com/scylladb/gemini/store/cqlstore.go:98\ngithub.com/scylladb/gemini/store.(*cqlStore).mutate\n\t/home/penberg/go/src/github.com/scylladb/gemini/store/cqlstore.go:50\ngithub.com/scylladb/gemini/store.mutate\n\t/home/penberg/go/src/github.com/scylladb/gemini/store/store.go:165\ngithub.com/scylladb/gemini/store.delegatingStore.Mutate\n\t/home/penberg/go/src/github.com/scylladb/gemini/store/store.go:161\nmain.mutation\n\t/home/penberg/go/src/github.com/scylladb/gemini/cmd/gemini/jobs.go:235\nmain.MutationJob\n\t/home/penberg/go/src/github.com/scylladb/gemini/cmd/gemini/jobs.go:54\nmain.job.func1\n\t/home/penberg/go/src/github.com/scylladb/gemini/cmd/gemini/jobs.go:155\ngolang.org/x/sync/errgroup.(*Group).Go.func1\n\t/home/penberg/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:57\nruntime.goexit\n\t/usr/lib/golang/src/runtime/asm_amd64.s:1373"}
< t:2020-11-12 14:40:51,775 f:gemini_thread.py l:121 c:sdcm.gemini_thread p:ERROR > {"L":"WARN","T":"2020-11-12T14:40:50.485Z","N":"sample_results","M":"Errors detected. Exiting."}
< t:2020-11-12 14:40:51,775 f:gemini_thread.py l:121 c:sdcm.gemini_thread p:ERROR > {"L":"INFO","T":"2020-11-12T14:40:50.485Z","N":"pump","M":"Test run stopped. Exiting."}
< t:2020-11-12 14:40:51,775 f:gemini_thread.py l:121 c:sdcm.gemini_thread p:ERROR > {"L":"INFO","T":"2020-11-12T14:40:51.741Z","M":"result channel closed"}
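The gemini lines above are JSON records wrapped in a test-framework prefix. The wrapper reports every line as `p:ERROR`, while the record's own `"L"` field carries the real level (`INFO`, `WARN`). A minimal Python sketch for pulling the record out, assuming the JSON object always starts at the first `{` on the line (the function name is hypothetical):

```python
import json


def parse_gemini_line(line):
    """Extract the JSON record from a wrapped gemini log line.

    Returns the decoded dict, or None if the line carries no valid JSON.
    """
    start = line.find("{")
    if start == -1:
        return None
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return None
```

Filtering on `record.get("error")` then isolates lines such as the QUORUM timeout from routine shutdown messages.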
From node#3:
2020-11-12T14:40:34+00:00 gemini-cdc-write-4-3-db-node-651b1690-3 !ERR | scylla: [shard 1] sstable - Mutation stream ends with unclosed partition during write, at: 0x2fb7e0d#012 0x2fb8060#012 0x2fb84c9#012 0x2b236ff#012 0x1280f4f#012 0x131490c#012 0x1344841#012 0x1345013#012 0x2df744c#012 --------#012 N7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_5asyncIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEENS_6futureIvEET_ENUl20flat_mutation_readerE_clESC_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayISB_E4typeEDpNSH_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSB_DpOSK_EUlvE0_ZNSA_14then_impl_nrvoISX_SA_EET0_SU_EUlOS3_RSX_ONS_12future_stateINS1_9monostateEEEE_vEE#012 --------#012 N7seastar12continuationINS_8internal22promise_base_with_typeIvEENS_6futureIvE12finally_bodyIZNS_5asyncIZZN8sstables10compaction5setupI33noop_compacted_fragments_consumerEES5_T_ENUl20flat_mutation_readerE_clESD_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayISC_E4typeEDpNSI_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSC_DpOSL_EUlvE1_Lb0EEEZNS5_17then_wrapped_nrvoIS5_SZ_EENSG_ISC_E4typeEOT0_EUlOS3_RSZ_ONS_12future_stateINS1_9monostateEEEE_vEE#012 --------#012 seastar::parallel_for_each_state#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<mutation_writer::feed_writer<mutation_writer::timestamp_based_splitting_mutation_writer>(flat_mutation_reader&&, mutation_writer::timestamp_based_splitting_mutation_writer&&)::{lambda(flat_mutation_reader&, mutation_writer::timestamp_based_splitting_mutation_writer&)#1}::operator()(flat_mutation_reader&, mutation_writer::timestamp_based_splitting_mutation_writer&) const::{lambda()#2}, true>::operator()(seastar::future<void>&&)::{lambda(auto:1&&)#1}, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, 
seastar::future<void>::finally_body<mutation_writer::feed_writer<mutation_writer::timestamp_based_splitting_mutation_writer>(flat_mutation_reader&&, mutation_writer::timestamp_based_splitting_mutation_writer&&)::{lambda(flat_mutation_reader&, mutation_writer::timestamp_based_splitting_mutation_writer&)#1}::operator()(flat_mutation_reader&, mutation_writer::timestamp_based_splitting_mutation_writer&) const::{lambda()#2}, true> >(seastar::future<void>::finally_body<mutation_writer::feed_writer<mutation_writer::timestamp_based_splitting_mutation_writer>(flat_mutation_reader&&, mutation_writer::timestamp_based_splitting_mutation_writer&&)::{lambda(flat_mutation_reader&, mutation_writer::timestamp_based_splitting_mutation_writer&)#1}::operator()(flat_mutation_reader&, mutation_writer::timestamp_based_splitting_mutation_writer&) const::{lambda()#2}, true>&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, seastar::future<void>::finally_body<mutation_writer::feed_writer<mutation_writer::timestamp_based_splitting_mutation_writer>(flat_mutation_reader&&, auto:1&&)::{lambda(flat_mutation_reader&, mutation_writer::timestamp_based_splitting_mutation_writer&)#1}::operator()(flat_mutation_reader&, mutation_writer::timestamp_based_splitting_mutation_writer&) const::{lambda()#2}, true>&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>#012 --------#012 seastar::internal::do_with_state<std::tuple<flat_mutation_reader, mutation_writer::timestamp_based_splitting_mutation_writer>, seastar::future<void> >#012 --------#012 seastar::(anonymous namespace)::thread_wake_task#012 --------#012 
N7seastar12continuationINS_8internal22promise_base_with_typeIN8sstables15compaction_infoEEEZNS_5asyncIZNS3_10compaction3runI33noop_compacted_fragments_consumerEENS_6futureIS4_EESt10unique_ptrIS7_St14default_deleteIS7_EET_EUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayISG_E4typeEDpNSK_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSG_DpOSN_EUlvE0_ZNSA_IvE14then_impl_nrvoIS10_SB_EET0_SX_EUlOS5_RS10_ONS_12future_stateINS1_9monostateEEEE_vEE#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<sstables::compaction_info>, seastar::future<sstables::compaction_info>::finally_body<seastar::async<sstables::compaction::run<noop_compacted_fragments_consumer>(std::unique_ptr<sstables::compaction, std::default_delete<sstables::compaction> >, noop_compacted_fragments_consumer)::{lambda()#1}>(seastar::thread_attributes, sstables::compaction::run<noop_compacted_fragments_consumer>(std::unique_ptr<sstables::compaction, std::default_delete<sstables::compaction> >, noop_compacted_fragments_consumer)::{lambda()#1}&&, (std::decay<sstables::compaction::run<noop_compacted_fragments_consumer>(std::unique_ptr<sstables::compaction, std::default_delete<sstables::compaction> >, noop_compacted_fragments_consumer)::{lambda()#1}>::type&&)...)::{lambda()#3}, false>, seastar::future<sstables::compaction_info>::then_wrapped_nrvo<seastar::future<sstables::compaction_info>, {lambda()#3}>({lambda()#3}&&)::{lambda(seastar::internal::promise_base_with_type<sstables::compaction_info>&&, {lambda()#3}&, seastar::future_state<sstables::compaction_info>&&)#1}, sstables::compaction_info>#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<void>, table::compact_sstables(sstables::compaction_descriptor)::{lambda(auto:1)#3}, seastar::future<sstables::compaction_info>::then_impl_nrvo<{lambda(auto:1)#3}, table::compact_sstables(sstables::compaction_descriptor)::{lambda(auto:1)#3}<void> 
>({lambda(auto:1)#3}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(auto:1)#3}&, seastar::future_state<seastar::future>&&)#1}, seastar::future>#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, compaction_manager::submit(table*)::{lambda()#1}::operator()()::{lambda()#1}::operator()()::{lambda()#1}::operator()()::{lambda(seastar::future<void>)#2}, {lambda()#1}::then_wrapped_nrvo<{lambda()#1}<seastar::bool_class<seastar::stop_iteration_tag> >, {lambda()#1}>({lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, {lambda()#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> >::finally_body<seastar::with_lock<seastar::rwlock_for_read<std::chrono::_V2::steady_clock>, compaction_manager::submit(table*)::{lambda()#1}::operator()()::{lambda()#1}>(seastar::rwlock_for_read<std::chrono::_V2::steady_clock>&, compaction_manager::submit(table*)::{lambda()#1}::operator()()::{lambda()#1}&&)::{lambda()#1}::operator()()::{lambda()#1}, false>, seastar::future<seastar::bool_class<seastar::stop_iteration_tag> >::then_wrapped_nrvo<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> >, {lambda()#1}>(seastar::rwlock_for_read<std::chrono::_V2::steady_clock>&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, {lambda()#1}&, seastar::future_state<seastar::bool_class<seastar::stop_iteration_tag> >&&)#1}, seastar::bool_class<seastar::stop_iteration_tag> >#012 --------#012 seastar::internal::repeater<compaction_manager::submit(table*)::{lambda()#1}>#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<void>, 
seastar::future<void>::finally_body<compaction_manager::submit(table*)::{lambda()#2}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, compaction_manager::submit(table*)::{lambda()#2}>(compaction_manager::submit(table*)::{lambda()#2}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, compaction_manager::submit(table*)::{lambda()#2}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::shared_future<>::shared_state::get_future(std::chrono::time_point<seastar::lowres_clock,
Aborting on shard 1.
Backtrace:
0x0000000002b5b422
0x0000000002b5bab0
0x0000000002b5bd55
0x0000000002b5bda0
0x00007fd68b4d3a8f
/opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4
/opt/scylladb/libreloc/libc.so.6+0x0000000000025894
0x0000000002b2371c
0x0000000001280f4f
0x000000000131490c
0x0000000001344841
0x0000000001345013
0x0000000002df744c
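A raw backtrace like the one above mixes absolute addresses in the scylla binary with library-relative entries (`libc.so.6+0x...`). A rough Python sketch for separating the two and building a binutils `addr2line` command line follows. The helper names are hypothetical, and the executable passed in is assumed to be a scylla binary with debug information matching the build-id of the crashed node.

```python
def split_backtrace(lines):
    """Split backtrace entries into absolute addresses and
    (library, offset) pairs for 'lib+0xOFFSET' entries."""
    absolute, relative = [], []
    for raw in lines:
        entry = raw.strip()
        if not entry:
            continue
        if "+0x" in entry:
            lib, _, offset = entry.rpartition("+")
            relative.append((lib, offset))
        else:
            absolute.append(entry)
    return absolute, relative


def addr2line_command(executable, addresses):
    """Build an addr2line invocation: -C demangles, -f prints function
    names, -i expands inlined frames, -p prints one frame per line."""
    return ["addr2line", "-C", "-f", "-i", "-p", "-e", executable, *addresses]
```

The resulting list can be handed to `subprocess.run`. The library-relative entries need the corresponding shared object passed as the `-e` argument instead.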
Translated:
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at reactor.cc:?
seastar::print_with_backtrace(seastar::backtrace_buffer&) at reactor.cc:?
(inlined by) print_with_backtrace at ./build/release/seastar/./seastar/src/core/reactor.cc:752
seastar::print_with_backtrace(char const*) at reactor.cc:?
void seastar::install_oneshot_signal_handler<6, &seastar::sigabrt_action>()::{lambda(int, siginfo_t*, void*)#1}::_FUN(int, siginfo_t*, void*) at reactor.cc:?
(inlined by) operator() at ./build/release/seastar/./seastar/src/core/reactor.cc:3466
(inlined by) _FUN at ./build/release/seastar/./seastar/src/core/reactor.cc:3462
?? ??:0
?? ??:0
?? ??:0
seastar::on_internal_error(seastar::logger&, std::basic_string_view<char, std::char_traits<char> >) at ./build/release/seastar/./seastar/src/core/on_internal_error.cc:39
sstables::mc::writer::consume_end_of_stream() at writer.cc:?
sstables::regular_compaction::stop_sstable_writer(sstables::compaction_writer*) at compaction.cc:?
(inlined by) sstables::regular_compaction::stop_sstable_writer(sstables::compaction_writer*) at ./sstables/compaction.cc:918
sstables::compacting_sstable_writer::consume_end_of_stream() at ./sstables/compaction.cc:787
(inlined by) sstables::compacting_sstable_writer::consume_end_of_stream() at ./sstables/compaction.cc:785
(inlined by) auto compact_mutation_state<(emit_only_live_rows)0, (compact_for_sstables)1>::consume_end_of_stream<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer>(sstables::compacting_sstable_writer&, noop_compacted_fragments_consumer&) at ././mutation_compactor.hh:410
(inlined by) compact_mutation<(emit_only_live_rows)0, (compact_for_sstables)1, sstables::compacting_sstable_writer, noop_compacted_fragments_consumer>::consume_end_of_stream() at ././mutation_compactor.hh:521
(inlined by) stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >::consume_end_of_stream() at ././mutation_reader.hh:352
(inlined by) auto flat_mutation_reader::impl::consume_in_thread<stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >, flat_mutation_reader::filter>(stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >, flat_mutation_reader::filter, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ././flat_mutation_reader.hh:274
(inlined by) auto flat_mutation_reader::consume_in_thread<stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >, flat_mutation_reader::filter>(stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >, flat_mutation_reader::filter, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ././flat_mutation_reader.hh:383
(inlined by) seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}::operator()() at ./sstables/compaction.cc:612
void std::__invoke_impl<void, seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(std::__invoke_other, seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&) at /usr/include/c++/10/bits/invoke.h:60
(inlined by) std::__invoke_result<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>::type std::__invoke<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, (seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&)...) at /usr/include/c++/10/bits/invoke.h:95
(inlined by) decltype(auto) std::__apply_impl<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}, std::tuple<>>(seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, std::tuple<>&&, std::integer_sequence<unsigned long>) at /usr/include/c++/10/tuple:1723
(inlined by) decltype(auto) std::apply<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}, std::tuple<> >(seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, std::tuple<>&&) at /usr/include/c++/10/tuple:1734
(inlined by) seastar::future<void> seastar::futurize<void>::apply<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, std::tuple<>&&) at ././seastar/include/seastar/core/future.hh:2099
(inlined by) seastar::futurize<std::result_of<std::decay<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>::type ()>::type>::type seastar::async<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(seastar::thread_attributes, seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, (std::decay<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>&&)...)::{lambda()#1}::operator()() const at ././seastar/include/seastar/core/thread.hh:258
(inlined by) seastar::noncopyable_function<void ()>::direct_vtable_for<seastar::futurize<std::result_of<std::decay<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>::type ()>::type>::type seastar::async<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>(seastar::thread_attributes, seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}&&, (std::decay<seastar::future<void> sstables::compaction::setup<noop_compacted_fragments_consumer>(noop_compacted_fragments_consumer)::{lambda(flat_mutation_reader)#1}::operator()(flat_mutation_reader)::{lambda()#1}>&&)...)::{lambda()#1}>::call(seastar::noncopyable_function<void ()> const*) at ././seastar/include/seastar/util/noncopyable_function.hh:116
seastar::noncopyable_function<void ()>::operator()() const at /usr/include/c++/10/bits/basic_string.h:323
(inlined by) seastar::thread_context::main() at ./build/release/seastar/./seastar/src/core/thread.cc:297
Logs:
| Log links for testrun with test id 651b1690-ef16-4d43-8e82-b8071b6ad8ec |
+-----------------+-------------+------------------------------------------------------------------------------------------------------------------------------+
| Date | Log type | Link |
+-----------------+-------------+------------------------------------------------------------------------------------------------------------------------------+
| 20201112_144133 | db-cluster | https://cloudius-jenkins-test.s3.amazonaws.com/651b1690-ef16-4d43-8e82-b8071b6ad8ec/20201112_144133/db-cluster-651b1690.zip |
| 20201112_144133 | loader-set | https://cloudius-jenkins-test.s3.amazonaws.com/651b1690-ef16-4d43-8e82-b8071b6ad8ec/20201112_144133/loader-set-651b1690.zip |
| 20201112_144133 | monitor-set | https://cloudius-jenkins-test.s3.amazonaws.com/651b1690-ef16-4d43-8e82-b8071b6ad8ec/20201112_144133/monitor-set-651b1690.zip |
| 20201112_144133 | sct-runner | https://cloudius-jenkins-test.s3.amazonaws.com/651b1690-ef16-4d43-8e82-b8071b6ad8ec/20201112_144133/sct-runner-651b1690.zip |
+-----------------+-------------+------------------------------------------------------------------------------------------------------------------------------+
That seems to be a problem related to the TWCS interposer consumer.
Branch 4-3.rc1 (HEAD a8e372bf9) is missing https://github.com/scylladb/scylla/commit/f5323b29d9bbe79afee3bc556a8a0b741ad6fa4b, which, from my analysis, will fix the problem just reported.
Backported.
Installation details
Scylla version (or git commit hash): 4.3.rc1-0.20201110.a8e372bf9 with build-id f868ccb821d87eb55b943ad771349ac58df19ad6
Cluster size: 3
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0d14a169a13ba0868 (eu-west-1)
Scenarios: gemini-3h-cdc-preimage-write-test
An insert query failed because one of the nodes shut down due to a similar error:
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] storage_service - Shutting down communications due to I/O errors until operator intervention
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] storage_service - Commitlog error: std::system_error (error system:28, No space left on device)
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] storage_service - Stop transport: starts
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] storage_service - Shutting down native transport
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 1] commitlog - Exception in segment reservation: storage_io_error (Storage I/O error: 28: No space left on device)
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] commitlog - Exception in segment reservation: storage_io_error (Storage I/O error: 28: No space left on device)
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 1] cql_server - exception while processing connection: seastar::nested_exception: std::system_error (error system:32,
sendmsg: Broken pipe) (while cleaning up after std::system_error (error system:32, sendmsg: Broken pipe))
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] cql_server - exception while processing connection: seastar::nested_exception: std::system_error (error system:32,
sendmsg: Broken pipe) (while cleaning up after std::system_error (error system:32, sendmsg: Broken pipe))
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] cql_server_controller - CQL server stopped
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] storage_service - Shutting down native transport was successful
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] storage_service - Shutting down rpc server
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] thrift_controller - Thrift server stopped
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] storage_service - Shutting down rpc server was successful
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] storage_service - Stop transport: shutdown rpc and cql server done
Nov 12 15:13:40 gemini-cdc-preimage-write-4-3-db-node-d39a015f-1 scylla[9783]: [shard 0] gossip - My status = NORMAL
However, this time there was no backtrace to explain this error, so I'm not sure it's related.
This does not look similar. There is no crash.
4.3 includes b2271800a553a9426cf12b8018263735cd6a16f1, so if this still happens on 4.3 we need updated logs.
@aleksbykov / @roydahan ping
Job: /longevity-cdc-100gb-4h-test
TestID: a0d0136e-65fa-4d27-b051-f7cdebc570a0
JobLink: https://jenkins.scylladb.com/job/scylla-4.3/job/longevity/job/longevity-cdc-100gb-4h-test/16/
Scylla version: 4.3.rc2-0.20201124.bc922a743
All db logs link: https://cloudius-jenkins-test.s3.amazonaws.com/a0d0136e-65fa-4d27-b051-f7cdebc570a0/20201125_013322/db-cluster-a0d0136e.zip
Monitoring data link: https://cloudius-jenkins-test.s3.amazonaws.com/a0d0136e-65fa-4d27-b051-f7cdebc570a0/20201125_013322/monitor-set-a0d0136e.zip
Node with error Node6 log: system.log.tar.gz
The issue was reproduced twice on the same node during the RebuildStreamingErr nemesis. During this nemesis, Scylla is stopped, several sstables are removed, then Scylla is started and the nodetool rebuild command is executed. While the rebuild is running, the db node instance is soft-rebooted. After the db instance is back up, nodetool rebuild is executed again.
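For readers who want the nemesis order at a glance, here is a minimal Python sketch that only lists the steps described above as a dry-run plan. It is not the actual SCT nemesis code. The `Step` class, the function name, the `sstable_glob` parameter, the `rm` command and the `sudo reboot` command are assumptions made for illustration; only the `systemctl stop scylla-server.service` command and `nodetool rebuild` come from this report.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    description: str
    command: str  # shell command the step would run on the target node


def rebuild_streaming_err_steps(sstable_glob: str) -> List[Step]:
    """Return the nemesis steps in order; nothing is executed here."""
    return [
        Step("stop Scylla", "sudo systemctl stop scylla-server.service"),
        # Hypothetical: the report does not say which sstables are removed or how.
        Step("remove several sstables", f"sudo rm -f {sstable_glob}"),
        Step("start Scylla", "sudo systemctl start scylla-server.service"),
        Step("start rebuild", "nodetool rebuild"),
        # Assumption: the exact soft-reboot mechanism is not given in the report.
        Step("soft-reboot the instance while rebuild runs", "sudo reboot"),
        Step("rebuild again once the instance is up", "nodetool rebuild"),
    ]


if __name__ == "__main__":
    # The glob below is a placeholder, not a path taken from the logs.
    for number, step in enumerate(rebuild_streaming_err_steps("<sstable files>"), 1):
        print(f"{number}. {step.description}: {step.command}")
```

The crash reported below happened at step 5, while Scylla was shutting down during the reboot with streaming from step 4 still in flight.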
The first time, the nemesis started at:
2020-11-24T23:40:27+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !NOTICE | sudo: scyllaadm : TTY=unknown ; PWD=/home/scyllaadm ; USER=root ; COMMAND=/bin/systemctl stop scylla-server.service
Scylla was stopped successfully.
After that, several sstables were removed and Scylla was started:
2020-11-24T23:42:46+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] thrift_controller - Thrift server listening on 10.0.2.115:9160 ...
2020-11-24T23:42:46+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] init - serving
2020-11-24T23:42:46+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] init - Scylla version 4.3.rc2-0.20201124.bc922a743 initialization completed.
and rebuild was started:
2020-11-24T23:43:29+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] storage_service - rebuild from dc: (any dc)
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] range_streamer - Rebuild starts, nr_ranges_remaining=4412
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] range_streamer - Rebuild with [10.0.2.1, 10.0.3.192, 10.0.0.244, 10.0.1.81, 10.0.3.105] for keyspace=system_distributed started, nodes_to_stream=5
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] range_streamer - Rebuild with 10.0.2.1 for keyspace=system_distributed, streaming [0, 15) out of 158 ranges
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] stream_session - [Stream #db7ea770-2eae-11eb-9d2e-000000000000] Executing streaming plan for Rebuild-system_distributed-index-0 with peers={10.0.2.1}, master
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] range_streamer - Rebuild with 10.0.3.192 for keyspace=system_distributed, streaming [0, 15) out of 155 ranges
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] stream_session - [Stream #db7ece80-2eae-11eb-9d2e-000000000000] Executing streaming plan for Rebuild-system_distributed-index-0 with peers={10.0.3.192}, master
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] range_streamer - Rebuild with 10.0.0.244 for keyspace=system_distributed, streaming [0, 15) out of 151 ranges
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] stream_session - [Stream #db7ece81-2eae-11eb-9d2e-000000000000] Executing streaming plan for Rebuild-system_distributed-index-0 with peers={10.0.0.244}, master
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] range_streamer - Rebuild with 10.0.1.81 for keyspace=system_distributed, streaming [0, 15) out of 152 ranges
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] stream_session - [Stream #db7ece82-2eae-11eb-9d2e-000000000000] Executing streaming plan for Rebuild-system_distributed-index-0 with peers={10.0.1.81}, master
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] range_streamer - Rebuild with 10.0.3.105 for keyspace=system_distributed, streaming [0, 17) out of 171 ranges
2020-11-24T23:43:30+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] stream_session - [Stream #db7ece83-2eae-11eb-9d2e-000000000000] Executing streaming plan for Rebuild-system_distributed-index-0 with peers={10.0.3.105}, master
During the rebuild, the instance was rebooted:
2020-11-24T23:45:04+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | systemd: Received SIGINT.
2020-11-24T23:45:04+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | systemd: Stopped Hardware RNG Entropy Gatherer Wake threshold service.
2020-11-24T23:45:04+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | systemd: Stopping Hardware RNG Entropy Gatherer Wake threshold service...
2020-11-24T23:45:04+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | systemd: Stopping Session 2 of user scyllaadm.
While Scylla was stopping, an error and a coredump were detected:
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !ERR | scylla: [shard 8] compaction_manager - compaction failed: seastar::nested_exception: sstables::compaction_stop_exception (Compaction for cdc_test/test_table_preimage_postimage_scylla_cdc_log was stopped due to: shutdown) (while cleaning up after seastar::broken_promise (broken promise)): stopping
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 8] compaction_manager - Stopped
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 4] compaction - [Compact cdc_test.test_table_preimage_scylla_cdc_log 0aeb2830-2eaf-11eb-bdfa-000000000002] Compacted 7 sstables to [/var/lib/scylla/data/cdc_test/test_table_preimage_scylla_cdc_log-a4280a122e9e11ebbdb9000000000006/md-30384-big-Data.db:level=0, ]. 1MB to 1MB (~68% of original) in 14800ms = 75kB/s. ~896 total partitions merged to 202.
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !ERR | scylla: [shard 11] compaction_manager - compaction failed: seastar::nested_exception: sstables::compaction_stop_exception (Compaction for cdc_test/test_table_preimage_postimage_scylla_cdc_log was stopped due to: shutdown) (while cleaning up after seastar::broken_promise (broken promise)): stopping
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: [shard 0] rpc - client 10.0.2.1:61446: unexpected eof on a stream while reading data: expected 1063 got 47
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !ERR | scylla: [shard 0] flat_mutation_reader - [validator 0x60000159ff10 for sstable writer /var/lib/scylla/data/cdc_test/test_table_postimage_scylla_cdc_log-bf0c31d02e9e11ebae08000000000007/md-29792-big-Data.db (cdc_test.test_table_postimage_scylla_cdc_log bf0c31d0-2e9e-11eb-ae08-000000000007)] Stream ended with unclosed partition: clustering row, at: 0x2fbb27d#012 0x2fbb4d0#012 0x2fbb939#012 0x2b26b6f#012 0x10fd233#012 0x11225f2#012 0x11d57b4#012 0x11d6143#012 0x2dfa8bc#012 --------#012 N7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_5asyncIZN8sstables7sstable16write_componentsE20flat_mutation_readermNS_13lw_shared_ptrIK6schemaEERKNS5_21sstable_writer_configE14encoding_statsRKNS_17io_priority_classEEUlvE_JEEENS_8futurizeINSt9result_ofIFNSt5decayIT_E4typeEDpNSM_IT0_E4typeEEE4typeEE4typeENS_17thread_attributesEOSN_DpOSQ_EUlvE0_ZNS_6futureIvE14then_impl_nrvoIS13_S15_EET0_S10_EUlOS3_RS13_ONS_12future_stateINS1_9monostateEEEE_vEE#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<seastar::async<sstables::sstable::write_components(flat_mutation_reader, unsigned long, seastar::lw_shared_ptr<schema const>, sstables::sstable_writer_config const&, encoding_stats, seastar::io_priority_class const&)::{lambda()#1}>(seastar::thread_attributes, std::decay&&, (std::decay<sstables::sstable::write_components(flat_mutation_reader, unsigned long, seastar::lw_shared_ptr<schema const>, sstables::sstable_writer_config const&, encoding_stats, seastar::io_priority_class const&)::{lambda()#1}>::type&&)...)::{lambda()#3}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, {lambda()#3}>({lambda()#3}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda()#3}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>#012 --------#012 
seastar::continuation<seastar::internal::promise_base_with_type<void>, seastar::future<void>::finally_body<sstables::sstable::write_components(flat_mutation_reader, unsigned long, seastar::lw_shared_ptr<schema const>, sstables::sstable_writer_config const&, encoding_stats, seastar::io_priority_class const&)::{lambda()#2}, false>, seastar::future<void>::then_wrapped_nrvo<seastar::future<void>, sstables::sstable::write_components(flat_mutation_reader, unsigned long, seastar::lw_shared_ptr<schema const>, sstables::sstable_writer_config const&, encoding_stats, seastar::io_priority_class const&)::{lambda()#2}>(sstables::sstable::write_components(flat_mutation_reader, unsigned long, seastar::lw_shared_ptr<schema const>, sstables::sstable_writer_config const&, encoding_stats, seastar::io_priority_class const&)::{lambda()#2}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, sstables::sstable::write_components(flat_mutation_reader, unsigned long, seastar::lw_shared_ptr<schema const>, sstables::sstable_writer_config const&, encoding_stats, seastar::io_priority_class const&)::{lambda()#2}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<void>, streaming::stream_session::init_messaging_service_handler(netw::messaging_service&)::{lambda(seastar::rpc::client_info const&, utils::UUID, utils::UUID, utils::UUID, unsigned long, seastar::rpc::optional<streaming::stream_reason>, seastar::rpc::source<frozen_mutation_fragment, seastar::rpc::optional<streaming::stream_mutation_fragments_cmd> >)#3}::operator()(seastar::rpc::client_info const&, utils::UUID, utils::UUID, utils::UUID, unsigned long, seastar::rpc::optional<streaming::stream_reason>, seastar::rpc::source<frozen_mutation_fragment, seastar::rpc::optional<streaming::stream_mutation_fragments_cmd> >) const::{lambda(seastar::lw_shared_ptr<schema const>)#1}::operator()(schema 
const)::{lambda(flat_mutation_reader)#2}::operator()({lambda(seastar::lw_shared_ptr<schema const>)#1}) const::{lambda(bool)#1}::operator()(bool)::{lambda({lambda(seastar::lw_shared_ptr<schema const>)#1})#1}::operator()({lambda(seastar::lw_shared_ptr<schema const>)#1}) const::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda(flat_mutation_reader)#2}, {lambda(bool)#1}>({lambda(flat_mutation_reader)#2}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(flat_mutation_reader)#2}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>#012 --------#012 seastar::continuation<seastar::internal::promise_base_with_type<void>, streaming::stream_session::init_messaging_service_handler(netw::messaging_service&)::{lambda(seastar::rpc::client_info const&, utils::UUID, utils::UUID, utils::UUID, unsigned long, seastar::rpc::optional<streaming::stream_reason>, seastar::rpc::source<frozen_mutation_fragment, seastar::rpc::optional<streaming::stream_mutation_fragments_cmd> >)#3}::operator()(seastar::rpc::client_info const&, utils::UUID, utils::UUID, utils::UUID, unsigned long, seastar::rpc::optional<streaming::stream_reason>, seastar::rpc::source<frozen_mutation_fragment, seastar::rpc::optional<streaming::stream_mutation_fragments_cmd> >) const::{lambda(seastar::lw_shared_ptr<schema const>)#1}::operator()(schema const)::{lambda(flat_mutation_reader)#2}::operator()({lambda(seastar::lw_shared_ptr<schema const>)#1}) const::{lambda(bool)#1}::operator()(bool)::{lambda({lambda(seastar::lw_shared_ptr<schema const>)#1})#1}::operator()({lambda(seastar::lw_shared_ptr<schema const>)#1}) const::{lambda()#2}, seastar::future<void>::then_impl_nrvo<{lambda(flat_mutation_reader)#2}, {lambda(bool)#1}>({lambda(flat_mutation_reader)#2}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(flat_mutation_reader)#2}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>#012 --------#012 
seastar::continuation<seastar::internal::promise_base_with_type<void>, streaming::stream_session::init_messaging_service_handler(netw::messaging_service&)::{lambda(seastar::rpc::client_info const&, utils::UUID, utils::UUID, utils::UUID, unsigned long, seastar::rpc::optional<streaming::stream_reason>, seastar::rpc::source<frozen_mutation_fragment, seastar::rpc::optional<streaming::stream_mutation_fragments_cmd> >)#3}::operator()(seastar::rpc::client_info const&, utils::UUID, utils::UUID, utils::UUID, unsigned long, seastar::rpc::optional<streaming::stream_reason>, seastar::rpc::source<frozen_mutation_fragment, seastar::rpc::optional<streaming::stream_mutation_fragments_cmd> >) const::{lambda(seastar::lw_shared_ptr<schema const>)#1}::operator()(schema const)::{lambda(flat_mutation_reader)#2}::operator()({lambda(seastar::lw_shared_ptr<schema const>)#1}) const::{lambda(bool)#1}::operator()(bool)::{lambda({lambda(seastar::lw_shared_ptr<schema const>)#1})#1}::operator()({lambda(seastar::lw_shared_ptr<schema const>)#1}) const::{lambda()#3}, seastar::future<void>::then_impl_nrvo<{lambda(flat_mutation_reader)#2}, {lambda(bool)#1}>({lambda(flat_mutation_reader)#2}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(flat_mutation_reader)#2}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: Aborting on shard 0.
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: Backtrace:
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x0000000002b5e892
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x0000000002b5ef20
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x0000000002b5f1c5
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x0000000002b5f210
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x00007fe6f3a1da8f
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: /opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000025894
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x0000000002b26b8c
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x00000000010fd233
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x00000000011225f2
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x00000000011d57b4
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x00000000011d6143
2020-11-24T23:45:05+00:00 longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 !INFO | scylla: 0x0000000002dfa8bc
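The backtrace above is buried under syslog prefixes, and the error line before it uses `#012` where the original message had newlines (this looks like syslog's octal escaping of control characters). A small Python sketch, assuming only the line shapes visible in this report, can undo both before the addresses are handed to a symbolizer. The function names are hypothetical, and the escape decoding is a heuristic: any `#` followed by three octal digits is treated as an escape.

```python
import re

# A frame is either a bare absolute address or "library+0xoffset" (as printed for libc).
FRAME_RE = re.compile(r"(?:\S+\+)?0x[0-9a-fA-F]+")


def decode_syslog_escapes(line: str) -> str:
    """Turn '#ooo' octal escapes (e.g. '#012' for newline) back into characters."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), line)


def extract_frames(log_text: str) -> list:
    """Collect backtrace frames, with or without the '... | scylla: ' prefix."""
    frames = []
    for line in log_text.splitlines():
        # Keep only what follows the last "scylla: " marker, if there is one.
        payload = line.rsplit("scylla: ", 1)[-1].strip()
        if FRAME_RE.fullmatch(payload):
            frames.append(payload)
    return frames


if __name__ == "__main__":
    sample = (
        "2020-11-24T23:45:05+00:00 node-6 !INFO | scylla: Backtrace:\n"
        "2020-11-24T23:45:05+00:00 node-6 !INFO | scylla: 0x0000000002b5e892\n"
        "2020-11-24T23:45:05+00:00 node-6 !INFO | scylla: /opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4\n"
    )
    print("\n".join(extract_frames(sample)))
```

The resulting list can be pasted into whichever address-to-symbol tool matches the build-id of the crashed binary.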
Coredump info:
2020-11-25 00:02:22.551: (CoreDumpEvent Severity.ERROR)node=Node longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 [13.49.74.250 | 10.0.2.115] (seed: False)
corefile_url=https://storage.cloud.google.com/upload.scylladb.com/core.scylla.995.17f5f40656c740d083669e051486d880.82972.1606261505000000/core.scylla.995.17f5f40656c740d083669e051486d880.82972.1606261505000000.gz
backtrace= PID: 82972 (scylla)
UID: 995 (scylla)
GID: 1001 (scylla)
Signal: 6 (ABRT)
Timestamp: Tue 2020-11-24 23:45:05 UTC (4min 2s ago)
Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --experimental-features cdc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 1-7,9-15 --lock-memory=1
Executable: /opt/scylladb/libexec/scylla
Control Group: /scylla.slice/scylla-server.slice/scylla-server.service
Unit: scylla-server.service
Slice: scylla-server.slice
Boot ID: 17f5f40656c740d083669e051486d880
Machine ID: ec27041a88f931b46d484ac723cf13ee
Hostname: longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6
Coredump: /var/lib/systemd/coredump/core.scylla.995.17f5f40656c740d083669e051486d880.82972.1606261505000000
Message: Process 82972 (scylla) of user 995 dumped core.
Stack trace of thread 82972:
#0 0x00007fe6f2cc99e5 raise (libc.so.6)
#1 0x00007fe6f2cb294d abort (libc.so.6)
#2 0x0000000002b26b8d _ZN7seastar17on_internal_errorERNS_6loggerESt17basic_string_viewIcSt11char_traitsIcEE (scylla)
#3 0x00000000010fd234 on_validation_error (scylla)
#4 0x00000000011225f3 _ZN42mutation_fragment_stream_validating_filter16on_end_of_streamEv (scylla)
#5 0x00000000011d57b5 _ZZN8sstables7sstable16write_componentsE20flat_mutation_readermN7seastar13lw_shared_ptrIK6schemaEERKNS_21sstable_writer_configE14encoding_statsRKNS2_17io_priority_classEENUlvE_clEv (scylla)
#6 0x00000000011d6144 __invoke_impl<void, sstables::sstable::write_components(flat_mutation_reader, uint64_t, sstables::schema_ptr, const sstables::sstable_writer_config&, encoding_stats, const seastar::io_priority_class&)::<lambda()> > (scylla)
#7 0x0000000002dfa8bd _ZNK7seastar20noncopyable_functionIFvvEEclEv (scylla)
Stack trace of thread 82988:
#0 0x00007fe6f3a1c9ac read (libpthread.so.0)
#1 0x0000000002dcd357 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2 0x0000000002dcd5c8 operator() (scylla)
#3 0x0000000002b26bfe _ZNKSt8functionIFvvEEclEv (scylla)
#4 0x00007fe6f3a12432 start_thread (libpthread.so.0)
#5 0x00007fe6f2d8e913 __clone (libc.so.6)
Stack trace of thread 82976:
#0 0x0000000000e90d1f _ZNK3imr5utils6objectINS_9structureIJNS_6memberIN4data4cell4tags5flagsENS_5flagsIJNS6_10collectionENS6_4liveENS6_8expiringENS6_14counter_updateENS6_5emptyENS6_13external_dataEEEEEENS3_INS6_4cellENS_7variantISH_JNS3_INS6_11atomic_cellENS2_IJNS3_INS6_9timestampENS_3podIlEEEENS3_ISB_NS_8optionalISB_NS2_IJNS3_INS6_3ttlENSL_IiEEEENS3_INS6_6expiryESM_EEEEEEEEENS3_INS6_5valueENSI_ISX_JNS3_INS6_4deadESM_EENS3_ISC_SM_EENS3_INS6_11fixed_valueENS_6bufferIS11_EEEENS3_INS6_14variable_valueENS2_IJNS3_INS6_10value_sizeENSL_IjEEEENS3_INS6_10value_dataENSI_IS19_JNS3_INS6_7pointerENS_11tagged_typeIS1A_NSL_IPhEEEEEENS3_INS6_4dataENS12_IS1G_EEEEEEEEEEEEEEEEEEEEEEEENS3_IS9_S1L_EEEEEEEEEEE3getEv (scylla)
#1 0x000000000102ac2a _ZN3rowC2ERK6schema11column_kindRKS_ (scylla)
#2 0x0000000000fb98eb _ZZN30partition_snapshot_flat_readerI28partition_snapshot_accounterE20lsa_partition_reader8next_rowERK20nonwrapping_intervalI21clustering_key_prefixERKSt8optionalI21position_in_partitionER22range_tombstone_streamENKUlvE_clEv (scylla)
#3 0x0000000000fbba98 _ZZZN30partition_snapshot_flat_readerI28partition_snapshot_accounterE20lsa_partition_reader16in_alloc_sectionIZNS2_8next_rowERK20nonwrapping_intervalI21clustering_key_prefixERKSt8optionalI21position_in_partitionER22range_tombstone_streamEUlvE_EEDcOT_ENKUlvE_clEvENKUlvE_clEv (scylla)
#4 0x0000000000fad2e4 _ZZN12flush_reader11fill_bufferENSt6chrono10time_pointIN7seastar12lowres_clockENS0_8durationIlSt5ratioILl1ELl1000EEEEEEENKUlvE0_clEv (scylla)
#5 0x0000000000fad766 _ZN7seastar8futurizeINS_6futureIvEEE6invokeIRZN12flush_reader11fill_bufferENSt6chrono10time_pointINS_12lowres_clockENS6_8durationIlSt5ratioILl1ELl1000EEEEEEEUlvE0_JEEES2_OT_DpOT0_ (scylla)
#6 0x0000000001170ba3 _ZN20flat_mutation_reader4impl26consume_pausable_in_threadISt17reference_wrapperINS0_16consumer_adapterIN8sstables14sstable_writerEEEE42mutation_fragment_stream_validating_filterEEvT_T0_NSt6chrono10time_pointIN7seastar12lowres_clockENSB_8durationIlSt5ratioILl1ELl1000EEEEEE (scylla)
#7 0x00000000011d57b5 _ZZN8sstables7sstable16write_componentsE20flat_mutation_readermN7seastar13lw_shared_ptrIK6schemaEERKNS_21sstable_writer_configE14encoding_statsRKNS2_17io_priority_classEENUlvE_clEv (scylla)
#8 0x00000000011d6144 __invoke_impl<void, sstables::sstable::write_components(flat_mutation_reader, uint64_t, sstables::schema_ptr, const sstables::sstable_writer_config&, encoding_stats, const seastar::io_priority_class&)::<lambda()> > (scylla)
#9 0x0000000002dfa8bd _ZNK7seastar20noncopyable_functionIFvvEEclEv (scylla)
Stack trace of thread 82973:
#0 0x0000000002b0e454 _ZN7seastar6memory10small_pool8allocateEv (scylla)
#1 0x000000000134e064 _ZN9__gnu_cxx13new_allocato
download_instructions=gsutil cp gs://upload.scylladb.com/core.scylla.995.17f5f40656c740d083669e051486d880.82972.1606261505000000/core.scylla.995.17f5f40656c740d083669e051486d880.82972.1606261505000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.995.17f5f40656c740d083669e051486d880.82972.1606261505000000.gz
Forty minutes later the same nemesis ran again on the same node, and the same issue repeated.
Coredump:
2020-11-25 01:11:08.251: (CoreDumpEvent Severity.ERROR)node=Node longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6 [13.49.74.250 | 10.0.2.115] (seed: False)
corefile_url=https://storage.cloud.google.com/upload.scylladb.com/core.scylla.995.6d3ae4176cec4209b86b95fcbb9817ac.35636.1606265799000000/core.scylla.995.6d3ae4176cec4209b86b95fcbb9817ac.35636.1606265799000000.gz
backtrace= PID: 35636 (scylla)
UID: 995 (scylla)
GID: 1001 (scylla)
Signal: 6 (ABRT)
Timestamp: Wed 2020-11-25 00:56:39 UTC (4min 7s ago)
Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --experimental-features cdc --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 1-7,9-15 --lock-memory=1
Executable: /opt/scylladb/libexec/scylla
Control Group: /
Boot ID: 6d3ae4176cec4209b86b95fcbb9817ac
Machine ID: ec27041a88f931b46d484ac723cf13ee
Hostname: longevity-cdc-100gb-4h-4-3-db-node-a0d0136e-6
Coredump: /var/lib/systemd/coredump/core.scylla.995.6d3ae4176cec4209b86b95fcbb9817ac.35636.1606265799000000
Message: Process 35636 (scylla) of user 995 dumped core.
Stack trace of thread 35641:
#0 0x00007f93750699e5 raise (libc.so.6)
#1 0x00007f937505294d abort (libc.so.6)
#2 0x0000000002b26b8d _ZN7seastar17on_internal_errorERNS_6loggerESt17basic_string_viewIcSt11char_traitsIcEE (scylla)
#3 0x0000000001281830 _ZN8sstables2mc6writer21consume_end_of_streamEv (scylla)
#4 0x0000000001316e4d _ZN8sstables10compaction18finish_new_sstableEPNS_17compaction_writerE (scylla)
#5 0x000000000134577c _ZN8sstables25compacting_sstable_writer21consume_end_of_streamEv (scylla)
#6 0x0000000002dfa8bd _ZNK7seastar20noncopyable_functionIFvvEEclEv (scylla)
Stack trace of thread 35642:
#0 0x000000000137355b _ZNSt8_Rb_treeImSt4pairIKmSt6vectorIN7seastar13lw_shared_ptrIN8sstables7sstableEEESaIS7_EEESt10_Select1stISA_ESt4lessImESaISA_EE12_M_erase_auxESt23_Rb_tree_const_iteratorISA_E (scylla)
#1 0x0000000001374369 _ZN8sstables31size_tiered_compaction_strategy29estimated_pending_compactionsERKSt6vectorIN7seastar13lw_shared_ptrINS_7sstableEEESaIS5_EEiiNS_39size_tiered_compaction_strategy_optionsE (scylla)
#2 0x00000000013880fd _ZN8sstables31time_window_compaction_strategy36update_estimated_compaction_by_tasksERSt3mapIlSt6vectorIN7seastar13lw_shared_ptrINS_7sstableEEESaIS6_EESt4lessIlESaISt4pairIKlS8_EEEii (scylla)
#3 0x000000000138c28e _ZN8sstables31time_window_compaction_strategy25get_compaction_candidatesER5tableSt6vectorIN7seastar13lw_shared_ptrINS_7sstableEEESaIS7_EE (scylla)
#4 0x000000000138c596 _ZN8sstables31time_window_compaction_strategy29get_next_non_expired_sstablesER5tableSt6vectorIN7seastar13lw_shared_ptrINS_7sstableEEESaIS7_EENSt6chrono10time_pointI8gc_clockNSA_8durationIlSt5ratioILl1ELl1EEEEEE (scylla)
#5 0x000000000138cdfa _ZN8sstables31time_window_compaction_strategy27get_sstables_for_compactionER5tableSt6vectorIN7seastar13lw_shared_ptrINS_7sstableEEESaIS7_EE (scylla)
#6 0x0000000001354d7e _ZN8sstables19compaction_strategy27get_sstables_for_compactionER5tableSt6vectorIN7seastar13lw_shared_ptrINS_7sstableEEESaIS7_EE (scylla)
#7 0x00000000013b0710 _ZZZZN18compaction_manager6submitEP5tableENUlvE_clEvENUlvE_clEvENUlvE_clEv (scylla)
#8 0x00000000013b586d __invoke_impl<seastar::future<seastar::bool_class<seastar::stop_iteration_tag> >, compaction_manager::submit(column_family*)::<lambda()> mutable::<lambda()> mutable::<lambda()>&> (scylla)
#9 0x0000000002b5ccd8 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
#10 0x0000000002b5cf8f _ZN7seastar7reactor14run_some_tasksEv (scylla)
#11 0x0000000002b9c876 _ZN7seastar7reactor14run_some_tasksEv (scylla)
#12 0x0000000002bae02b _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
#13 0x0000000002b26bfe _ZNKSt8functionIFvvEEclEv (scylla)
#14 0x00007f9375db2432 start_thread (libpthread.so.0)
#15 0x00007f937512e913 __clone (libc.so.6)
Stack trace of thread 35645:
#0 0x0000000001371bb5 _ZNKSt8__detail10_Synth3wayclIPSt4pairIN7seastar13lw_shared_ptrIN8sstables7sstableEEEmES9_EEDaRKT_RKT0_ (scylla)
#1 0x0000000001373118 __sort<__gnu_cxx::__normal_iterator<std::pair<seastar::lw_shared_ptr<sstables::sstable>, long unsigned int>*, std::vector<std::pair<seastar::lw_shared_ptr<sstables::sstable>, long unsigned int> > >, __gnu_cxx::__ops::_Iter_comp_iter<sstables::size_tiered_compaction_strategy::get_buckets(const std::vector<seastar::lw_shared_ptr<sstables::sstable> >&, sstables::size_tiered_compaction_strategy_options)::<lambda(auto:180&, auto:181&)> > > (scylla)
#2 0x0000000001374369 _ZN8sstables31size_tiered_compaction_strategy29estimated_pending_compactionsERKSt6vectorIN7seastar13lw_shared_ptrINS_7sstableEEESaIS5_EEiiNS_39size_tiered_compaction_strategy_optionsE (scylla)
#3 0x00000000013880fd _ZN8sstables31time_window_compaction_strategy36update_estimated_compaction_by_tasksERSt3mapIlSt6vectorIN7seastar13lw_shared_ptrINS_7sstableEEESaIS6_EESt4lessIlESaISt4pairIKlS8_EEEii (scylla)
#4 0x000000000138c28e _ZN8sstables31time_window_compaction_strategy25get_compaction_candidatesER5tableSt6vectorIN7seastar13lw_shared_ptrINS_7sstab
download_instructions=gsutil cp gs://upload.scylladb.com/core.scylla.995.6d3ae4176cec4209b86b95fcbb9817ac.35636.1606265799000000/core.scylla.995.6d3ae4176cec4209b86b95fcbb9817ac.35636.1606265799000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.995.6d3ae4176cec4209b86b95fcbb9817ac.35636.1606265799000000.gz
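The core file name appears to repeat the metadata shown in the coredump report (process name, UID, boot ID, PID, timestamp). Below is a minimal Python sketch for decoding it, assuming the layout `core.<comm>.<uid>.<boot-id>.<pid>.<timestamp in microseconds>` and a process name without dots. The helper name `parse_core_name` is hypothetical and not part of any Scylla or systemd tool.

```python
from datetime import datetime, timezone


def parse_core_name(name):
    """Split a coredump file name into its fields.

    Assumed layout: core.<comm>.<uid>.<boot-id>.<pid>.<timestamp-usec>[.gz]
    """
    # Drop a trailing compression suffix, if any.
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    parts = name.split(".")
    if len(parts) != 6 or parts[0] != "core":
        raise ValueError(f"unexpected core file name: {name!r}")
    _, comm, uid, boot_id, pid, usec = parts
    # The last field is assumed to be microseconds since the Unix epoch.
    timestamp = datetime.fromtimestamp(int(usec) // 1_000_000, tz=timezone.utc)
    return {
        "comm": comm,
        "uid": int(uid),
        "boot_id": boot_id,
        "pid": int(pid),
        "timestamp": timestamp,
    }


info = parse_core_name(
    "core.scylla.995.6d3ae4176cec4209b86b95fcbb9817ac.35636.1606265799000000.gz"
)
print(info["pid"], info["boot_id"], info["timestamp"].isoformat())
```

For the second core, the decoded PID, boot ID and timestamp can be compared against the `PID`, `Boot ID` and `Timestamp` lines of the report to confirm that the downloaded file is the right one.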
@denesb please look into this.
4.3.rc2-0.20201124.bc922a743 contains b2271800a553a9426cf12b8018263735cd6a16f1 which is the backport for your fix to #7411
@gleb-cloudius - can you please help here
I failed to open the core even in a docker:
warning: Error reading shared library list entry at 0x500000007
I managed to open the core dump with http://downloads.scylladb.com/relocatable/unstable/branch-4.3/2020-11-24T14:24:49Z/scylla-package.tar.gz in the dbuild docker image.
though backtrace does hit a gdb internal error:
(gdb) bt
#0 0x00007f93750699e5 in raise () from /data/7482/scylla/libreloc/libc.so.6
#1 0x00007f937505294d in abort () from /data/7482/scylla/libreloc/libc.so.6
#2 0x0000000002b26b8d in seastar::on_internal_error (logger=..., msg=...) at ./seastar/src/core/on_internal_error.cc:39
#3 0x0000000001281830 in sstables::mc::writer::consume_end_of_stream (this=<optimized out>) at /usr/include/c++/10/bits/char_traits.h:357
#4 0x0000000001316e4d in sstables::compaction::finish_new_sstable (writer=0x605000768750, this=0x6050094ea800) at sstables/compaction.cc:920
#5 sstables::regular_compaction::stop_sstable_writer (this=0x6050094ea800, writer=0x605000768750) at sstables/compaction.cc:920
#6 0x000000000134577c in sstables::compacting_sstable_writer::consume_end_of_stream (this=0x605000768748) at /usr/include/c++/10/optional:903
#7 sstables::compacting_sstable_writer::consume_end_of_stream (this=0x605000768748) at sstables/compaction.cc:787
#8 compact_mutation_state<(emit_only_live_rows)0, (compact_for_sstables)1>::consume_end_of_stream<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> (gc_consumer=..., consumer=..., this=<optimized out>) at ./mutation_compactor.hh:410
#9 compact_mutation<(emit_only_live_rows)0, (compact_for_sstables)1, sstables::compacting_sstable_writer, noop_compacted_fragments_consumer>::consume_end_of_stream (this=0x605000768740) at ./mutation_compactor.hh:521
#10 stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >::consume_end_of_stream (this=0x60500199fe90) at ./mutation_reader.hh:352
#11 flat_mutation_reader::impl::consume_in_thread<stable_flattened_mutations_consumer<compact_for_compaction<sstables::compacting_sstable_writer, noop_compacted_fragments_consumer> >, flat_mutation_reader::no_filter> (filter=..., timeout=..., consumer=..., this=<optimized out>)
at ./flat_mutation_reader.hh:274
../../gdb/inline-frame.c:159: internal-error: void inline_frame_this_id(frame_info*, void**, frame_id*): Assertion `!frame_id_eq (*this_id, outer_frame_id)' failed.
A problem internal to GDB has been detected,
@gleb-cloudius would you like to use my setup on aws?
http://downloads.scylladb.com/relocatable/unstable/branch-4.3/2020-11-24T14:24:49Z/
timestamp: 2020-11-24T14:24:49Z
[email protected]:scylladb/scylla.git-sha: bc922a743f72170d4770a98308b2ef7709b4bcd7
jenkins-job-path: scylla-4.3/build
jenkins-job-number: 12
url-id: branch-4.3/2020-11-24T14:24:49Z
reloc-pack-url: downloads.scylladb.com/relocatable/unstable/branch-4.3/2020-11-24T14:24:49Z/
scylla-product: scylla
scylla-release: 0.20201124.bc922a743
scylla-version: 4.3.rc2
scylla-BuildID[sha1]: 008ded438dd1adb97d8b4c327c4628ce834086c1
scylla-debug-BuildID[sha1]: 7f4f1fd94abf8ffa08336bd69bfd40657aaa8b14
On Sun, Dec 06, 2020 at 03:30:28AM -0800, Benny Halevy wrote:
I managed to open the core dump with http://downloads.scylladb.com/relocatable/unstable/branch-4.3/2020-11-24T14:24:49Z/scylla-package.tar.gz in the dbuild docker image.
That is what I am doing, except that I am looking at a different core: the
one without compaction in its trace.
though backtrace does hit a gdb internal error:
../../gdb/inline-frame.c:159: internal-error: void inline_frame_this_id(frame_info*, void**, frame_id*): Assertion `!frame_id_eq (*this_id, outer_frame_id)' failed.
A problem internal to GDB has been detected,
@gleb-cloudius would you like to use my setup on aws?
If gdb crashes it is not much use.
--
Gleb.
On Sun, Dec 06, 2020 at 01:41:44PM +0200, Gleb Natapov wrote:
../../gdb/inline-frame.c:159: internal-error: void inline_frame_this_id(frame_info*, void**, frame_id*): Assertion `!frame_id_eq (*this_id, outer_frame_id)' failed.
A problem internal to GDB has been detected,
I was using the wrong build (but made from the same commit hash!). Now I see
the same with my core as well.
--
Gleb.
FWIW, I can navigate through the frames at least up to fr 14 without gdb crashing, as long as I don't issue "bt"
@gleb-cloudius / @bhalevy just pasting what we discussed in the call.
Since streaming was aborted in this scenario, the issue could be related to an sstable created during streaming that was closed when the stream was aborted and, instead of being thrown away, was used later ...
Please check which sstable this is, when it was created, and whether it is an artifact of streaming.
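As a rough aid for the "when was it created" question, a small Python sketch that lists the data components in a table directory ordered by modification time. It assumes sstable data files end in `-Data.db` and that the file modification time approximates when the sstable was sealed. The directory path is supplied by the caller and the helper name is hypothetical.

```python
import os
from datetime import datetime, timezone


def list_data_files_by_mtime(table_dir):
    """Return (mtime, file name) pairs for *-Data.db files, oldest first."""
    entries = []
    for name in os.listdir(table_dir):
        # Assumption: only the Data component is needed to identify an sstable.
        if not name.endswith("-Data.db"):
            continue
        path = os.path.join(table_dir, name)
        mtime = datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)
        entries.append((mtime, name))
    return sorted(entries)
```

Comparing these times with the node's stop and streaming-abort log entries would show whether the suspect sstable was written around the aborted stream.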
On Sun, Dec 06, 2020 at 04:31:07AM -0800, Shlomi Livne wrote:
Since there was streaming aborted in this scenario - the issue could be related to an sstable created during streaming that was closed when the stream was aborted and instead of throwing it away the file was used later ...
There are two different cores.
First is during stream stopping. Looks like the code somewhere thinks
that the stream got a valid EOS instead of ending abruptly. Scylla streaming
has code that is supposed to catch that. Don't know why it does not work.
Second is also during stopping of a node but in a completely different
place. Now this is compaction. Frankly, I have a hard time believing in a
coincidence that a supposedly corrupted sstable was compacted precisely
when the node was stopped. My guess is that this is a similar issue, but
in different code: an exception somewhere is wrongly interpreted as EOS.
--
Gleb.
I managed to open the core dump with http://downloads.scylladb.com/relocatable/unstable/branch-4.3/2020-11-24T14:24:49Z/scylla-package.tar.gz in the dbuild docker image.
though backtrace does hit a gdb internal error:
This is https://github.com/scylladb/scylla/blob/master/docs/debugging.md#gdb-crashes-when-priting-the-backtrace-or-some-variable. As long as you don't bt past the problematic frame you are fine.
(Still happens on 4.4.dev-0.20201203.c7311d10806)
The bug is likely here
We transform an exception into end of stream event.
Don't really know how to fix it. This whole reader pipeline with at least 20 different readers gives me a headache :(
Good catch @gleb-cloudius. We should convert that finally() to then(). We want to allow caller code to handle exceptions and not unconditionally close the stream as if everything was all right.
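To illustrate the difference being discussed, here is a toy Python sketch, not the actual Scylla code. `try/finally` stands in for a Seastar `finally()` continuation, which runs whether or not the preceding step failed, and plain sequential code stands in for `then()`, which runs only on success. All class and function names are hypothetical.

```python
class StreamAborted(Exception):
    """Hypothetical error raised when the source stops abruptly."""


class ValidatingWriter:
    """Toy consumer that rejects end-of-stream inside an open partition."""

    def __init__(self):
        self.partition_open = False
        self.sealed = False

    def consume(self, fragment):
        if fragment == "partition_start":
            self.partition_open = True
        elif fragment == "partition_end":
            self.partition_open = False

    def consume_end_of_stream(self):
        if self.partition_open:
            # Stands in for the internal-error abort seen in the cores.
            raise AssertionError("end of stream inside an open partition")
        self.sealed = True


def aborted_source():
    """Yield a partial partition, then fail as an aborted stream would."""
    yield "partition_start"
    yield "row"
    raise StreamAborted("stream stopped mid-partition")


def consume_finally_style(source, writer):
    # Problematic pattern: end-of-stream is signalled even after a failure.
    try:
        for fragment in source:
            writer.consume(fragment)
    finally:
        writer.consume_end_of_stream()


def consume_then_style(source, writer):
    # Suggested pattern: end-of-stream is signalled only on success,
    # so the original exception reaches the caller untouched.
    for fragment in source:
        writer.consume(fragment)
    writer.consume_end_of_stream()
```

With `aborted_source()`, the first variant trips the writer's validation and masks the real error, while the second propagates `StreamAborted` and leaves the writer unsealed so the caller can discard the partial output.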
On Sun, Dec 13, 2020 at 10:02:08PM -0800, Botond Dénes wrote:
Good catch @gleb-cloudius. We should convert that finally() to then(). We want to allow caller code to handle exceptions and not unconditionally close the stream as if everything was all right.
I've already sent a patch.
--
Gleb.
Gleb's patch was tested with the same job using the RebuildStreamingErr nemesis, which used to reproduce the issue easily.
The nemesis ran 7 times and no coredump was found.
https://jenkins.scylladb.com/view/master/job/scylla-master/job/longevity/job/byo-longevity-test/119/
Thanks @roydahan for the early testing, we should do more of this.
Backported to 4.1, 4.2, 4.3.