Scylla: Coredump on a node after the scylla process was stopped and started on another node, with error multishard_mutation_query - looked-up reader belongs to different semaphore than the one appropriate for this query class.

Created on 26 Aug 2020 · 90 Comments · Source: scylladb/scylla

Installation details
Scylla version (or git commit hash): Scylla version 4.2.rc3-0.20200823.48d79a1d9 with build-id 25354a7731343d354888b0a41ceeb8dac437d78d
Cluster size: 6
OS (RHEL/CentOS/Ubuntu/AWS AMI): ami-0f0e2f160a57c7c15 (eu-north-1)

During the longevity-lwt-24h-test job, after Scylla was stopped and started on node2 without any errors, a coredump was generated on node4 with the following error:

2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !ERR     | scylla: [shard 0] multishard_mutation_query - looked-up reader belongs to different semaphore than the one appropriate for this query class: looked-up reader belongs to _system_read_concurrency_sem (0x6000096bf090) the query class appropriate is _read_concurrency_sem (0x6000096bedd8), at:    0x331d23d#012   0x331d550#012   0x331d9d9#012   0x2e2ea6c#012   0x26c7697#012   0x26c7d89#012   0x26c854e#012   0x26c86ab#012   0x26c97b9#012   0x26cb6d6#012   0x196275b#012   0x19637a0#012   0x198e1ad#012   0x1a688f5#012   0x1a69cc8#012   0x1a6b7e3#012   0x1a7717c#012   0x1a77d24#012   0x19506a0#012   0x1982264#012   0x1985fd5#012   0x19f14bc#012   0x19f33d7#012   0x2409b33#012   0x240ce36#012   0x16f0837#012   0x170bffb#012   0x170c01d#012   0x1715d73#012   0x2dfa5e6#012   0x2e63c57#012   0x2e63fce#012   0x2e9b0ed#012   0x2df7e7a#012   0x2df852e#012   0xd9f550#012   /opt/scylladb/libreloc/libc.so.6+0x27041#012   0xcbbe4d#012   --------#012   seastar::lambda_task<seastar::execution_stage::flush()::{lambda()#1}>
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: Aborting on shard 0.
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: Backtrace:
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002ec2372
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002e668b0
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002e66b55
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002e66ba0
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00007fe149211a8f
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000025894
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002e2ea92
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000026c7697
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000026c7d89
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000026c854e
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000026c86ab
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000026c97b9
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000026cb6d6
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x000000000196275b
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000019637a0
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x000000000198e1ad
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000001a688f5
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000001a69cc8
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000001a6b7e3
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000001a7717c
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000001a77d24
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000019506a0
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000001982264
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000001985fd5
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000019f14bc
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000019f33d7
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002409b33
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x000000000240ce36
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x00000000016f0837
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x000000000170bffb
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x000000000170c01d
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000001715d73
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002dfa5e6
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002e63c57
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002e63fce
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002e9b0ed
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002df7e7a
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000002df852e
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000000d9f550
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000027041
2020-08-24T19:21:43+00:00  longevity-lwt-24h-4-2-db-node-1e6246d3-4 !INFO    | scylla: 0x0000000000cbbe4d
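
Collecting the frame addresses from log output like the above by hand is error-prone, so here is a minimal Python sketch for extracting them. It is not part of the original report: the function names `frames_from_lines` and `frames_from_inline` are hypothetical, and it assumes that each per-line frame follows the `scylla: ` prefix and that the `#012` separators in the one-line backtrace are syslog-escaped newlines (octal 012).

```python
import re

# A frame is either a bare hex address or a "library-path+0xoffset" pair.
FRAME_RE = re.compile(r"(?:/\S+\+)?0x[0-9a-fA-F]+")


def frames_from_lines(log_text):
    """Collect frames from log lines of the form '... | scylla: 0x...'."""
    frames = []
    for line in log_text.splitlines():
        # Keep only what follows the last "scylla: " prefix on the line.
        _, sep, tail = line.rpartition("scylla: ")
        tail = tail.strip()
        if sep and FRAME_RE.fullmatch(tail):
            frames.append(tail)
    return frames


def frames_from_inline(message):
    """Collect frames from a one-line backtrace separated by '#012'."""
    frames = []
    for part in message.split("#012"):
        tokens = part.split()
        # The first chunk still carries the error text, so only its
        # last whitespace-separated token can be a frame.
        if tokens and FRAME_RE.fullmatch(tokens[-1]):
            frames.append(tokens[-1])
    return frames
```

Joining the returned list with spaces gives an argument string suitable for an `addr2line -Cpife <debug file>` invocation such as the one used in this report.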

Decoded backtrace:

[centos@abykov-gemini-staging-abykov-db-node-846cc4ca-1 ~]$ sudo find / -name scylla*.debug
/usr/lib/debug/opt/scylladb/libexec/scylla-4.2.rc3-0.20200823.48d79a1d9.x86_64.debug
[centos@abykov-gemini-staging-abykov-db-node-846cc4ca-1 ~]$ addr2line -Cpife /usr/lib/debug/opt/scylladb/libexec/scylla-4.2.rc3-0.20200823.48d79a1d9.x86_64.debug 0x0000000002ec2372 0x0000000002e668b0 0x0000000002e66b55 0x0000000002e66ba0 0x00007fe149211a8f /opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4 /opt/scylladb/libreloc/libc.so.6+0x0000000000025894 0x0000000002e2ea92 0x00000000026c7697 0x00000000026c7d89 0x00000000026c854e 0x00000000026c86ab 0x00000000026c97b9 0x00000000026cb6d6 0x000000000196275b 0x00000000019637a0 0x000000000198e1ad 0x0000000001a688f5 0x0000000001a69cc8 0x0000000001a6b7e3 0x0000000001a7717c 0x0000000001a77d24 0x00000000019506a0 0x0000000001982264 0x0000000001985fd5 0x00000000019f14bc 0x00000000019f33d7 0x0000000002409b33 0x000000000240ce36 0x00000000016f0837 0x000000000170bffb 0x000000000170c01d 0x0000000001715d73 0x0000000002dfa5e6 0x0000000002e63c57 0x0000000002e63fce 0x0000000002e9b0ed 0x0000000002df7e7a 0x0000000002df852e 0x0000000000d9f550 /opt/scylladb/libreloc/libc.so.6+0x0000000000027041 0x0000000000cbbe4d 
void seastar::backtrace<seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}>(seastar::backtrace_buffer::append_backtrace()::{lambda(seastar::frame)#1}&&) at /usr/include/fmt/format.h:2188
seastar::backtrace_buffer::append_backtrace() at /usr/include/fmt/format.h:2188
 (inlined by) print_with_backtrace at /jenkins/workspace/scylla-4.2/next/scylla/seastar/src/core/reactor.cc:751
seastar::print_with_backtrace(char const*) at /usr/include/fmt/format.h:2188
sigabrt_action at /usr/include/fmt/format.h:2188
 (inlined by) operator() at /jenkins/workspace/scylla-4.2/next/scylla/seastar/src/core/reactor.cc:3448
 (inlined by) _FUN at /jenkins/workspace/scylla-4.2/next/scylla/seastar/src/core/reactor.cc:3444
?? ??:0
?? ??:0
?? ??:0
seastar::on_internal_error(seastar::logger&, std::basic_string_view<char, std::char_traits<char> >) at /jenkins/workspace/scylla-4.2/next/scylla/seastar/src/core/on_internal_error.cc:39 (discriminator 2)
read_context::lookup_readers()::{lambda(unsigned int)#1}::operator()(unsigned int) const::{lambda(database&)#1}::operator()(database) at multishard_mutation_query.cc:?
read_context::lookup_readers()::{lambda(unsigned int)#1}::operator()(unsigned int) const at multishard_mutation_query.cc:?
 (inlined by) __invoke<read_context::lookup_readers()::<lambda(seastar::shard_id)>::<lambda(database&)>, database&> at /usr/include/c++/10/bits/invoke.h:96
 (inlined by) __apply_impl<read_context::lookup_readers()::<lambda(seastar::shard_id)>::<lambda(database&)>, std::tuple<database&>, 0> at /usr/include/c++/10/tuple:1724
 (inlined by) apply<read_context::lookup_readers()::<lambda(seastar::shard_id)>::<lambda(database&)>, std::tuple<database&> > at /usr/include/c++/10/tuple:1736
 (inlined by) operator() at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/sharded.hh:383
 (inlined by) invoke<seastar::sharded<T>::invoke_on<read_context::lookup_readers()::<lambda(seastar::shard_id)>::<lambda(database&)>, {}, seastar::future<read_context::reader_meta> >::<lambda()> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:1961
 (inlined by) submit_to<seastar::sharded<T>::invoke_on<read_context::lookup_readers()::<lambda(seastar::shard_id)>::<lambda(database&)>, {}, seastar::future<read_context::reader_meta> >::<lambda()> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/smp.hh:326
 (inlined by) invoke_on<read_context::lookup_readers()::<lambda(seastar::shard_id)>::<lambda(database&)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/sharded.hh:384
 (inlined by) invoke_on<read_context::lookup_readers()::<lambda(seastar::shard_id)>::<lambda(database&)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/sharded.hh:399
 (inlined by) operator() at /jenkins/workspace/scylla-4.2/next/scylla/multishard_mutation_query.cc:542
read_context::lookup_readers() at multishard_mutation_query.cc:?
 (inlined by) futurize_invoke<read_context::lookup_readers()::<lambda(seastar::shard_id)>, unsigned int> at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2110
 (inlined by) parallel_for_each<boost::range_detail::integer_iterator<unsigned int>, read_context::lookup_readers()::<lambda(seastar::shard_id)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future-util.hh:159
 (inlined by) parallel_for_each_impl<boost::integer_range<unsigned int>, read_context::lookup_readers()::<lambda(seastar::shard_id)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future-util.hh:204
 (inlined by) invoke<seastar::future<> (*&)(boost::integer_range<unsigned int>&&, read_context::lookup_readers()::<lambda(seastar::shard_id)>&&), boost::integer_range<unsigned int>, read_context::lookup_readers()::<lambda(seastar::shard_id)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2026
 (inlined by) futurize_invoke<seastar::future<> (*&)(boost::integer_range<unsigned int>&&, read_context::lookup_readers()::<lambda(seastar::shard_id)>&&), boost::integer_range<unsigned int>, read_context::lookup_readers()::<lambda(seastar::shard_id)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2110
 (inlined by) parallel_for_each<boost::integer_range<unsigned int>, read_context::lookup_readers()::<lambda(seastar::shard_id)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future-util.hh:216
 (inlined by) read_context::lookup_readers() at /jenkins/workspace/scylla-4.2/next/scylla/multishard_mutation_query.cc:545
auto seastar::internal::do_with_impl<seastar::shared_ptr<read_context>, do_query_mutations(seastar::sharded<database>&, seastar::lw_shared_ptr<schema const>, query::read_command const&, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > > const&, tracing::trace_state_ptr, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, query::result_memory_accounter&&)::{lambda(seastar::shared_ptr<read_context>&)#1}>(seastar::shared_ptr<read_context>&&, do_query_mutations(seastar::sharded<database>&, seastar::lw_shared_ptr<schema const>, query::read_command const&, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > > const&, tracing::trace_state_ptr, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, query::result_memory_accounter&&)::{lambda(seastar::shared_ptr<read_context>&)#1}&&) at multishard_mutation_query.cc:?
 (inlined by) __invoke_impl<seastar::future<reconcilable_result>, do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)>&, seastar::shared_ptr<read_context>&> at /usr/include/c++/10/bits/invoke.h:60
 (inlined by) __invoke<do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)>&, seastar::shared_ptr<read_context>&> at /usr/include/c++/10/bits/invoke.h:96
 (inlined by) __apply_impl<do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)>&, std::tuple<seastar::shared_ptr<read_context> >&, 0> at /usr/include/c++/10/tuple:1724
 (inlined by) apply<do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)>&, std::tuple<seastar::shared_ptr<read_context> >&> at /usr/include/c++/10/tuple:1736
 (inlined by) do_with_impl<seastar::shared_ptr<read_context>, do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/do_with.hh:106
do_query_mutations(seastar::sharded<database>&, seastar::lw_shared_ptr<schema const>, query::read_command const&, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > > const&, tracing::trace_state_ptr, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, query::result_memory_accounter&&) at multishard_mutation_query.cc:?
 (inlined by) futurize_invoke<seastar::future<reconcilable_result> (*&)(seastar::shared_ptr<read_context>&&, do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)>&&), seastar::shared_ptr<read_context>, do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2110
 (inlined by) do_with<seastar::shared_ptr<read_context>, do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/do_with.hh:141
 (inlined by) do_query_mutations at /jenkins/workspace/scylla-4.2/next/scylla/multishard_mutation_query.cc:653
query_mutations_on_all_shards(seastar::sharded<database>&, seastar::lw_shared_ptr<schema const>, query::read_command const&, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > > const&, tracing::trace_state_ptr, unsigned long, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at multishard_mutation_query.cc:?
 (inlined by) __invoke_impl<seastar::future<std::tuple<seastar::foreign_ptr<seastar::lw_shared_ptr<reconcilable_result> >, cache_temperature> >, query_mutations_on_all_shards(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(query::result_memory_accounter)>, query::result_memory_accounter> at /usr/include/c++/10/bits/invoke.h:60
 (inlined by) __invoke<query_mutations_on_all_shards(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(query::result_memory_accounter)>, query::result_memory_accounter> at /usr/include/c++/10/bits/invoke.h:96
 (inlined by) __apply_impl<query_mutations_on_all_shards(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(query::result_memory_accounter)>, std::tuple<query::result_memory_accounter>, 0> at /usr/include/c++/10/tuple:1724
 (inlined by) apply<query_mutations_on_all_shards(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(query::result_memory_accounter)>, std::tuple<query::result_memory_accounter> > at /usr/include/c++/10/tuple:1736
 (inlined by) apply<query_mutations_on_all_shards(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(query::result_memory_accounter)>, query::result_memory_accounter> at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2009
 (inlined by) then_impl<query_mutations_on_all_shards(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(query::result_memory_accounter)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:1544
 (inlined by) run<query_mutations_on_all_shards(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(query::result_memory_accounter)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:1198
 (inlined by) then<query_mutations_on_all_shards(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(query::result_memory_accounter)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:1468
 (inlined by) query_mutations_on_all_shards(seastar::sharded<database>&, seastar::lw_shared_ptr<schema const>, query::read_command const&, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > > const&, tracing::trace_state_ptr, unsigned long, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-4.2/next/scylla/multishard_mutation_query.cc:689
operator() at /usr/include/fmt/format.h:1316
 (inlined by) __invoke_impl<seastar::future<seastar::rpc::tuple<seastar::foreign_ptr<seastar::lw_shared_ptr<reconcilable_result> >, cache_temperature> >, service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)>&, seastar::lw_shared_ptr<query::read_command>&, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >&> at /usr/include/c++/10/bits/invoke.h:60
 (inlined by) __invoke<service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)>&, seastar::lw_shared_ptr<query::read_command>&, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >&> at /usr/include/c++/10/bits/invoke.h:96
 (inlined by) __apply_impl<service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)>&, std::tuple<seastar::lw_shared_ptr<query::read_command>, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > > >&, 0, 1> at /usr/include/c++/10/tuple:1724
 (inlined by) apply<service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)>&, std::tuple<seastar::lw_shared_ptr<query::read_command>, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > > >&> at /usr/include/c++/10/tuple:1736
 (inlined by) do_with_impl<seastar::lw_shared_ptr<query::read_command>&, const std::vector<nonwrapping_interval<dht::ring_position> >, service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/do_with.hh:106
invoke<seastar::future<seastar::rpc::tuple<seastar::foreign_ptr<seastar::lw_shared_ptr<reconcilable_result> >, cache_temperature> > (*&)(seastar::lw_shared_ptr<query::read_command>&, const std::vector<nonwrapping_interval<dht::ring_position> >&&, service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)>&&), seastar::lw_shared_ptr<query::read_command>&, const std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >, service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)> > at /usr/include/fmt/format.h:1316
 (inlined by) futurize_invoke<seastar::future<seastar::rpc::tuple<seastar::foreign_ptr<seastar::lw_shared_ptr<reconcilable_result> >, cache_temperature> > (*&)(seastar::lw_shared_ptr<query::read_command>&, const std::vector<nonwrapping_interval<dht::ring_position> >&&, service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)>&&), seastar::lw_shared_ptr<query::read_command>&, const std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >, service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2110
 (inlined by) do_with<seastar::lw_shared_ptr<query::read_command>&, const std::vector<nonwrapping_interval<dht::ring_position> >, service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)> > at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/do_with.hh:141
 (inlined by) service::storage_proxy::query_nonsingular_mutations_locally(seastar::lw_shared_ptr<schema const>, seastar::lw_shared_ptr<query::read_command>, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > > const&&, tracing::trace_state_ptr, unsigned long, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-4.2/next/scylla/service/storage_proxy.cc:5172
service::storage_proxy::query_result_local(seastar::lw_shared_ptr<schema const>, seastar::lw_shared_ptr<query::read_command>, nonwrapping_interval<dht::ring_position> const&, query::result_options, tracing::trace_state_ptr, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, unsigned long) at /usr/include/fmt/format.h:1316
service::abstract_read_executor::make_data_request(gms::inet_address, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool) at /jenkins/workspace/scylla-4.2/next/scylla/service/storage_proxy.cc:3428
service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}::operator()(gms::inet_address) const at /jenkins/workspace/scylla-4.2/next/scylla/service/storage_proxy.cc:3474
 (inlined by) seastar::future<> seastar::futurize<seastar::future<> >::invoke<service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}, gms::inet_address&>(service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}&&, gms::inet_address&) at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2026
 (inlined by) auto seastar::futurize_invoke<service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}, gms::inet_address&>(service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}&&, gms::inet_address&) at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2110
 (inlined by) seastar::future<> seastar::parallel_for_each<__gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}>(__gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, seastar::future<>, service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}&&) at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future-util.hh:159
service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool) at /jenkins/workspace/scylla-4.2/next/scylla/service/storage_proxy.cc:3486
 (inlined by) service::abstract_read_executor::make_requests(seastar::shared_ptr<service::digest_read_resolver>, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}::operator()() const at /jenkins/workspace/scylla-4.2/next/scylla/service/storage_proxy.cc:3507
 (inlined by) seastar::future<> seastar::futurize<seastar::future<> >::invoke<service::abstract_read_executor::make_requests(seastar::shared_ptr<service::digest_read_resolver>, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}>(service::abstract_read_executor::make_requests(seastar::shared_ptr<service::digest_read_resolver>, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&&) at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2026
 (inlined by) auto seastar::futurize_invoke<service::abstract_read_executor::make_requests(seastar::shared_ptr<service::digest_read_resolver>, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}>(service::abstract_read_executor::make_requests(seastar::shared_ptr<service::digest_read_resolver>, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&&) at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2110
 (inlined by) service::abstract_read_executor::make_requests(seastar::shared_ptr<service::digest_read_resolver>, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-4.2/next/scylla/service/storage_proxy.cc:3507
service::abstract_read_executor::execute(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-4.2/next/scylla/service/storage_proxy.cc:3615
service::range_slice_read_executor::execute(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at /jenkins/workspace/scylla-4.2/next/scylla/service/storage_proxy.cc:3763
operator() at /usr/include/fmt/format.h:1316
 (inlined by) invoke<service::storage_proxy::query_partition_key_range_concurrent(seastar::lowres_clock::time_point, std::vector<seastar::foreign_ptr<seastar::lw_shared_ptr<query::result> > >&&, seastar::lw_shared_ptr<query::read_command>, db::consistency_level, service::query_ranges_to_vnodes_generator&&, int, tracing::trace_state_ptr, uint32_t, uint32_t, service::replicas_per_token_range, service_permit)::<lambda(seastar::shared_ptr<service::abstract_read_executor>&)>&, seastar::shared_ptr<service::abstract_read_executor>&> at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2026
 (inlined by) futurize_invoke<service::storage_proxy::query_partition_key_range_concurrent(seastar::lowres_clock::time_point, std::vector<seastar::foreign_ptr<seastar::lw_shared_ptr<query::result> > >&&, seastar::lw_shared_ptr<query::read_command>, db::consistency_level, service::query_ranges_to_vnodes_generator&&, int, tracing::trace_state_ptr, uint32_t, uint32_t, service::replicas_per_token_range, service_permit)::<lambda(seastar::shared_ptr<service::abstract_read_executor>&)>&, seastar::shared_ptr<service::abstract_read_executor>&> at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2110
 (inlined by) map_reduce<__gnu_cxx::__normal_iterator<seastar::shared_ptr<service::abstract_read_executor>*, std::vector<seastar::shared_ptr<service::abstract_read_executor> > >, service::storage_proxy::query_partition_key_range_concurrent(seastar::lowres_clock::time_point, std::vector<seastar::foreign_ptr<seastar::lw_shared_ptr<query::result> > >&&, seastar::lw_shared_ptr<query::read_command>, db::consistency_level, service::query_ranges_to_vnodes_generator&&, int, tracing::trace_state_ptr, uint32_t, uint32_t, service::replicas_per_token_range, service_permit)::<lambda(seastar::shared_ptr<service::abstract_read_executor>&)>, query::result_merger> at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future-util.hh:1036
service::storage_proxy::query_partition_key_range_concurrent(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, std::vector<seastar::foreign_ptr<seastar::lw_shared_ptr<query::result> >, std::allocator<seastar::foreign_ptr<seastar::lw_shared_ptr<query::result> > > >&&, seastar::lw_shared_ptr<query::read_command>, db::consistency_level, service::query_ranges_to_vnodes_generator&&, int, tracing::trace_state_ptr, unsigned int, unsigned int, std::unordered_map<nonwrapping_interval<dht::token>, std::vector<utils::UUID, std::allocator<utils::UUID> >, std::hash<nonwrapping_interval<dht::token> >, std::equal_to<nonwrapping_interval<dht::token> >, std::allocator<std::pair<nonwrapping_interval<dht::token> const, std::vector<utils::UUID, std::allocator<utils::UUID> > > > >, service_permit) at /usr/include/fmt/format.h:1316
service::storage_proxy::query_partition_key_range(seastar::lw_shared_ptr<query::read_command>, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >, db::consistency_level, service::storage_proxy::coordinator_query_options) at /usr/include/fmt/format.h:1316
service::storage_proxy::do_query(seastar::lw_shared_ptr<schema const>, seastar::lw_shared_ptr<query::read_command>, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >&&, db::consistency_level, service::storage_proxy::coordinator_query_options) at /usr/include/fmt/format.h:1316
service::storage_proxy::query(seastar::lw_shared_ptr<schema const>, seastar::lw_shared_ptr<query::read_command>, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >&&, db::consistency_level, service::storage_proxy::coordinator_query_options) at /usr/include/fmt/format.h:1316
service::pager::query_pager::do_fetch_page(unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at query_pagers.cc:?
service::pager::query_pager::fetch_page_generator(unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, cql3::cql_stats&) at query_pagers.cc:?
cql3::statements::select_statement::do_execute(service::storage_proxy&, service::query_state&, cql3::query_options const&) const at /usr/include/fmt/format.h:1316
seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > std::__invoke_impl<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> >, seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement::* const&)(service::storage_proxy&, service::query_state&, cql3::query_options const&) const, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>(std::__invoke_memfun_deref, seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement::* const&)(service::storage_proxy&, service::query_state&, cql3::query_options const&) const, cql3::statements::select_statement const*&&, service::storage_proxy&, service::query_state&, cql3::query_options const&) at /usr/include/fmt/format.h:1316
 (inlined by) std::__invoke_result<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement::* const&)(service::storage_proxy&, service::query_state&, cql3::query_options const&) const, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>::type std::__invoke<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement::* const&)(service::storage_proxy&, service::query_state&, cql3::query_options const&) const, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>(seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement::* const&)(service::storage_proxy&, service::query_state&, cql3::query_options const&) const, cql3::statements::select_statement const*&&, service::storage_proxy&, service::query_state&, cql3::query_options const&) at /usr/include/c++/10/bits/invoke.h:96
 (inlined by) decltype (__invoke((*this)._M_pmf, (forward<cql3::statements::select_statement const*>)({parm#1}), (forward<service::storage_proxy&>)({parm#1}), (forward<service::query_state&>)({parm#1}), (forward<cql3::query_options const&>)({parm#1}))) std::_Mem_fn_base<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement::*)(service::storage_proxy&, service::query_state&, cql3::query_options const&) const, true>::operator()<cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>(cql3::statements::select_statement const*&&, service::storage_proxy&, service::query_state&, cql3::query_options const&) const at /usr/include/c++/10/functional:122
 (inlined by) seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>::direct_vtable_for<std::_Mem_fn<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement::*)(service::storage_proxy&, service::query_state&, cql3::query_options const&) const> >::call(seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)> const*, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&) at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/util/noncopyable_function.hh:101
seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>::operator()(cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&) const at /usr/include/fmt/format.h:1316
 (inlined by) seastar::inheriting_concrete_execution_stage<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> >, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>::make_stage_for_group(seastar::scheduling_group)::{lambda(cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)#1}::operator()(cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&) const at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/execution_stage.hh:329
 (inlined by) seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>::direct_vtable_for<seastar::inheriting_concrete_execution_stage<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> >, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>::make_stage_for_group(seastar::scheduling_group)::{lambda(cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)#1}>::call(seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)> const*, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&) at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/util/noncopyable_function.hh:101
seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>::operator()(cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&) const at /usr/include/fmt/format.h:1316
 (inlined by) seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > std::__invoke_impl<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> >, seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>&, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>(std::__invoke_other, seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>&, cql3::statements::select_statement const*&&, service::storage_proxy&, service::query_state&, cql3::query_options const&) at /usr/include/c++/10/bits/invoke.h:60
 (inlined by) std::__invoke_result<seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>&, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>::type std::__invoke<seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>&, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>(seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>&, cql3::statements::select_statement const*&&, service::storage_proxy&, service::query_state&, cql3::query_options const&) at /usr/include/c++/10/bits/invoke.h:96
 (inlined by) _ZSt12__apply_implIRN7seastar20noncopyable_functionIFNS0_6futureIJNS0_10shared_ptrIN13cql_transport8messages14result_messageEEEEEEPKN4cql310statements16select_statementERN7service13storage_proxyERNSE_11query_stateERKNS9_13query_optionsEEEESt5tupleIJSD_SG_SI_SL_EEJLm0ELm1ELm2ELm3EEEDcOT_OT0_St16integer_sequenceImJXspT1_EEE at /usr/include/c++/10/tuple:1724
 (inlined by) _ZSt5applyIRN7seastar20noncopyable_functionIFNS0_6futureIJNS0_10shared_ptrIN13cql_transport8messages14result_messageEEEEEEPKN4cql310statements16select_statementERN7service13storage_proxyERNSE_11query_stateERKNS9_13query_optionsEEEESt5tupleIJSD_SG_SI_SL_EEEDcOT_OT0_ at /usr/include/c++/10/tuple:1736
 (inlined by) seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > seastar::futurize<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > >::apply<seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>&, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>(seastar::noncopyable_function<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> > (cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&)>&, std::tuple<cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>&&) at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:2009
 (inlined by) seastar::concrete_execution_stage<seastar::future<seastar::shared_ptr<cql_transport::messages::result_message> >, cql3::statements::select_statement const*, service::storage_proxy&, service::query_state&, cql3::query_options const&>::do_flush() at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/execution_stage.hh:247
operator() at /usr/include/fmt/format.h:652
 (inlined by) invoke<seastar::execution_stage::flush()::<lambda()>&> at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/future.hh:1993
 (inlined by) run_and_dispose at /jenkins/workspace/scylla-4.2/next/scylla/seastar/include/seastar/core/make_task.hh:40
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at /usr/include/fmt/format.h:2188
seastar::reactor::run_some_tasks() at /usr/include/fmt/format.h:2188
seastar::reactor::run_some_tasks() at /usr/include/fmt/format.h:2188
 (inlined by) seastar::reactor::run() at /jenkins/workspace/scylla-4.2/next/scylla/seastar/src/core/reactor.cc:2712
seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at /usr/include/boost/program_options/variables_map.hpp:146
seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at /usr/include/boost/program_options/variables_map.hpp:146
main at /jenkins/workspace/scylla-4.2/next/scylla/main.cc:488
?? ??:0
_start at ??:?

Coredump info from node4:

2020-08-24 19:21:43.000: (CoreDumpEvent Severity.ERROR): node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.4004d2cdf3764a849f3a8911f8b0f317.1290.1598296903000000/core.scylla.996.4004d2cdf3764a849f3a8911f8b0f317.1290.1598296903000000.gz
backtrace=           PID: 1290 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Mon 2020-08-24 19:21:43 UTC (2min 50s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /scylla.slice/scylla-server.slice/scylla-server.service
          Unit: scylla-server.service
         Slice: scylla-server.slice
       Boot ID: 4004d2cdf3764a849f3a8911f8b0f317
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: longevity-lwt-24h-4-2-db-node-1e6246d3-4
      Coredump: /var/lib/systemd/coredump/core.scylla.996.4004d2cdf3764a849f3a8911f8b0f317.1290.1598296903000000
       Message: Process 1290 (scylla) of user 996 dumped core.

                Stack trace of thread 1290:
                #0  0x00007fe1487c49e5 raise (libc.so.6)
                #1  0x00007fe1487ad94d abort (libc.so.6)
                #2  0x0000000002e2ea93 _ZN7seastar17on_internal_errorERNS_6loggerESt17basic_string_viewIcSt11char_traitsIcEE (scylla)
                #3  0x00000000026c7698 _ZZZN12read_context14lookup_readersEvENKUljE_clEjENUlR8databaseE_clES2_ (scylla)
                #4  0x00000000026c7d8a _ZZN12read_context14lookup_readersEvENKUljE_clEj (scylla)
                #5  0x00000000026c854f invoke<read_context::lookup_readers()::<lambda(seastar::shard_id)>, unsigned int> (scylla)
                #6  0x00000000026c86ac operator() (scylla)
                #7  0x00000000026c97ba invoke<seastar::future<reconcilable_result> (*&)(seastar::shared_ptr<read_context>&&, do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)>&&), seastar::shared_ptr<read_context>, do_query_mutations(seastar::distributed<database>&, schema_ptr, const query::read_command&, const partition_range_vector&, tracing::trace_state_ptr, seastar::lowres_clock::time_point, query::result_memory_accounter&&)::<lambda(seastar::shared_ptr<read_context>&)> > (scylla)
                #8  0x00000000026cb6d7 operator() (scylla)
                #9  0x000000000196275c operator() (scylla)
                #10 0x00000000019637a1 invoke<seastar::future<seastar::rpc::tuple<seastar::foreign_ptr<seastar::lw_shared_ptr<reconcilable_result> >, cache_temperature> > (*&)(seastar::lw_shared_ptr<query::read_command>&, const std::vector<nonwrapping_interval<dht::ring_position> >&&, service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)>&&), seastar::lw_shared_ptr<query::read_command>&, const std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >, service::storage_proxy::query_nonsingular_mutations_locally(schema_ptr, seastar::lw_shared_ptr<query::read_command>, const partition_range_vector&&, tracing::trace_state_ptr, uint64_t, seastar::lowres_clock::time_point)::<lambda(seastar::lw_shared_ptr<query::read_command>&, const partition_range_vector&)> > (scylla)
                #11 0x000000000198e1ae _ZN7service13storage_proxy18query_result_localEN7seastar13lw_shared_ptrIK6schemaEENS2_IN5query12read_commandEEERK20nonwrapping_intervalIN3dht13ring_positionEENS6_14result_optionsEN7tracing15trace_state_ptrENSt6chrono10time_pointINS1_12lowres_clockENSI_8durationIlSt5ratioILl1ELl1000EEEEEEm (scylla)
                #12 0x0000000001a688f6 _ZN7service22abstract_read_executor17make_data_requestEN3gms12inet_addressENSt6chrono10time_pointIN7seastar12lowres_clockENS3_8durationIlSt5ratioILl1ELl1000EEEEEEb (scylla)
                #13 0x0000000001a69cc9 _ZZN7service22abstract_read_executor18make_data_requestsEN7seastar10shared_ptrINS_20digest_read_resolverEEEN9__gnu_cxx17__normal_iteratorIPN3gms12inet_addressESt6vectorIS8_SaIS8_EEEESD_NSt6chrono10time_pointINS1_12lowres_clockENSE_8durationIlSt5ratioILl1ELl1000EEEEEEbENKUlS8_E_clES8_ (scylla)
                #14 0x0000000001a6b7e4 _ZN7service22abstract_read_executor18make_data_requestsEN7seastar10shared_ptrINS_20digest_read_resolverEEEN9__gnu_cxx17__normal_iteratorIPN3gms12inet_addressESt6vectorIS8_SaIS8_EEEESD_NSt6chrono10time_pointINS1_12lowres_clockENSE_8durationIlSt5ratioILl1ELl1000EEEEEEb (scylla)
                #15 0x0000000001a7717d _ZN7service22abstract_read_executor7executeENSt6chrono10time_pointIN7seastar12lowres_clockENS1_8durationIlSt5ratioILl1ELl1000EEEEEE (scylla)
                #16 0x0000000001a77d25 _ZN7service25range_slice_read_executor7executeENSt6chrono10time_pointIN7seastar12lowres_clockENS1_8durationIlSt5ratioILl1ELl1000EEEEEE (scylla)
                #17 0x00000000019506a1 operator() (scylla)
                #18 0x0000000001982265 _ZN7service13storage_proxy36query_partition_key_range_concurrentENSt6chrono10time_pointIN7seastar12

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.4004d2cdf3764a849f3a8911f8b0f317.1290.1598296903000000/core.scylla.996.4004d2cdf3764a849f3a8911f8b0f317.1290.1598296903000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.4004d2cdf3764a849f3a8911f8b0f317.1290.1598296903000000.gz

Node2 had the following error messages at that time:

2020-08-24 17:53:15.327: (DisruptionEvent Severity.NORMAL): type=start name=RejectThriftNetwork: node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-3 [13.53.198.215 | 10.0.1.254] (seed: False) duration=None
2020-08-24 18:01:36.089: (DisruptionEvent Severity.NORMAL): type=end name=RejectThriftNetwork: node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-3 [13.53.198.215 | 10.0.1.254] (seed: False) duration=697
2020-08-24 18:16:33.980: (DisruptionEvent Severity.NORMAL): type=start name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-1 [13.53.124.246 | 10.0.0.194] (seed: True) duration=None
2020-08-24 18:17:51.197: (DisruptionEvent Severity.NORMAL): type=start name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=None
2020-08-24 18:18:32.166: (DisruptionEvent Severity.NORMAL): type=start name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-3 [13.53.198.215 | 10.0.1.254] (seed: False) duration=None
2020-08-24 18:19:53.748: (DisruptionEvent Severity.NORMAL): type=start name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 18:20:40.748: (DisruptionEvent Severity.NORMAL): type=end name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-3 [13.53.198.215 | 10.0.1.254] (seed: False) duration=457
2020-08-24 18:35:32.666: (DisruptionEvent Severity.NORMAL): type=start name=SoftRebootNode node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 18:39:00.191: (DisruptionEvent Severity.NORMAL): type=end name=SoftRebootNode node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=419
2020-08-24 18:54:59.612: (DisruptionEvent Severity.NORMAL): type=start name=AbortRepairMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 18:56:38.032: (DisruptionEvent Severity.NORMAL): type=end name=AbortRepairMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=313
2020-08-24 19:11:37.681: (DisruptionEvent Severity.NORMAL): type=start name=StopWaitStartService node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=None
2020-08-24 19:18:58.456: (DisruptionEvent Severity.NORMAL): type=end name=StopWaitStartService node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=659
2020-08-24 19:31:33.658: (DisruptionEvent Severity.NORMAL): type=start name=SnapshotOperations node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 19:32:02.393: (DisruptionEvent Severity.NORMAL): type=end name=SnapshotOperations node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=146
2020-08-24 19:48:01.550: (DisruptionEvent Severity.NORMAL): type=start name=ToggleTableICS node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=None
2020-08-24 19:48:05.811: (DisruptionEvent Severity.NORMAL): type=skipped name=ToggleTableICS node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=237
2020-08-24 19:58:02.946: (DisruptionEvent Severity.NORMAL): type=skipped name=ChaosMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=228
2020-08-24 20:07:46.248: (DisruptionEvent Severity.NORMAL): type=start name=Enospc node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=None
2020-08-24 20:11:32.596: (DisruptionEvent Severity.NORMAL): type=end name=Enospc node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=452
2020-08-24 20:24:36.263: (DisruptionEvent Severity.NORMAL): type=start name=ManagementRepair node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 20:24:36.884: (DisruptionEvent Severity.NORMAL): type=skipped name=ManagementRepair node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=246
2020-08-24 20:34:26.240: (DisruptionEvent Severity.NORMAL): type=start name=ModifyTablePropertiesMinIndexInterval node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 20:34:28.404: (DisruptionEvent Severity.NORMAL): type=end name=ModifyTablePropertiesMinIndexInterval node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=231
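
To see which nemesis events had actually happened by the time of the coredump, the DisruptionEvent lines can be filtered against the crash timestamp. The following is only a rough sketch: the regular expression is inferred from the lines quoted in this report, the names `EVENT_RE`, `parse_events` and `events_before` are hypothetical helpers (not part of any test framework), and the crash time is taken from the CoreDumpEvent line (2020-08-24 19:21:43).

```python
import re
from datetime import datetime

# Pattern inferred from the DisruptionEvent lines quoted in this report
# (an assumption, not an official log format). Some event names carry a
# trailing colon (e.g. "RejectThriftNetwork:"), so the colon is optional.
EVENT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}): "
    r"\(DisruptionEvent Severity\.\w+\): "
    r"type=(?P<type>\w+) name=(?P<name>\w+):? "
    r"node=Node (?P<node>\S+) "
)


def parse_events(lines):
    """Return (timestamp, type, name, node) tuples for matching lines."""
    events = []
    for line in lines:
        m = EVENT_RE.match(line.strip())
        if m is None:
            continue  # not a DisruptionEvent line
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S.%f")
        events.append((ts, m.group("type"), m.group("name"), m.group("node")))
    return events


def events_before(events, cutoff):
    """Keep only events at or before the cutoff, ordered by time."""
    return sorted(e for e in events if e[0] <= cutoff)


# A few lines copied from the list above.
SAMPLE = [
    "2020-08-24 19:11:37.681: (DisruptionEvent Severity.NORMAL): type=start name=StopWaitStartService node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=None",
    "2020-08-24 19:18:58.456: (DisruptionEvent Severity.NORMAL): type=end name=StopWaitStartService node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=659",
    "2020-08-24 19:31:33.658: (DisruptionEvent Severity.NORMAL): type=start name=SnapshotOperations node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None",
    "2020-08-24 17:53:15.327: (DisruptionEvent Severity.NORMAL): type=start name=RejectThriftNetwork: node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-3 [13.53.198.215 | 10.0.1.254] (seed: False) duration=None",
]

# Crash time from the CoreDumpEvent line in this report.
CRASH_TIME = datetime(2020, 8, 24, 19, 21, 43)

events = parse_events(SAMPLE)
before = events_before(events, CRASH_TIME)
print(before[-1])  # most recent event at or before the crash
```

Feeding the full list through the same filter would drop everything from SnapshotOperations onward, since those events are timestamped after the coredump.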

Before the coredump occurred, the following nemeses were applied to the cluster nodes:

2020-08-24 17:53:15.327: (DisruptionEvent Severity.NORMAL): type=start name=RejectThriftNetwork: node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-3 [13.53.198.215 | 10.0.1.254] (seed: False) duration=None
2020-08-24 18:01:36.089: (DisruptionEvent Severity.NORMAL): type=end name=RejectThriftNetwork: node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-3 [13.53.198.215 | 10.0.1.254] (seed: False) duration=697
2020-08-24 18:16:33.980: (DisruptionEvent Severity.NORMAL): type=start name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-1 [13.53.124.246 | 10.0.0.194] (seed: True) duration=None
2020-08-24 18:17:51.197: (DisruptionEvent Severity.NORMAL): type=start name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=None
2020-08-24 18:18:32.166: (DisruptionEvent Severity.NORMAL): type=start name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-3 [13.53.198.215 | 10.0.1.254] (seed: False) duration=None
2020-08-24 18:19:53.748: (DisruptionEvent Severity.NORMAL): type=start name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 18:20:40.748: (DisruptionEvent Severity.NORMAL): type=end name=NodetoolCleanupMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-3 [13.53.198.215 | 10.0.1.254] (seed: False) duration=457
2020-08-24 18:35:32.666: (DisruptionEvent Severity.NORMAL): type=start name=SoftRebootNode node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 18:39:00.191: (DisruptionEvent Severity.NORMAL): type=end name=SoftRebootNode node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=419
2020-08-24 18:54:59.612: (DisruptionEvent Severity.NORMAL): type=start name=AbortRepairMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 18:56:38.032: (DisruptionEvent Severity.NORMAL): type=end name=AbortRepairMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=313
2020-08-24 19:11:37.681: (DisruptionEvent Severity.NORMAL): type=start name=StopWaitStartService node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=None
2020-08-24 19:18:58.456: (DisruptionEvent Severity.NORMAL): type=end name=StopWaitStartService node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=659
2020-08-24 19:31:33.658: (DisruptionEvent Severity.NORMAL): type=start name=SnapshotOperations node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 19:32:02.393: (DisruptionEvent Severity.NORMAL): type=end name=SnapshotOperations node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=146
2020-08-24 19:48:01.550: (DisruptionEvent Severity.NORMAL): type=start name=ToggleTableICS node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=None
2020-08-24 19:48:05.811: (DisruptionEvent Severity.NORMAL): type=skipped name=ToggleTableICS node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=237
2020-08-24 19:58:02.946: (DisruptionEvent Severity.NORMAL): type=skipped name=ChaosMonkey node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=228
2020-08-24 20:07:46.248: (DisruptionEvent Severity.NORMAL): type=start name=Enospc node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=None
2020-08-24 20:11:32.596: (DisruptionEvent Severity.NORMAL): type=end name=Enospc node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-2 [13.49.21.86 | 10.0.2.110] (seed: False) duration=452
2020-08-24 20:24:36.263: (DisruptionEvent Severity.NORMAL): type=start name=ManagementRepair node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 20:24:36.884: (DisruptionEvent Severity.NORMAL): type=skipped name=ManagementRepair node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=246
2020-08-24 20:34:26.240: (DisruptionEvent Severity.NORMAL): type=start name=ModifyTablePropertiesMinIndexInterval node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=None
2020-08-24 20:34:28.404: (DisruptionEvent Severity.NORMAL): type=end name=ModifyTablePropertiesMinIndexInterval node=Node longevity-lwt-24h-4-2-db-node-1e6246d3-4 [13.53.123.240 | 10.0.1.165] (seed: False) duration=231

All logs are available at: https://cloudius-jenkins-test.s3.amazonaws.com/1e6246d3-a3c5-40aa-9fdf-d3f9af969d6d/20200825_182315/db-cluster-1e6246d3.zip

bug high lwt

All 90 comments

The crashing node is a coordinator in the query. I cannot tell from the core how the previous page used the wrong semaphore. I need to think of a way to generate a core while that is happening. Possibly my fix for the messaging-service isn't working in all cases and some statement RPCs still execute in the system group.

Everything in messaging-service looks all right. The verbs are registered with the correct scheduling group and have no isolation cookie set. Of course, just because this is correct on this node doesn't mean that it was also correct on a previous coordinator.

I suspect there is still some read-path that can somehow use the wrong scheduling group, I just don't know what this could be. The core unfortunately is no help as it is generated from the page which uses the correct semaphore and discovers that the previous page didn't.

@aleksbykov would it be possible for you to run the reproducer with a custom binary?

@denesb I think so.
I'm just not sure that I can reproduce the same nemesis chain.

@aleksbykov how should I supply the new binary to you? Is a scylla executable enough, or do you need a relocatable package tarball or an RPM?

@denesb the nicest option would be an AMI )), but I think a scylla executable will be good enough for now. I will recheck.

@denesb could you also provide RPMs?

@aleksbykov RPM uploaded to gs://scratch.scylladb.com/bdenes/7117/scylla-server-4.2.rc3-0.20200827.48d79a1d9.x86_64.rpm. Download it as:

$ gsutil cp gs://scratch.scylladb.com/bdenes/7117/scylla-server-4.2.rc3-0.20200827.48d79a1d9.x86_64.rpm .

Note scratch is readable via http, just s/^gs:/http:/.

BTW I added the following assertions:

$ git diff
diff --git a/service/storage_proxy.cc b/service/storage_proxy.cc
index 817d43948..c54907839 100644
--- a/service/storage_proxy.cc
+++ b/service/storage_proxy.cc
@@ -3352,6 +3352,15 @@ class data_read_resolver : public abstract_read_resolver {
     }
 };

+static bool is_system_keyspace(const schema_ptr& schema) {
+    static const std::string_view prefix("system");
+    auto s = schema->ks_name();
+    if (s.size() < prefix.size()) {
+        return false;
+    }
+    return std::string_view(s.data(), prefix.size()) == prefix;
+}
+
 class abstract_read_executor : public enable_shared_from_this<abstract_read_executor> {
 protected:
     using targets_iterator = std::vector<gms::inet_address>::iterator;
@@ -3423,6 +3432,9 @@ class abstract_read_executor : public enable_shared_from_this<abstract_read_exec
         auto opts = want_digest
                   ? query::result_options{query::result_request::result_and_digest, digest_algorithm(*_proxy)}
                   : query::result_options{query::result_request::only_result, query::digest_algorithm::none};
+        if (!is_system_keyspace(_schema)) {
+            assert(seastar::current_scheduling_group() != default_scheduling_group());
+        }
         if (fbu::is_me(ep)) {
             tracing::trace(_trace_state, "read_data: querying locally");
             return _proxy->query_result_local(_schema, _cmd, _partition_range, opts, _trace_state, timeout);
@@ -3863,6 +3875,9 @@ storage_proxy::query_result_local_digest(schema_ptr s, lw_shared_ptr<query::read
 future<rpc::tuple<foreign_ptr<lw_shared_ptr<query::result>>, cache_temperature>>
 storage_proxy::query_result_local(schema_ptr s, lw_shared_ptr<query::read_command> cmd, const dht::partition_range& pr, query::result_options opts,
                                   tracing::trace_state_ptr trace_state, storage_proxy::clock_type::time_point timeout, uint64_t max_size) {
+    if (!is_system_keyspace(s)) {
+        assert(seastar::current_scheduling_group() != default_scheduling_group());
+    }
     cmd->slice.options.set_if<query::partition_slice::option::with_digest>(opts.request != query::result_request::only_result);
     if (pr.is_singular()) {
         unsigned shard = dht::shard_of(*s, pr.start()->value().token());

It asserts that non-system tables are not read in the default (system) group. The assert is added to both the coordinator and replica side.
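
The prefix test behind these assertions can be tried in isolation. This is only a minimal sketch, assuming the keyspace name is available as a plain string (the diff above takes a schema_ptr). `has_system_prefix` and `check_read_group` are hypothetical stand-ins, and the `in_default_group` flag replaces the real scheduling-group comparison.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for is_system_keyspace() from the diff above:
// returns true when the keyspace name starts with "system".
inline bool has_system_prefix(std::string_view ks_name) {
    constexpr std::string_view prefix = "system";
    // substr() clamps the length, so names shorter than the prefix compare unequal.
    return ks_name.substr(0, prefix.size()) == prefix;
}

// Mirrors the shape of the added assertion: a read of a non-system keyspace
// must not be running in the default group. `in_default_group` is an assumed
// flag standing in for current_scheduling_group() == default_scheduling_group().
inline void check_read_group(std::string_view ks_name, bool in_default_group) {
    if (!has_system_prefix(ks_name)) {
        assert(!in_default_group);
    }
}
```

Because this is a pure prefix match, a user keyspace whose name happens to begin with "system" would also be exempt from the check, which seems acceptable for a debugging build.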

A job with the scylla binary from @denesb is running: https://jenkins.scylladb.com/job/scylla-4.2/job/Reproducers/job/reproduce-issue-7117/1/execution/node/74/log/
Scylla Monitoring: http://13.48.133.203:3000/d/rFsNarNGz/scylla-per-server-metrics-nemesis-master?orgId=1&refresh=30s
I used the same nemesis chain. Waiting for the results.

@denesb I got a coredump with your binary, but during a different nemesis. Here is the info about it.

2020-08-28 16:44:46.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-af3e9ef0-3 [13.53.37.190 | 10.0.2.252] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.5e5e736aa303412796a120965a872a87.25670.1598633086000000/core.scylla.996.5e5e736aa303412796a120965a872a87.25670.1598633086000000.gz
backtrace=           PID: 25670 (scylla)
UID: 996 (scylla)
GID: 1001 (scylla)
Signal: 6 (ABRT)
Timestamp: Fri 2020-08-28 16:44:46 UTC (2min 19s ago)
Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
Executable: /opt/scylladb/libexec/scylla
Control Group: /scylla.slice/scylla-server.slice/scylla-server.service
Unit: scylla-server.service
Slice: scylla-server.slice
Boot ID: 5e5e736aa303412796a120965a872a87
Machine ID: df877a200226bc47d06f26dae0736ec9
Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-af3e9ef0-3
Coredump: /var/lib/systemd/coredump/core.scylla.996.5e5e736aa303412796a120965a872a87.25670.1598633086000000
Message: Process 25670 (scylla) of user 996 dumped core.
Stack trace of thread 25670:
#0  0x00007f296f0a69e5 raise (libc.so.6)
#1  0x00007f296f08f94d abort (libc.so.6)
#2  0x00007f296f08f769 __assert_fail_base.cold (libc.so.6)
#3  0x00007f296f09ee76 __assert_fail (libc.so.6)
#4  0x0000000001a68045 _ZN7service22abstract_read_executor17make_data_requestEN3gms12inet_addressENSt6chrono10time_pointIN7seastar12lowres_clockENS3_8durationIlSt5ratioILl1ELl1000EEEEEEb (scylla)
#5  0x0000000001a68449 _ZN7seastar17parallel_for_eachIN9__gnu_cxx17__normal_iteratorIPN3gms12inet_addressESt6vectorIS4_SaIS4_EEEEZN7service22abstract_read_executor18make_data_requestsENS_10shared_ptrINSA_20digest_read_resolverEEES9_S9_NSt6chrono10time_pointINS_12lowres_clockENSF_8durationIlSt5ratioILl1ELl1000EEEEEEbEUlS4_E_EENS_6futureIJEEET_SQ_OT0_ (scylla)
#6  0x0000000001a6aa8f _ZN7service25speculating_read_executor13make_requestsEN7seastar10shared_ptrINS_20digest_read_resolverEEENSt6chrono10time_pointINS1_12lowres_clockENS5_8durationIlSt5ratioILl1ELl1000EEEEEE (scylla)
#7  0x0000000001a7513d _ZN7service22abstract_read_executor7executeENSt6chrono10time_pointIN7seastar12lowres_clockENS1_8durationIlSt5ratioILl1ELl1000EEEEEE (scylla)
#8  0x00000000019e4b60 _ZN7service13storage_proxy14query_singularEN7seastar13lw_shared_ptrIN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positionEESaISA_EEN2db17consistency_levelENS0_25coordinator_query_optionsE (scylla)
#9  0x00000000019f17c8 _ZN7service13storage_proxy8do_queryEN7seastar13lw_shared_ptrIK6schemaEENS2_IN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positionEESaISD_EEN2db17consistency_levelENS0_25coordinator_query_optionsE (scylla)
#10 0x00000000019f32c8 _ZN7service13storage_proxy5queryEN7seastar13lw_shared_ptrIK6schemaEENS2_IN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positionEESaISD_EEN2db17consistency_levelENS0_25coordinator_query_optionsE (scylla)
#11 0x00000000019f4820 _ZZZZZZN7service13storage_proxy3casEN7seastar13lw_shared_ptrIK6schemaEENS1_10shared_ptrINS_11cas_requestEEENS2_IN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positionEESaISG_EENS0_25coordinator_query_optionsEN2db17consistency_levelESM_NSt6chrono10time_pointINS1_12lowres_clockENSN_8durationIlSt5ratioILl1ELl1000EEEEEESU_bENUlRjE_clESV_ENUlvE_clEvENUlvE_clEvENUlNS_22paxos_response_handler15ballot_and_dataEE_clES10_ENKUlvE_clEv (scylla)
#12 0x00000000019fe006 _ZN7seastar12continuationINS_8internal22promise_base_with_typeIJSt8optionalIbEEEEZZZZN7service13storage_proxy3casENS_13lw_shared_ptrIK6schemaEENS_10shared_ptrINS6_11cas_requestEEENS8_IN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positionEESaISM_EENS7_25coordinator_query_optionsEN2db17consistency_levelESS_NSt6chrono10time_pointINS_12lowres_clockENST_8durationIlSt5ratioILl1ELl1000EEEEEES10_bENUlRjE_clES11_ENUlvE_clEvENUlvE_clEvEUlNS6_22paxos_response_handler15ballot_and_dataEE_ZZNS_6futureIJS16_EE14then_impl_nrvoIS17_NS18_IJS4_EEEEET0_OT_ENKUlvE_clEvEUlRS5_RS17_ONS_12future_stateIJS16_EEEE_JS16_EE15run_and_disposeEv (scylla)
#13 0x0000000002e63b58 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
#14 0x0000000002e63ecf _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
#15 0x0000000002e9afee _ZN7seastar7reactor3runEv (scylla)
#16 0x0000000002df7d7b _ZN7seastar12app_template14run_deprecatedEiPPcOSt8functionIFvvEE (scylla)
#17 0x0000000002df842f _ZN7seastar12app_template3runEiPPcOSt8functionIFNS_6futureIJiEEEvEE (scylla)
#18 0x0000000000d9f1f1 main (scylla)
#19 0x00007f296f091042 __libc_start_main (libc.so.6)
#20 0x0000000000cbbaee _start (scylla)
Stack trace of thread 25703:
#0  0x00007f296faf29ac read (libpthread.so.0)
#1  0x00000000030f2857 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
#2  0x00000000030f2ab8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
#3  0x0000000002e2ec1e _ZN7seastar12posix_thread13start_routineEPv (scylla)
#4  0x00007f2
download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.5e5e736aa303412796a120965a872a87.25670.1598633086000000/core.scylla.996.5e5e736aa303412796a120965a872a87.25670.1598633086000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.5e5e736aa303412796a120965a872a87.25670.1598633086000000.gz

I'm going to go through the logs for more details.

Very good. Downloading the core.

(gdb) bt
#0  0x00007f296f0a69e5 in raise () at /opt/scylladb/libreloc/libc.so.6
#1  0x00007f296f08f94d in abort () at /opt/scylladb/libreloc/libc.so.6
#2  0x00007f296f08f769 in _nl_load_domain.cold () at /opt/scylladb/libreloc/libc.so.6
#3  0x00007f296f09ee76 in annobin_assert.c_end () at /opt/scylladb/libreloc/libc.so.6
#4  0x0000000001a68045 in service::abstract_read_executor::make_data_request(gms::inet_address, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool) (this=0x60000840b700, ep=..., timeout=..., want_digest=<optimized out>) at /ScyllaDB/scylla2/seastar/include/seastar/core/future.hh:771
#5  0x0000000001a68449 in service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}::operator()(gms::inet_address) const (ep=..., __closure=0x7ffd32a7bc80) at service/storage_proxy.cc:3486
#8  seastar::parallel_for_each<__gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}>(__gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, seastar::future<>, service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool)::{lambda(gms::inet_address)#1}&&) (begin=..., end=..., func=...) at /ScyllaDB/scylla2/seastar/include/seastar/core/future-util.hh:159
#9  0x0000000001a6aa8f in service::abstract_read_executor::make_data_requests(seastar::shared_ptr<service::digest_read_resolver>, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, __gnu_cxx::__normal_iterator<gms::inet_address*, std::vector<gms::inet_address, std::allocator<gms::inet_address> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, bool) (want_digest=true, timeout=..., end=..., begin=..., resolver=..., this=0x60000840b700) at /ScyllaDB/scylla2/seastar/include/seastar/core/shared_ptr.hh:512
#10 service::speculating_read_executor::make_requests(seastar::shared_ptr<service::digest_read_resolver>, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) (this=0x60000840b700, resolver=..., timeout=...) at service/storage_proxy.cc:3755
#11 0x0000000001a7513d in service::abstract_read_executor::execute(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) (this=0x60000840b700, timeout=...) at /ScyllaDB/scylla2/seastar/include/seastar/core/shared_ptr.hh:505
#12 0x00000000019e4b60 in operator() (executor_and_token_range=..., __closure=0x7ffd32a7bf20) at /ScyllaDB/scylla2/seastar/include/seastar/core/shared_ptr.hh:577
#15 seastar::map_reduce<__gnu_cxx::__normal_iterator<std::pair<seastar::shared_ptr<service::abstract_read_executor>, nonwrapping_interval<dht::token> >*, std::vector<std::pair<seastar::shared_ptr<service::abstract_read_executor>, nonwrapping_interval<dht::token> > > >, service::storage_proxy::query_singular(seastar::lw_shared_ptr<query::read_command>, dht::partition_range_vector&&, db::consistency_level, service::storage_proxy::coordinator_query_options)::<lambda(std::pair<seastar::shared_ptr<service::abstract_read_executor>, nonwrapping_interval<dht::token> >&)>, query::result_merger> (r=..., mapper=..., end=..., begin=...) at /ScyllaDB/scylla2/seastar/include/seastar/core/future-util.hh:1036
#16 service::storage_proxy::query_singular(seastar::lw_shared_ptr<query::read_command>, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >&&, db::consistency_level, service::storage_proxy::coordinator_query_options) (this=this@entry=0x600000431e00, cmd=..., partition_ranges=..., cl=<optimized out>, query_options=...) at service/storage_proxy.cc:3976
#17 0x00000000019f17c8 in service::storage_proxy::do_query(seastar::lw_shared_ptr<schema const>, seastar::lw_shared_ptr<query::read_command>, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >&&, db::consistency_level, service::storage_proxy::coordinator_query_options) (this=this@entry=0x600000431e00, s=..., cmd=..., partition_ranges=..., cl=cl@entry=db::consistency_level::QUORUM, query_options=...) at /ScyllaDB/scylla2/seastar/include/seastar/core/shared_ptr.hh:289
#18 0x00000000019f32c8 in service::storage_proxy::query(seastar::lw_shared_ptr<schema const>, seastar::lw_shared_ptr<query::read_command>, std::vector<nonwrapping_interval<dht::ring_position>, std::allocator<nonwrapping_interval<dht::ring_position> > >&&, db::consistency_level, service::storage_proxy::coordinator_query_options) (this=this@entry=0x600000431e00, s=..., cmd=..., partition_ranges=..., cl=db::consistency_level::QUORUM, query_options=...) at /ScyllaDB/scylla2/seastar/include/seastar/core/shared_ptr.hh:289
#19 0x00000000019f4820 in operator()() const (__closure=__closure@entry=0x7ffd32a7c650) at /ScyllaDB/scylla2/seastar/include/seastar/core/shared_ptr.hh:289
#20 0x00000000019fe006 in seastar::continuation<seastar::internal::promise_base_with_type<std::optional<bool> >, service::storage_proxy::cas(schema_ptr, seastar::shared_ptr<service::cas_request>, seastar::lw_shared_ptr<query::read_command>, dht::partition_range_vector&&, service::storage_proxy::coordinator_query_options, db::consistency_level, db::consistency_level, seastar::lowres_clock::time_point, seastar::lowres_clock::time_point, bool)::<lambda(unsigned int&)> mutable::<lambda()> mutable::<lambda()> mutable::<lambda(service::paxos_response_handler::ballot_and_data)>, seastar::future<T>::then_impl_nrvo<service::storage_proxy::cas(schema_ptr, seastar::shared_ptr<service::cas_request>, seastar::lw_shared_ptr<query::read_command>, dht::partition_range_vector&&, service::storage_proxy::coordinator_query_options, db::consistency_level, db::consistency_level, seastar::lowres_clock::time_point, seastar::lowres_clock::time_point, bool)::<lambda(unsigned int&)> mutable::<lambda()> mutable::<lambda()> mutable::<lambda(service::paxos_response_handler::ballot_and_data)>, seastar::future<std::optional<bool> > >::<lambda()>::<lambda(pr_type&, service::storage_proxy::cas(schema_ptr, seastar::shared_ptr<service::cas_request>, seastar::lw_shared_ptr<query::read_command>, dht::partition_range_vector&&, service::storage_proxy::coordinator_query_options, db::consistency_level, db::consistency_level, seastar::lowres_clock::time_point, seastar::lowres_clock::time_point, bool)::<lambda(unsigned int&)> mutable::<lambda()> mutable::<lambda()> mutable::<lambda(service::paxos_response_handler::ballot_and_data)>&, seastar::future_state<service::paxos_response_handler::ballot_and_data>&&)>, service::paxos_response_handler::ballot_and_data>::run_and_dispose(void) (this=0x600002d10640) at service/storage_proxy.cc:4454
#21 0x0000000002e63b58 in seastar::reactor::run_tasks(seastar::reactor::task_queue&) (this=this@entry=0x600000366000, tq=...) at /ScyllaDB/scylla2/seastar/src/core/reactor.cc:2141
#22 0x0000000002e63ecf in seastar::reactor::run_some_tasks() (this=this@entry=0x600000366000) at /ScyllaDB/scylla2/seastar/src/core/reactor.cc:2557
#23 0x0000000002e9afee in seastar::reactor::run_some_tasks() (this=0x600000366000) at /ScyllaDB/scylla2/seastar/include/seastar/core/circular_buffer_fixed_capacity.hh:209
#24 seastar::reactor::run() (this=0x600000366000) at /ScyllaDB/scylla2/seastar/src/core/reactor.cc:2712
#25 0x0000000002df7d7b in seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) (this=this@entry=0x7ffd32a7d250, ac=ac@entry=24, av=av@entry=0x7ffd32a7d5a8, func=...) at /ScyllaDB/scylla2/seastar/include/seastar/core/reactor.hh:736
#26 0x0000000002df842f in seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) (this=this@entry=0x7ffd32a7d250, ac=ac@entry=24, av=av@entry=0x7ffd32a7d5a8, func=...) at /usr/include/c++/10/bits/std_function.h:87
#27 0x0000000000d9f1f1 in main(int, char**) (ac=24, av=0x7ffd32a7d5a8) at /usr/include/c++/10/bits/std_function.h:87

In Frame 20:

(gdb) p _sg
$20 = {
  _id = 5
}

In Frame 21:

(gdb) p tq._id
$21 = 5 '\005'

(gdb) p (scheduling_group)'seastar::internal::current_scheduling_group_ptr()::sg'
$22 = {
  _id = 0
}

current_scheduling_group() doesn't agree with the currently processed task queue.

@avikivity @espindola how can this be?

void reactor::run_tasks(task_queue& tq) {
    // Make sure new tasks will inherit our scheduling group
    *internal::current_scheduling_group_ptr() = scheduling_group(tq._id);
    auto& tasks = tq._q;
    while (!tasks.empty()) {
        auto tsk = tasks.front();
        tasks.pop_front();
        STAP_PROBE(seastar, reactor_run_tasks_single_start);
        task_histogram_add_task(*tsk);
        _current_task = tsk;
        tsk->run_and_dispose();
        _current_task = nullptr;
        STAP_PROBE(seastar, reactor_run_tasks_single_end);
        ++tq._tasks_processed;
        ++_global_tasks_processed;
        // check at end of loop, to allow at least one task to run
        if (need_preempt()) {
            if (tasks.size() <= _max_task_backlog) {
                break;
            } else {
                // While need_preempt() is set, task execution is inefficient due to
                // need_preempt() checks breaking out of loops and .then() calls. See
                // #302.
                reset_preemption_monitor();
            }
        }
    }
}

What could derail the current execution group between the start of the function and running a task?

We had a timer-related bug, but I think it was backported. Let me look it up.

That patch is in 4.2 (scylladb/seastar@fc0a57b2d61f638ff6db8edb63b078a2d6bc5aa9)

I suspect

        template<typename Func>
        futurize_t<std::result_of_t<Func()>> with_locked_key(const dht::token& key, clock_type::time_point timeout, Func func) {
            return with_semaphore(get_semaphore_for_key(key), 1, timeout - clock_type::now(), std::move(func)).finally([key, this] {
                release_semaphore_for_key(key);
            });
        }

Locks mix fibers that can come from different scheduling groups; maybe there's a scheduling group (sg) leak.

The locks eventually are implemented with semaphores:

class basic_semaphore : private ExceptionFactory {
public:
    using duration = typename timer<Clock>::duration;
    using clock = typename timer<Clock>::clock;
    using time_point = typename timer<Clock>::time_point;
    using exception_factory = ExceptionFactory;
private:
    ssize_t _count;
    std::exception_ptr _ex;
    struct entry {
        promise<> pr;
        size_t nr;
        entry(promise<>&& pr_, size_t nr_) : pr(std::move(pr_)), nr(nr_) {}
    };

I wonder if the semaphore causes the leak.

e.g. semaphore::signal()'s sg leaking into a waiting operation.
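
To make the hypothesis concrete, here is a toy model of such a leak. It is not Seastar code and does not claim that basic_semaphore behaves this way. `toy_semaphore`, `active_group`, `signal_leaky` and `signal_restoring` are all names made up for this sketch.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

// Toy stand-in for the per-thread "current scheduling group" id.
inline thread_local unsigned active_group = 0;

class toy_semaphore {
    struct waiter {
        unsigned group;                // group the waiter was running in
        std::function<void()> resume;  // continuation to run once a unit is free
    };
    std::deque<waiter> _waiters;
public:
    // Records the caller's group together with its continuation.
    void wait(std::function<void()> resume) {
        _waiters.push_back({active_group, std::move(resume)});
    }
    // Leaky variant: the continuation runs inline, in the signaller's group.
    void signal_leaky() {
        if (_waiters.empty()) { return; }
        auto w = std::move(_waiters.front());
        _waiters.pop_front();
        w.resume();
    }
    // Restoring variant: switch to the waiter's recorded group for the
    // duration of the continuation, then switch back.
    void signal_restoring() {
        if (_waiters.empty()) { return; }
        auto w = std::move(_waiters.front());
        _waiters.pop_front();
        const unsigned saved = active_group;
        active_group = w.group;
        w.resume();
        active_group = saved;
    }
};
```

In this model a waiter that queued up in group 5 observes group 0 when a group-0 fiber calls `signal_leaky()`, and observes group 5 again with `signal_restoring()`.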

This would cause continuations to be scheduled with the wrong scheduling group. But what we see here is that the tasks themselves have the correct group, they are in the correct task-queue but somehow current_scheduling_group() is not correct.
So somehow internal::current_scheduling_group_ptr() was written to in the context of a single reactor::run_tasks() call.
There are only 5 places where we write to internal::current_scheduling_group_ptr():

$ ag current_scheduling_group_ptr
seastar/src/core/reactor.cc
935:            *internal::current_scheduling_group_ptr() = scheduling_group(tq->_id); # reactor::~reactor()
1219:                *internal::current_scheduling_group_ptr() = t->_sg; # reactor::complete_timers()
1226:    *internal::current_scheduling_group_ptr() = default_scheduling_group(); # reactor::complete_timers()
2133:    *internal::current_scheduling_group_ptr() = scheduling_group(tq._id); # reactor::run_tasks()
2572:    *internal::current_scheduling_group_ptr() = default_scheduling_group(); // Prevent inheritance from last group run # reactor::run_some_tasks()

I don't see how any of these can cause what I see.

@aleksbykov I uploaded another RPM with new assertions added. Please try to run it. Download it as:

$ gsutil cp gs://scratch.scylladb.com/bdenes/7117/scylla-server-4.2.rc3-0.20200901.54a99043b.x86_64.rpm .

I applied the following change to seastar:

$ git diff
diff --git a/src/core/reactor.cc b/src/core/reactor.cc
index 832a58c8..aebd2617 100644
--- a/src/core/reactor.cc
+++ b/src/core/reactor.cc
@@ -2139,6 +2139,7 @@ void reactor::run_tasks(task_queue& tq) {
         task_histogram_add_task(*tsk);
         _current_task = tsk;
         tsk->run_and_dispose();
+        assert(current_scheduling_group() == scheduling_group(tq._id));
         _current_task = nullptr;
         STAP_PROBE(seastar, reactor_run_tasks_single_end);
         ++tq._tasks_processed;

This should allow us to catch the task overriding internal::current_scheduling_group_ptr().
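
As an illustration of what this assertion checks, here is a toy model rather than Seastar code: the "current group" is set once per queue and compared against the queue id after every task. `toy_queue`, `run_queue` and `current_group` are made up for this sketch, and it reports the offending task index instead of aborting.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

// Toy stand-in for the per-thread "current scheduling group" id.
inline thread_local unsigned current_group = 0;

struct toy_queue {
    unsigned id;                               // group this queue belongs to
    std::vector<std::function<void()>> tasks;  // tasks to run, in order
};

// Runs every task of the queue. Returns the index of the first task after
// which current_group no longer matches the queue id, or nullopt if none did.
inline std::optional<std::size_t> run_queue(toy_queue& q) {
    current_group = q.id;  // new tasks should inherit the queue's group
    for (std::size_t i = 0; i < q.tasks.size(); ++i) {
        q.tasks[i]();
        if (current_group != q.id) {
            return i;  // this task (or something it called) overwrote the group
        }
    }
    return std::nullopt;
}
```

The check fires on the first task that leaves the group changed, so in the real reactor the core would be taken while the culprit is still `_current_task`, instead of at some later task that merely observes the damage.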

I tried to fish the previous tasks from tasks but the allocation slot was reused already:

(gdb) p tasks._impl.storage[(tasks._impl.begin % tasks._impl.capacity) - 1]
$2 = (seastar::task *) 0x600002d10640
(gdb) p _current_task
$3 = (seastar::task *) 0x600002d10640
(gdb) p tasks._impl.storage[(tasks._impl.begin % tasks._impl.capacity) - 2]
$4 = (seastar::task *) 0x600008170340
(gdb) scylla ptr 0x600008170340
thread 1, small (size <= 64), live (0x600008170340 +0)
(gdb) x/1a 0x600008170340
0x600008170340: 0x2 # not a task vptr

Of course nothing guarantees that the previous task did the override, it could have been a much earlier task.
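
The index arithmetic used to look back into the task queue can be written down as a small helper. This is a sketch under two assumptions the thread does not state: that the capacity is a power of two, and that `begin` is a counter that only ever grows. `previous_slot` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>

// Index of the storage slot that was popped `back` pops ago in a ring buffer
// whose capacity is a power of two. Unsigned wrap-around keeps begin - back
// well defined even when begin < back, because 2^N is a multiple of the capacity.
inline std::size_t previous_slot(std::size_t begin, std::size_t capacity, std::size_t back) {
    assert(capacity != 0 && (capacity & (capacity - 1)) == 0);
    return (begin - back) & (capacity - 1);
}
```

Unlike `(begin % capacity) - 1`, this form does not go negative when `begin` sits exactly on a multiple of the capacity.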

@denesb I started a new job with your new scylla RPM. I will update once I get a result.

@denesb sorry for the delay. I was a bit confused by the version returned by the scylla --version command, but after checking the log I found that your newer RPM was installed.
I got the following coredump when scylla on node4 was stopped and started.
Coredump on node4:

2020-09-01 09:53:26.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-368dbcfc-4 [13.53.150.235 | 10.0.0.147] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.1269.1598954006000000/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.1269.1598954006000000.gz
backtrace=           PID: 1269 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Tue 2020-09-01 09:53:26 UTC (1min 22s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: 61d4f3f23bcd4b65876bbbb47d4559f6
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-368dbcfc-4
      Coredump: /var/lib/systemd/coredump/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.1269.1598954006000000
       Message: Process 1269 (scylla) of user 996 dumped core.

                Stack trace of thread 1286:
                #0  0x00007f35ac3dc9e5 raise (libc.so.6)
                #1  0x00007f35ac3c594d abort (libc.so.6)
                #2  0x00007f35ac3c5769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f35ac3d4e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c43 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f5f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1be _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabbb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007f35ace1e432 start_thread (libpthread.so.0)
                #10 0x00007f35ac4a1913 __clone (libc.so.6)

                Stack trace of thread 1294:
                #0  0x00007f35ace289ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f35ace1e432 start_thread (libpthread.so.0)
                #5  0x00007f35ac4a1913 __clone (libc.so.6)

                Stack trace of thread 1290:
                #0  0x00007f35ace289ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f35ace1e432 start_thread (libpthread.so.0)
                #5  0x00007f35ac4a1913 __clone (libc.so.6)

                Stack trace of thread 1282:
                #0  0x00007f35ace289ac read (libpthread.so.0)
                #1  0x00000000030f6820 _ZN7seastar25task_quota_aio_completion13complete_withEl (scylla)
                #2  0x00000000030f38c3 _ZN7seastar19reactor_backend_aio24reset_preemption_monitorEv (scylla)
                #3  0x0000000002e63dfe _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #4  0x0000000002e9b1be _ZN7seastar7reactor3runEv (scylla)
                #5  0x0000000002eaabbb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #6  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #7  0x00007f35ace1e432 start_thread (libpthread.so.0)
                #8  0x00007f35ac4a1913 __clone (libc.so.6)

                Stack trace of thread 1289:
                #0  0x00007f35ace289ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f35ace1e432 start_thread (libpthread.so.0)
                #5  0x00007f35ac4a1913 __clone (libc.so.6)

                Stack trace of thread 1281:
                #0  0x000000000108a2d4 _ZN16mutation_querier7consumeEO14clustering_row13row_tombstone (scylla)
                #1  0x00000000010cee8a _ZZN20flat_mutation_reader4impl16consume_pausableISt17reference_wrapperINS0_16consumer_adapterI35stable_flattened_mutations_consumerI17compact_for_queryIL19emit_only_live_rows1EN5query27clustering_position_trackerI20query_result_builderEEEEEEEEEN7seastar6futureIJEEET_NSt6chrono10time_pointINSF_12lowres_clockENSJ_8durationIlSt5ratioILl1ELl1000EEEEEEENUlvE0_clEv (scylla)
                #2  0x00000000010d2d07 _ZZN5query12consume_pageIL19emit_only_live_rows1E20query_result_builderEEDaR20flat_mutation_readerN7seastar13lw_shared_ptrI22compact_mutation_stateIXT_EL20compact_for_sstables0EEEERKNS_15partition_sliceEOT0_jjNSt6chrono10time_pointI8gc_clockNSG_8durationIlSt5ratioILl1ELl1EEEEEENSH_INS5_12lowres_clockENSJ_IlSK_ILl1ELl1000EEEEEEmENUlP17mutation_fragmentE_clEST_ (scylla)
                #3  0x00000000010d5503 _ZN5query12consume_pageIL19emit_only_live_rows1E20query_result_builderEEDaR20flat_mutation_readerN7seastar13lw_shared_ptrI22compact_mutat

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.1269.1598954006000000/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.1269.1598954006000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.1269.1598954006000000.gz

Second coredump on node4:

2020-09-01 09:58:56.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-368dbcfc-4 [13.53.150.235 | 10.0.0.147] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.3693.1598954336000000/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.3693.1598954336000000.gz
backtrace=           PID: 3693 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Tue 2020-09-01 09:58:56 UTC (2min 13s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: 61d4f3f23bcd4b65876bbbb47d4559f6
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-368dbcfc-4
      Coredump: /var/lib/systemd/coredump/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.3693.1598954336000000
       Message: Process 3693 (scylla) of user 996 dumped core.

                Stack trace of thread 3705:
                #0  0x00007fae4184d9e5 raise (libc.so.6)
                #1  0x00007fae4183694d abort (libc.so.6)
                #2  0x00007fae41836769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007fae41845e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c43 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f5f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1be _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabbb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007fae4228f432 start_thread (libpthread.so.0)
                #10 0x00007fae41912913 __clone (libc.so.6)

                Stack trace of thread 3718:
                #0  0x00007fae422999ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fae4228f432 start_thread (libpthread.so.0)
                #5  0x00007fae41912913 __clone (libc.so.6)

                Stack trace of thread 3719:
                #0  0x00007fae422999ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fae4228f432 start_thread (libpthread.so.0)
                #5  0x00007fae41912913 __clone (libc.so.6)

                Stack trace of thread 3714:
                #0  0x00007fae422999ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fae4228f432 start_thread (libpthread.so.0)
                #5  0x00007fae41912913 __clone (libc.so.6)

                Stack trace of thread 3707:
                #0  0x000000000167f2f0 _ZNK4cql310statements22modification_statement10do_executeERN7service13storage_proxyERNS2_11query_stateERKNS_13query_optionsE (scylla)
                #1  0x000000000168138c _ZN7seastar20noncopyable_functionIFNS_6futureIJNS_10shared_ptrIN13cql_transport8messages14result_messageEEEEEEPKN4cql310statements22modification_statementERN7service13storage_proxyERNSD_11query_stateERKNS8_13query_optionsEEE17direct_vtable_forISt7_Mem_fnIMSA_KFS7_SF_SH_SK_EEE4callEPKSM_SC_SF_SH_SK_ (scylla)
                #2  0x00000000016813ae _ZN7seastar20noncopyable_functionIFNS_6futureIJNS_10shared_ptrIN13cql_transport8messages14result_messageEEEEEEPKN4cql310statements22modification_statementERN7service13storage_proxyERNSD_11query_stateERKNS8_13query_optionsEEE17direct_vtable_forIZNS_35inheriting_concrete_execution_stageIS7_JSC_SF_SH_SK_EE20make_stage_for_groupENS_16scheduling_groupEEUlSC_SF_SH_SK_E_E4callEPKSM_SC_SF_SH_SK_ (scylla)
                #3  0x0000000001683434 _ZN7seastar24concrete_execution_stageINS_6futureIJNS_10shared_ptrIN13cql_transport8messages14result_messageEEEEEEJPKN4cql310statements22modification_statementERN7service13storage_proxyERNSD_11query_stateERKNS8_13query_optionsEEE8do_flushEv (scylla)
                #4  0x0000000002dfa547 _ZN7seastar11lambda_taskIZNS_15execution_stage5flushEvEUlvE_E15run_and_disposeEv (scylla)
                #5  0x0000000002e63bb8 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #6  0x0000000002e63f5f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #7  0x0000000002e9b1be _ZN7seastar7reactor3runEv (scylla)
                #8  0x0000000002eaabbb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #9  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #10 0x00007fae4228f

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.3693.1598954336000000/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.3693.1598954336000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.3693.1598954336000000.gz

and coredumps on node1 and node2:

2020-09-01 10:02:06.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-368dbcfc-1 [13.48.29.203 | 10.0.3.7] (seed: True)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.2d0fcbcba5c1457385eb9fce1b6c5f1a.8174.1598954526000000/core.scylla.996.2d0fcbcba5c1457385eb9fce1b6c5f1a.8174.1598954526000000.gz
backtrace=           PID: 8174 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Tue 2020-09-01 10:02:06 UTC (1min 26s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: 2d0fcbcba5c1457385eb9fce1b6c5f1a
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-368dbcfc-1
      Coredump: /var/lib/systemd/coredump/core.scylla.996.2d0fcbcba5c1457385eb9fce1b6c5f1a.8174.1598954526000000
       Message: Process 8174 (scylla) of user 996 dumped core.

                Stack trace of thread 8176:
                #0  0x00007f6b00db09e5 raise (libc.so.6)
                #1  0x00007f6b00d9994d abort (libc.so.6)
                #2  0x00007f6b00d99769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f6b00da8e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c43 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f5f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1be _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabbb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007f6b017f2432 start_thread (libpthread.so.0)
                #10 0x00007f6b00e75913 __clone (libc.so.6)

                Stack trace of thread 8185:
                #0  0x00007f6b017fc9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6b017f2432 start_thread (libpthread.so.0)
                #5  0x00007f6b00e75913 __clone (libc.so.6)

                Stack trace of thread 8189:
                #0  0x00007f6b017fc9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6b017f2432 start_thread (libpthread.so.0)
                #5  0x00007f6b00e75913 __clone (libc.so.6)

                Stack trace of thread 8187:
                #0  0x00007f6b017fc9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6b017f2432 start_thread (libpthread.so.0)
                #5  0x00007f6b00e75913 __clone (libc.so.6)

                Stack trace of thread 8190:
                #0  0x00007f6b017fc9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6b017f2432 start_thread (libpthread.so.0)
                #5  0x00007f6b00e75913 __clone (libc.so.6)

                Stack trace of thread 8183:
                #0  0x00007f6b017fc9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6b017f2432 start_thread (libpthread.so.0)
                #5  0x00007f6b00e75913 __clone (libc.so.6)

                Stack trace of thread 8186:
                #0  0x00007f6b017fc9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6b017f2432 start_thread (libpthread.so.0)
                #5  0x00007f6b00e75913 __clone (libc.so.6)

                Stack trace of thread 8184:
                #0  0x00007f6b017fc9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.2d0fcbcba5c1457385eb9fce1b6c5f1a.8174.1598954526000000/core.scylla.996.2d0fcbcba5c1457385eb9fce1b6c5f1a.8174.1598954526000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.2d0fcbcba5c1457385eb9fce1b6c5f1a.8174.1598954526000000.gz
2020-09-01 10:02:04.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-368dbcfc-2 [13.53.198.191 | 10.0.1.192] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.6a68503cdf304bfeb362b1e70360f5de.8018.1598954524000000/core.scylla.996.6a68503cdf304bfeb362b1e70360f5de.8018.1598954524000000.gz
backtrace=           PID: 8018 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Tue 2020-09-01 10:02:04 UTC (2min 27s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: 6a68503cdf304bfeb362b1e70360f5de
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-368dbcfc-2
      Coredump: /var/lib/systemd/coredump/core.scylla.996.6a68503cdf304bfeb362b1e70360f5de.8018.1598954524000000
       Message: Process 8018 (scylla) of user 996 dumped core.

                Stack trace of thread 8023:
                #0  0x00007f3ce3f209e5 raise (libc.so.6)
                #1  0x00007f3ce3f0994d abort (libc.so.6)
                #2  0x00007f3ce3f09769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f3ce3f18e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c43 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f5f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1be _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabbb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007f3ce4962432 start_thread (libpthread.so.0)
                #10 0x00007f3ce3fe5913 __clone (libc.so.6)

                Stack trace of thread 8029:
                #0  0x00007f3ce496c9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f3ce4962432 start_thread (libpthread.so.0)
                #5  0x00007f3ce3fe5913 __clone (libc.so.6)

                Stack trace of thread 8028:
                #0  0x00007f3ce496c9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f3ce4962432 start_thread (libpthread.so.0)
                #5  0x00007f3ce3fe5913 __clone (libc.so.6)

                Stack trace of thread 8027:
                #0  0x00007f3ce496c9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f3ce4962432 start_thread (libpthread.so.0)
                #5  0x00007f3ce3fe5913 __clone (libc.so.6)

                Stack trace of thread 8030:
                #0  0x00007f3ce496c9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f3ce4962432 start_thread (libpthread.so.0)
                #5  0x00007f3ce3fe5913 __clone (libc.so.6)

                Stack trace of thread 8033:
                #0  0x00007f3ce496c9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f3ce4962432 start_thread (libpthread.so.0)
                #5  0x00007f3ce3fe5913 __clone (libc.so.6)

                Stack trace of thread 8032:
                #0  0x00007f3ce496c9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f3ce4962432 start_thread (libpthread.so.0)
                #5  0x00007f3ce3fe5913 __clone (libc.so.6)

                Stack trace of thread 8031:
                #0  0x00007f3ce496c9ac read (libpthread.so.0)
                #1  0x00000000030f2a07 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2c68 _ZNSt17_Function_handlerIFvvEZN7seastar11thread

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.6a68503cdf304bfeb362b1e70360f5de.8018.1598954524000000/core.scylla.996.6a68503cdf304bfeb362b1e70360f5de.8018.1598954524000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.6a68503cdf304bfeb362b1e70360f5de.8018.1598954524000000.gz

Same situation in the new core (core.scylla.996.61d4f3f23bcd4b65876bbbb47d4559f6.1269.1598954006000000):

(gdb) p (scheduling_group)'seastar::internal::current_scheduling_group_ptr()::sg '
$2 = {
  _id = 0
}
(gdb) p tq._id
$3 = 5 '\005'
(gdb) p _current_task->_id
There is no member or method named _id.
(gdb) p _current_task->_sg
$4 = {
  _id = 5
}

Hopefully this time the assert fires soon enough after the "deed" that I can catch the perpetrator red-handed.

Interesting:

(gdb) p tasks
$35 = (seastar::circular_buffer<seastar::task*, std::allocator<seastar::task*> > &) @0x60700007e9b0: {
  _impl = {
    <std::allocator<seastar::task*>> = {
      <__gnu_cxx::new_allocator<seastar::task*>> = {<No data fields>}, <No data fields>}, 
    members of seastar::circular_buffer<seastar::task*, std::allocator<seastar::task*> >::impl:
    storage = 0x607009010000,
    begin = 8213,
    end = 9416,
    capacity = 2048
  }
}
(gdb) p _current_task
$39 = (seastar::task *) 0x607008a5fa00
(gdb) find /g 0x607009010000, 0x607009014000, 0x607008a5fa00
Pattern not found.

The _current_task doesn't seem to be from the current task queue.
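
The search range passed to `find` follows from the buffer geometry printed by `p tasks`: 2048 slots of 8 bytes each span 0x4000 bytes starting at `storage`. Below is a small sketch of that arithmetic. It assumes 8-byte pointers (x86_64), and it assumes that `begin` and `end` are free-running indices that get masked with `capacity - 1` on access. The constant and helper names are made up for illustration.

```cpp
#include <cstdint>

// Values copied from the gdb output of `p tasks` above.
constexpr std::uint64_t buf_storage  = 0x607009010000;
constexpr std::uint64_t buf_begin    = 8213;
constexpr std::uint64_t buf_end      = 9416;
constexpr std::uint64_t buf_capacity = 2048;

// Assumption: each slot holds one seastar::task* and pointers are 8 bytes.
constexpr std::uint64_t slot_size = 8;

// One-past-the-end address of the slot array; this is the upper bound given to `find`.
constexpr std::uint64_t buf_storage_end = buf_storage + buf_capacity * slot_size;

// Number of queued task pointers, assuming free-running begin/end indices.
constexpr std::uint64_t pending_tasks = buf_end - buf_begin;

// Address of the slot a free-running index maps to (hypothetical helper).
constexpr std::uint64_t slot_address(std::uint64_t index) {
    return buf_storage + (index & (buf_capacity - 1)) * slot_size;
}

static_assert(buf_storage_end == 0x607009014000, "matches the range used with find");
static_assert(pending_tasks == 1203, "end - begin");
```

Under these assumptions the queue holds 1203 pending task pointers, and none of the 2048 slots contains the value of `_current_task`.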

Doesn't seem to have a vtable either:

(gdb) x/1a _current_task
0x607008a5fa00: 0x607008dbe460

Oh, it's free:

(gdb) scylla ptr _current_task
thread 1, small (size <= 80), free (0x607008a5fa00 +0)

run_and_dispose() does delete this.

We need to implement freed object quarantining -- serve new allocations from the tail of the freelist, so freshly freed objects can be inspected. This would also reduce the chances of memory corruption (given a use-after-free -- which we of course shouldn't have), as new objects would overwrite the oldest freed objects.
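
To make the quarantining idea concrete, here is a minimal sketch of a fixed-size pool whose free list is a FIFO queue: freed slots go to the tail and new allocations come from the head, so the most recently freed object is the last one to be reused. This is not Seastar's allocator; `quarantine_pool` and its members are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical fixed-size pool, for illustration only.
class quarantine_pool {
    // Backing storage; each inner vector is one slot.
    std::vector<std::vector<unsigned char>> _slots;
    // FIFO free list: front = oldest freed slot, back = most recently freed slot.
    std::deque<void*> _free;
public:
    quarantine_pool(std::size_t slot_size, std::size_t count) {
        _slots.reserve(count);
        for (std::size_t i = 0; i < count; ++i) {
            _slots.emplace_back(slot_size);
            _free.push_back(_slots.back().data());
        }
    }

    // Serve the allocation from the OLDEST free slot, so recently freed
    // objects stay untouched (and inspectable in a core dump) for longer.
    void* allocate() {
        if (_free.empty()) {
            return nullptr;
        }
        void* p = _free.front();
        _free.pop_front();
        return p;
    }

    // Freed slots are appended at the tail, i.e. they are reused last.
    void deallocate(void* p) {
        assert(p != nullptr);
        _free.push_back(p);
    }
};
```

A LIFO free list would instead push and pop at the same end, which hands out the slot that was freed most recently.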

LIFO is faster, as it keeps caches hot.

Yes, you are right.

I applied the following change to seastar:

$ git diff
diff --git a/src/core/reactor.cc b/src/core/reactor.cc
index 832a58c8..aebd2617 100644
--- a/src/core/reactor.cc
+++ b/src/core/reactor.cc
@@ -2139,6 +2139,7 @@ void reactor::run_tasks(task_queue& tq) {
         task_histogram_add_task(*tsk);
         _current_task = tsk;
         tsk->run_and_dispose();
+        assert(current_scheduling_group() == scheduling_group(tq._id));
         _current_task = nullptr;
         STAP_PROBE(seastar, reactor_run_tasks_single_end);
         ++tq._tasks_processed;

This should allow us to catch the task overriding internal::current_scheduling_group_ptr().
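
The idea behind the assert can be shown outside the reactor with a toy model: set the current group when a queue starts running, then check after every task that the group still matches the queue. The sketch below is not Seastar code; `toy_sched_group`, `toy_current_group()`, `toy_task_queue`, `first_offender()` and `run_or_abort()` are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for seastar::scheduling_group.
struct toy_sched_group {
    unsigned id = 0;
    bool operator==(const toy_sched_group& o) const { return id == o.id; }
};

// Hypothetical stand-in for the per-thread "current scheduling group" variable.
inline toy_sched_group& toy_current_group() {
    static thread_local toy_sched_group g;
    return g;
}

// Hypothetical stand-in for a reactor task queue.
struct toy_task_queue {
    unsigned id;
    std::vector<std::function<void()>> tasks;
};

// Runs the queue and returns the index of the first task that left the
// current group different from the queue's group, or -1 if none did.
inline int first_offender(toy_task_queue& tq) {
    toy_current_group() = toy_sched_group{tq.id};
    for (std::size_t i = 0; i < tq.tasks.size(); ++i) {
        tq.tasks[i]();
        if (!(toy_current_group() == toy_sched_group{tq.id})) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Abort-on-violation form, analogous to the assert added in the patch.
inline void run_or_abort(toy_task_queue& tq) {
    const int offender = first_offender(tq);
    assert(offender == -1);
    (void)offender;
}
```

A task that overwrites the current group and does not restore it is reported right after it runs, rather than by some later, unrelated task.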

For debugging, you can capture the task's vptr+sg before execution, so you can inspect it after the failure.

Yes, that's what I was planning; I was just considering my other options to avoid another round-trip if possible. But maybe that's a waste of time: this seems to reproduce quite reliably, so another round shouldn't be too bad.

Trying this change now:

diff --git a/include/seastar/core/reactor.hh b/include/seastar/core/reactor.hh
index c3757070..414f7904 100644
--- a/include/seastar/core/reactor.hh
+++ b/include/seastar/core/reactor.hh
@@ -321,6 +321,8 @@ class reactor {
     task_queue* _at_destroy_tasks;
     sched_clock::duration _task_quota;
     task* _current_task = nullptr;
+    scheduling_group _current_task_sg;
+    uintptr_t _current_task_vptr = 0;
     /// Handler that will be called when there is no task to execute on cpu.
     /// It represents a low priority work.
     /// 
diff --git a/src/core/reactor.cc b/src/core/reactor.cc
index aebd2617..f49b13bf 100644
--- a/src/core/reactor.cc
+++ b/src/core/reactor.cc
@@ -2138,6 +2138,8 @@ void reactor::run_tasks(task_queue& tq) {
         STAP_PROBE(seastar, reactor_run_tasks_single_start);
         task_histogram_add_task(*tsk);
         _current_task = tsk;
+        _current_task_sg = tsk->group();
+        std::copy_n(reinterpret_cast<char*>(tsk), sizeof(uintptr_t), reinterpret_cast<char*>(&_current_task_vptr));
         tsk->run_and_dispose();
         assert(current_scheduling_group() == scheduling_group(tq._id));
         _current_task = nullptr;

RPM is coming soon.

You can also use &typeid(*tsk) to capture the task's dynamic type instead of the raw vptr. Of course, for a hack this is fine.

In fact you may want to memcpy 200 bytes from tsk, not just 8, in case there's something interesting in there.

Good point, let's make it 256 so it's a round number.
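
The reason for taking the snapshot before the call is that the task deletes itself inside run_and_dispose(), so nothing can be read from it afterwards. A standalone sketch of that pattern follows. It copies only the group id and the first pointer-sized word, since a standalone program must not read past the end of the object. It also assumes the vptr is stored in that first word, which holds on common ABIs such as the Itanium C++ ABI but is an implementation detail. `toy_task`, `self_deleting_task`, `task_snapshot` and `run_with_snapshot()` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for seastar::task.
struct toy_task {
    std::uint8_t sg_id;
    explicit toy_task(std::uint8_t id) : sg_id(id) {}
    virtual ~toy_task() = default;
    virtual void run_and_dispose() noexcept = 0;
};

// A task that frees itself when run, like the real tasks do.
struct self_deleting_task final : toy_task {
    using toy_task::toy_task;
    void run_and_dispose() noexcept override { delete this; }
};

// What we keep around for post-mortem inspection.
struct task_snapshot {
    std::uint8_t sg_id = 0;
    std::uintptr_t vptr = 0;  // assumption: first word of a polymorphic object
};

inline task_snapshot run_with_snapshot(toy_task* t) {
    assert(t != nullptr);
    task_snapshot s;
    s.sg_id = t->sg_id;
    // Copy the first pointer-sized word of the object while it is still alive.
    std::memcpy(&s.vptr, static_cast<const void*>(t), sizeof(s.vptr));
    t->run_and_dispose();  // t is dangling from here on; only the snapshot is valid
    return s;
}
```

In a debugger, the saved vptr can then be resolved to a symbol (for example with `info symbol` in gdb) to identify the concrete task type.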

New RPM uploaded, download as:

$ gsutil cp gs://scratch.scylladb.com/bdenes/7117/scylla-server-4.2.rc3-0.20200901.eb863d01a.x86_64.rpm .

Patch:

diff --git a/include/seastar/core/reactor.hh b/include/seastar/core/reactor.hh
index c3757070..fddbad8a 100644
--- a/include/seastar/core/reactor.hh
+++ b/include/seastar/core/reactor.hh
@@ -321,6 +321,8 @@ class reactor {
     task_queue* _at_destroy_tasks;
     sched_clock::duration _task_quota;
     task* _current_task = nullptr;
+    scheduling_group _current_task_sg;
+    std::array<char, 256> _current_task_content;
     /// Handler that will be called when there is no task to execute on cpu.
     /// It represents a low priority work.
     /// 
diff --git a/src/core/reactor.cc b/src/core/reactor.cc
index aebd2617..d4f9bffa 100644
--- a/src/core/reactor.cc
+++ b/src/core/reactor.cc
@@ -2138,6 +2138,8 @@ void reactor::run_tasks(task_queue& tq) {
         STAP_PROBE(seastar, reactor_run_tasks_single_start);
         task_histogram_add_task(*tsk);
         _current_task = tsk;
+        _current_task_sg = tsk->group();
+        std::copy_n(reinterpret_cast<char*>(tsk), _current_task_content.size(), _current_task_content.data());
         tsk->run_and_dispose();
         assert(current_scheduling_group() == scheduling_group(tq._id));
         _current_task = nullptr;

@aleksbykov please run the reproducer using this RPM.

@denesb maybe it's tq->_sg that got corrupted (low chance, but worth checking)

In the original core tq->_sg agreed with task::_sg. I think the chance of both getting corrupted at the same time is low (but not 0). scylla task-queues gives sane results for all cores I checked -- no two task queues have the same sched group.

The job with the new scylla RPM is running. Will update with results.

@denesb New batch of coredumps:

2020-09-02 08:43:37.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 [13.53.38.115 | 10.0.3.79] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.1288.1599036217000000/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.1288.1599036217000000.gz
backtrace=           PID: 1288 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Wed 2020-09-02 08:43:37 UTC (2min 9s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: d0c9a6414e9543b8b1a49a9d47349546
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4
      Coredump: /var/lib/systemd/coredump/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.1288.1599036217000000
       Message: Process 1288 (scylla) of user 996 dumped core.

                Stack trace of thread 1303:
                #0  0x00007f2b120109e5 raise (libc.so.6)
                #1  0x00007f2b11ff994d abort (libc.so.6)
                #2  0x00007f2b11ff9769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f2b12008e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007f2b12a52432 start_thread (libpthread.so.0)
                #10 0x00007f2b120d5913 __clone (libc.so.6)

                Stack trace of thread 1309:
                #0  0x00007f2b12a5c9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f2b12a52432 start_thread (libpthread.so.0)
                #5  0x00007f2b120d5913 __clone (libc.so.6)

                Stack trace of thread 1314:
                #0  0x00007f2b12a5c9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f2b12a52432 start_thread (libpthread.so.0)
                #5  0x00007f2b120d5913 __clone (libc.so.6)

                Stack trace of thread 1313:
                #0  0x00007f2b12a5c9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f2b12a52432 start_thread (libpthread.so.0)
                #5  0x00007f2b120d5913 __clone (libc.so.6)

                Stack trace of thread 1310:
                #0  0x00007f2b12a5c9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f2b12a52432 start_thread (libpthread.so.0)
                #5  0x00007f2b120d5913 __clone (libc.so.6)

                Stack trace of thread 1316:
                #0  0x00007f2b12a5c9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f2b12a52432 start_thread (libpthread.so.0)
                #5  0x00007f2b120d5913 __clone (libc.so.6)

                Stack trace of thread 1305:
                #0  0x00007f2b12a5d77d sendmsg (libpthread.so.0)
                #1  0x0000000002e754ff _ZN7seastar12continuationINS_8internal22promise_base_with_typeIJmEEEZNS_7reactor13do_write_someERNS_17pollable_fd_stateERNS_3net6packetEEUlvE_ZZNS_6futureIJEE14then_impl_nrvoISA_NSB_IJmEEEEET0_OT_ENKUlvE_clEvEUlRS3_RSA_ONS_12future_stateIJEEEE_JEE15run_and_disposeEv (scylla)
                #2  0x0000000002e63bd6 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #3  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #4  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #5  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #6  0x0000000002e2ec7e _

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.1288.1599036217000000/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.1288.1599036217000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.1288.1599036217000000.gz
2020-09-02 08:46:45.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-1 [13.53.119.80 | 10.0.3.175] (seed: True)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.c607b82f1e554fc3ae6b99bbe1535351.8155.1599036405000000/core.scylla.996.c607b82f1e554fc3ae6b99bbe1535351.8155.1599036405000000.gz
backtrace=           PID: 8155 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Wed 2020-09-02 08:46:45 UTC (1min 27s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: c607b82f1e554fc3ae6b99bbe1535351
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-1
      Coredump: /var/lib/systemd/coredump/core.scylla.996.c607b82f1e554fc3ae6b99bbe1535351.8155.1599036405000000
       Message: Process 8155 (scylla) of user 996 dumped core.

                Stack trace of thread 8157:
                #0  0x00007f6dac1019e5 raise (libc.so.6)
                #1  0x00007f6dac0ea94d abort (libc.so.6)
                #2  0x00007f6dac0ea769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f6dac0f9e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007f6dacb43432 start_thread (libpthread.so.0)
                #10 0x00007f6dac1c6913 __clone (libc.so.6)

                Stack trace of thread 8179:
                #0  0x00007f6dacb4d9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6dacb43432 start_thread (libpthread.so.0)
                #5  0x00007f6dac1c6913 __clone (libc.so.6)

                Stack trace of thread 8178:
                #0  0x00007f6dacb4d9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6dacb43432 start_thread (libpthread.so.0)
                #5  0x00007f6dac1c6913 __clone (libc.so.6)

                Stack trace of thread 8176:
                #0  0x00007f6dacb4d9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6dacb43432 start_thread (libpthread.so.0)
                #5  0x00007f6dac1c6913 __clone (libc.so.6)

                Stack trace of thread 8183:
                #0  0x00007f6dacb4d9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6dacb43432 start_thread (libpthread.so.0)
                #5  0x00007f6dac1c6913 __clone (libc.so.6)

                Stack trace of thread 8181:
                #0  0x00007f6dacb4d9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6dacb43432 start_thread (libpthread.so.0)
                #5  0x00007f6dac1c6913 __clone (libc.so.6)

                Stack trace of thread 8180:
                #0  0x00007f6dacb4d9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f6dacb43432 start_thread (libpthread.so.0)
                #5  0x00007f6dac1c6913 __clone (libc.so.6)

                Stack trace of thread 8177:
                #0  0x00007f6dacb4d9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.c607b82f1e554fc3ae6b99bbe1535351.8155.1599036405000000/core.scylla.996.c607b82f1e554fc3ae6b99bbe1535351.8155.1599036405000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.c607b82f1e554fc3ae6b99bbe1535351.8155.1599036405000000.gz
2020-09-02 08:46:43.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-2 [13.48.55.27 | 10.0.2.60] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.8003.1599036403000000/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.8003.1599036403000000.gz
backtrace=           PID: 8003 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Wed 2020-09-02 08:46:43 UTC (2min 16s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /scylla.slice/scylla-server.slice/scylla-server.service
          Unit: scylla-server.service
         Slice: scylla-server.slice
       Boot ID: 1ceb4383a0b4486792e06cce7f58e440
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-2
      Coredump: /var/lib/systemd/coredump/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.8003.1599036403000000
       Message: Process 8003 (scylla) of user 996 dumped core.

                Stack trace of thread 8003:
                #0  0x00007f827c8d99e5 raise (libc.so.6)
                #1  0x00007f827c8c294d abort (libc.so.6)
                #2  0x00007f827c8c2769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f827c8d1e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002df7ddb _ZN7seastar12app_template14run_deprecatedEiPPcOSt8functionIFvvEE (scylla)
                #8  0x0000000002df848f _ZN7seastar12app_template3runEiPPcOSt8functionIFNS_6futureIJiEEEvEE (scylla)
                #9  0x0000000000d9f251 main (scylla)
                #10 0x00007f827c8c4042 __libc_start_main (libc.so.6)
                #11 0x0000000000cbbb4e _start (scylla)

                Stack trace of thread 8027:
                #0  0x00007f827d3259ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f827d31b432 start_thread (libpthread.so.0)
                #5  0x00007f827c99e913 __clone (libc.so.6)

                Stack trace of thread 8029:
                #0  0x00007f827d3259ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f827d31b432 start_thread (libpthread.so.0)
                #5  0x00007f827c99e913 __clone (libc.so.6)

                Stack trace of thread 8005:
                #0  0x00007f827c92404b __lll_lock_wait_private (libc.so.6)
                #1  0x00007f827c8c28de abort (libc.so.6)
                #2  0x00007f827c8c2769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f827c8d1e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007f827d31b432 start_thread (libpthread.so.0)
                #10 0x00007f827c99e913 __clone (libc.so.6)

                Stack trace of thread 8007:
                #0  0x00007f827c99937d syscall (libc.so.6)
                #1  0x00000000030f9739 _ZN7seastar8internal9io_submitEmlPPNS0_9linux_abi4iocbE (scylla)
                #2  0x00000000030f3996 _ZN7seastar19reactor_backend_aio24reset_preemption_monitorEv (scylla)
                #3  0x0000000002e63c51 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #4  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #5  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #6  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #7  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #8  0x00007f827d31b432 start_thread (libpthread.so.0)
                #9  0x00007f827c99e913 __clone (libc.so.6)

                Stack trace of thread 8028:
                #0  0x00007f827d3259ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f827d31b432 start_thread (libpthread.so.0)
                #5  0x00007f827c99e913 __clone (libc.so.6)

                Stack trace of thread 8009:
                #0  0x00007f827ca77592 uw_update_context_1 (libgcc_s.so.1)
                #1  0x00007f827ca780d3 _Unwind_RaiseException (libgcc_s.so.1)
                #2  0x00007f827ca786f9 _Unwind_Resume_or_Rethrow (libgcc_s.so.1)
                #

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.8003.1599036403000000/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.8003.1599036403000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.8003.1599036403000000.gz
2020-09-02 08:46:44.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-3 [13.48.106.137 | 10.0.1.45] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.cf887550cc7841128fc6e56345b99f61.7958.1599036404000000/core.scylla.996.cf887550cc7841128fc6e56345b99f61.7958.1599036404000000.gz
backtrace=           PID: 7958 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Wed 2020-09-02 08:46:44 UTC (2min 16s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: cf887550cc7841128fc6e56345b99f61
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-3
      Coredump: /var/lib/systemd/coredump/core.scylla.996.cf887550cc7841128fc6e56345b99f61.7958.1599036404000000
       Message: Process 7958 (scylla) of user 996 dumped core.

                Stack trace of thread 7960:
                #0  0x00007fa1044e89e5 raise (libc.so.6)
                #1  0x00007fa1044d194d abort (libc.so.6)
                #2  0x00007fa1044d1769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007fa1044e0e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007fa104f2a432 start_thread (libpthread.so.0)
                #10 0x00007fa1045ad913 __clone (libc.so.6)

                Stack trace of thread 7986:
                #0  0x00007fa104f349ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fa104f2a432 start_thread (libpthread.so.0)
                #5  0x00007fa1045ad913 __clone (libc.so.6)

                Stack trace of thread 7985:
                #0  0x00007fa104f349ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fa104f2a432 start_thread (libpthread.so.0)
                #5  0x00007fa1045ad913 __clone (libc.so.6)

                Stack trace of thread 7982:
                #0  0x00007fa104f349ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fa104f2a432 start_thread (libpthread.so.0)
                #5  0x00007fa1045ad913 __clone (libc.so.6)

                Stack trace of thread 7967:
                #0  0x00007fa104f349ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fa104f2a432 start_thread (libpthread.so.0)
                #5  0x00007fa1045ad913 __clone (libc.so.6)

                Stack trace of thread 7981:
                #0  0x00007fa104f349ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fa104f2a432 start_thread (libpthread.so.0)
                #5  0x00007fa1045ad913 __clone (libc.so.6)

                Stack trace of thread 7983:
                #0  0x00007fa104f349ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fa104f2a432 start_thread (libpthread.so.0)
                #5  0x00007fa1045ad913 __clone (libc.so.6)

                Stack trace of thread 7966:
                #0  0x00007fa10453304b __lll_lock_wait_private (libc.so.6)
                #1  0x00007fa1044d18de abort (libc.so.6)
                #2  0x00007fa1044d1769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007fa1044e0e76 __assert_fail (libc.so.6)
                #4

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.cf887550cc7841128fc6e56345b99f61.7958.1599036404000000/core.scylla.996.cf887550cc7841128fc6e56345b99f61.7958.1599036404000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.cf887550cc7841128fc6e56345b99f61.7958.1599036404000000.gz
2020-09-02 08:53:52.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-2 [13.48.55.27 | 10.0.2.60] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.24937.1599036832000000/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.24937.1599036832000000.gz
backtrace=           PID: 24937 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Wed 2020-09-02 08:53:52 UTC (1min 33s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: 1ceb4383a0b4486792e06cce7f58e440
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-2
      Coredump: /var/lib/systemd/coredump/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.24937.1599036832000000
       Message: Process 24937 (scylla) of user 996 dumped core.

                Stack trace of thread 24944:
                #0  0x00007f078b6ca9e5 raise (libc.so.6)
                #1  0x00007f078b6b394d abort (libc.so.6)
                #2  0x00007f078b6b3769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f078b6c2e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007f078c10c432 start_thread (libpthread.so.0)
                #10 0x00007f078b78f913 __clone (libc.so.6)

                Stack trace of thread 24948:
                #0  0x00007f078c1169ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f078c10c432 start_thread (libpthread.so.0)
                #5  0x00007f078b78f913 __clone (libc.so.6)

                Stack trace of thread 24965:
                #0  0x00007f078c1169ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f078c10c432 start_thread (libpthread.so.0)
                #5  0x00007f078b78f913 __clone (libc.so.6)

                Stack trace of thread 24961:
                #0  0x00007f078c1169ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f078c10c432 start_thread (libpthread.so.0)
                #5  0x00007f078b78f913 __clone (libc.so.6)

                Stack trace of thread 24960:
                #0  0x00007f078c1169ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f078c10c432 start_thread (libpthread.so.0)
                #5  0x00007f078b78f913 __clone (libc.so.6)

                Stack trace of thread 24964:
                #0  0x00007f078c1169ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f078c10c432 start_thread (libpthread.so.0)
                #5  0x00007f078b78f913 __clone (libc.so.6)

                Stack trace of thread 24963:
                #0  0x00007f078c1169ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f078c10c432 start_thread (libpthread.so.0)
                #5  0x00007f078b78f913 __clone (libc.so.6)

                Stack trace of thread 24962:
                #0  0x00007f078c1169ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seasta

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.24937.1599036832000000/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.24937.1599036832000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.24937.1599036832000000.gz
2020-09-02 08:53:28.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 [13.53.38.115 | 10.0.3.79] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.4725.1599036808000000/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.4725.1599036808000000.gz
backtrace=           PID: 4725 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Wed 2020-09-02 08:53:28 UTC (2min 13s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /scylla.slice/scylla-server.slice/scylla-server.service
          Unit: scylla-server.service
         Slice: scylla-server.slice
       Boot ID: d0c9a6414e9543b8b1a49a9d47349546
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4
      Coredump: /var/lib/systemd/coredump/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.4725.1599036808000000
       Message: Process 4725 (scylla) of user 996 dumped core.

                Stack trace of thread 4725:
                #0  0x00007f553aa809e5 raise (libc.so.6)
                #1  0x00007f553aa6994d abort (libc.so.6)
                #2  0x00007f553aa69769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f553aa78e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002df7ddb _ZN7seastar12app_template14run_deprecatedEiPPcOSt8functionIFvvEE (scylla)
                #8  0x0000000002df848f _ZN7seastar12app_template3runEiPPcOSt8functionIFNS_6futureIJiEEEvEE (scylla)
                #9  0x0000000000d9f251 main (scylla)
                #10 0x00007f553aa6b042 __libc_start_main (libc.so.6)
                #11 0x0000000000cbbb4e _start (scylla)

                Stack trace of thread 4739:
                #0  0x00007f553b4cc9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f553b4c2432 start_thread (libpthread.so.0)
                #5  0x00007f553ab45913 __clone (libc.so.6)

                Stack trace of thread 4735:
                #0  0x00007f553b4cc9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f553b4c2432 start_thread (libpthread.so.0)
                #5  0x00007f553ab45913 __clone (libc.so.6)

                Stack trace of thread 4740:
                #0  0x00007f553b4cc9ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f553b4c2432 start_thread (libpthread.so.0)
                #5  0x00007f553ab45913 __clone (libc.so.6)

                Stack trace of thread 4732:
                #0  0x0000000002ec0069 _ZN7seastar5timerINS_12lowres_clockEE3armENSt6chrono10time_pointIS1_NS3_8durationIlSt5ratioILl1ELl1000EEEEEESt8optionalIS8_E (scylla)
                #1  0x0000000001a750bf _ZN7service22abstract_read_executor7executeENSt6chrono10time_pointIN7seastar12lowres_clockENS1_8durationIlSt5ratioILl1ELl1000EEEEEE (scylla)
                #2  0x00000000019e4bc0 _ZN7service13storage_proxy14query_singularEN7seastar13lw_shared_ptrIN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positionEESaISA_EEN2db17consistency_levelENS0_25coordinator_query_optionsE (scylla)
                #3  0x00000000019f1828 _ZN7service13storage_proxy8do_queryEN7seastar13lw_shared_ptrIK6schemaEENS2_IN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positionEESaISD_EEN2db17consistency_levelENS0_25coordinator_query_optionsE (scylla)
                #4  0x00000000019f3328 _ZN7service13storage_proxy5queryEN7seastar13lw_shared_ptrIK6schemaEENS2_IN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positionEESaISD_EEN2db17consistency_levelENS0_25coordinator_query_optionsE (scylla)
                #5  0x00000000016ed779 _ZNK4cql310statements16select_statement7executeERN7service13storage_proxyEN7seastar13lw_shared_ptrIN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positionEESaISE_EERNS2_11query_stateERKNS_13query_optionsENSt6chrono10time_pointI8gc_clockNSN_8durationIlSt5ratioILl1ELl1EEEEEE (scylla)
                #6  0x00000000016ef46f _ZNK4cql310statements16select_statement10do_executeERN7service13storage_proxyERNS2_11query_stateERKNS_13query_optionsE (scylla)
                #7  0x000000000170bcfc _ZN7seastar20noncopyable_functionIFNS_6futureIJNS_10shared_ptrIN13cql_transport8messages14result_messageEEEEEEPKN4cql310statements16select_statementE

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.4725.1599036808000000/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.4725.1599036808000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.4725.1599036808000000.gz
2020-09-02 08:53:52.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-3 [13.48.106.137 | 10.0.1.45] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.cf887550cc7841128fc6e56345b99f61.24506.1599036832000000/core.scylla.996.cf887550cc7841128fc6e56345b99f61.24506.1599036832000000.gz
backtrace=           PID: 24506 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Wed 2020-09-02 08:53:52 UTC (2min 25s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: cf887550cc7841128fc6e56345b99f61
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-3
      Coredump: /var/lib/systemd/coredump/core.scylla.996.cf887550cc7841128fc6e56345b99f61.24506.1599036832000000
       Message: Process 24506 (scylla) of user 996 dumped core.

                Stack trace of thread 24510:
                #0  0x00007fc1f49ac9e5 raise (libc.so.6)
                #1  0x00007fc1f499594d abort (libc.so.6)
                #2  0x00007fc1f4995769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007fc1f49a4e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007fc1f53ee432 start_thread (libpthread.so.0)
                #10 0x00007fc1f4a71913 __clone (libc.so.6)

                Stack trace of thread 24518:
                #0  0x00007fc1f53f89ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fc1f53ee432 start_thread (libpthread.so.0)
                #5  0x00007fc1f4a71913 __clone (libc.so.6)

                Stack trace of thread 24514:
                #0  0x00007fc1f53f89ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fc1f53ee432 start_thread (libpthread.so.0)
                #5  0x00007fc1f4a71913 __clone (libc.so.6)

                Stack trace of thread 24517:
                #0  0x00007fc1f53f89ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fc1f53ee432 start_thread (libpthread.so.0)
                #5  0x00007fc1f4a71913 __clone (libc.so.6)

                Stack trace of thread 24515:
                #0  0x00007fc1f53f89ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fc1f53ee432 start_thread (libpthread.so.0)
                #5  0x00007fc1f4a71913 __clone (libc.so.6)

                Stack trace of thread 24520:
                #0  0x00007fc1f53f89ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fc1f53ee432 start_thread (libpthread.so.0)
                #5  0x00007fc1f4a71913 __clone (libc.so.6)

                Stack trace of thread 24516:
                #0  0x00007fc1f53f89ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fc1f53ee432 start_thread (libpthread.so.0)
                #5  0x00007fc1f4a71913 __clone (libc.so.6)

                Stack trace of thread 24519:
                #0  0x00007fc1f53f89ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seasta

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.cf887550cc7841128fc6e56345b99f61.24506.1599036832000000/core.scylla.996.cf887550cc7841128fc6e56345b99f61.24506.1599036832000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.cf887550cc7841128fc6e56345b99f61.24506.1599036832000000.gz
(gdb) p (scheduling_group)'seastar::internal::current_scheduling_group_ptr()::sg'
$1 = {
  _id = 0
}
(gdb) p tq._id
$2 = 5 '\005'
(gdb) p tsk
$3 = (seastar::task *) 0x603001538780
(gdb) p _current_task
$4 = (seastar::task *) 0x603001538780
(gdb) p _current_task_sg 
$5 = {
  _id = 5
}
(gdb) p/x (uint64_t*)_current_task_content._M_elems 
$6 = 0x603000077944
(gdb) x/1a 0x603000077944
0x603000077944: 0x4b8eb8 <_ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIJEEEZN7service13storage_proxy15mutate_internalIN5boost14iterator_rangeIN9__gnu_cxx17__normal_iteratorIP8mutationSt6vectorISB_SaISB_EEEEEEEENS_6futureIJEEET_N2db17consistency_levelEbN7tracing15trace_state_ptrE14service_permitSt8optionalINSt6chrono10time_pointINS_12lowres_clockENSR_8durationIlSt5ratioILl1ELl1000EEEEEEENS_13lw_shared_ptrIN3cdc24operation_result_trackerEEEEUlSJ_E0_ZZNSJ_17then_wrapped_nrvoISJ_S14_EENS_8futurizeISK_E4typeEOT0_ENKUlvE_clEvEUlRS3_RS14_ONS_12future_stateIJEEEE_JEEE+16>

@aleksbykov do the cores happen very soon after the start or does it take some time?

The fact that it's a zero is suspicious. We don't have many tasks with sg 0.

So maybe it's a memory overrun. Look for thread local variables near that address.

Good suggestion. I'm now downloading more cores to see if there is a pattern in the current task.

Yes, collecting more data is a good idea.

There is no pattern on the _current_task. But current_scheduling_group_ptr() is always 0 (default/main) and the current group is always 5 (statement). This could be simply because statement has the most tasks.

I now regret not adding a flag of whether this is the first task being run from the queue or not.

It would be awesome if we could get this working here: https://rr-project.org/

Trying to determine the identity of the TLS variable preceding seastar::internal::current_scheduling_group_ptr()::sg has been fruitless so far.

> @aleksbykov do the cores happen very soon after the start or does it take some time?

@denesb sorry, I missed your question. Here are the records from the logs, with timestamps:

2020-09-02T08:40:15+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: [shard 0] thrift_controller - Thrift server listening on 10.0.3.79:9160 ...
2020-09-02T08:40:15+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: [shard 0] init - serving
2020-09-02T08:40:15+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: [shard 0] init - Scylla version 4.2.rc3-0.20200827.48d79a1d9 initialization completed.
2020-09-02T08:42:45+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: [shard 3] rpc - client 10.0.1.45:61867 msg_id 66663:  exception "Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE." in no_wait handler ignored
2020-09-02T08:42:45+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !WARNING | scylla: [shard 3] storage_proxy - Failed to apply mutation from 10.0.1.45#3: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-02T08:43:17+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !WARNING | scylla: [shard 3] storage_proxy - Failed to apply mutation from 10.0.1.45#3: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-02T08:43:17+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !WARNING | scylla: [shard 3] storage_proxy - Failed to apply mutation from 10.0.1.45#3: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-02T08:43:34+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !WARNING | scylla: [shard 3] storage_proxy - Failed to apply mutation from 10.0.1.45#3: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-02T08:43:34+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !WARNING | scylla: [shard 3] storage_proxy - Failed to apply mutation from 10.0.1.45#3: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: [shard 3] rpc - client 10.0.3.175:65115 msg_id 109280:  exception "Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE." in no_wait handler ignored
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !WARNING | scylla: [shard 3] storage_proxy - Failed to apply mutation from 10.0.3.175#3: exceptions::mutation_write_failure_exception (Operation failed for system.paxos - received 0 responses and 1 failures from 1 CL=ONE.)
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !WARNING | scylla: [shard 3] storage_proxy - Failed to apply mutation from 10.0.3.175#3: exceptions::mutation_write_failure_exception (Operation failed for system.paxos - received 0 responses and 1 failures from 1 CL=ONE.)
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !WARNING | scylla: [shard 3] storage_proxy - Failed to apply mutation from 10.0.3.175#3: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: scylla: /ScyllaDB/scylla2/seastar/src/core/reactor.cc:2144: void seastar::reactor::run_tasks(seastar::reactor::task_queue&): Assertion `current_scheduling_group() == scheduling_group(tq._id)' failed.
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: Aborting on shard 3.
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: Backtrace:
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x0000000002ec2492
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x0000000002e66860
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x0000000002e66b05
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x0000000002e66b50
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x00007f2b12a5da8f
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000025894
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000025768
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000034e75
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x0000000002e63c6b
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x0000000002e63f7e
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x0000000002e9b1ed
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x0000000002eaabfa
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: 0x0000000002e2ec7d
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: /opt/scylladb/libreloc/libpthread.so.0+0x0000000000009431
2020-09-02T08:43:37+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-73f9b86a-4 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000101912

This is the immediate neighbourhood of seastar::internal::current_scheduling_group_ptr()::sg:

000000000000090      1 TLS     LOCAL  DEFAULT       23 __tls_guard
000000000000098     56 TLS     LOCAL  HIDDEN        23 type_interning_helper<user_type_impl, seastar::basic_sstring<char, unsigned int, 15u, true>, seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::vector<seastar::basic_sstring<signed char, unsigned int, 31u, false>, std::allocator<seastar::basic_sstring<signed char, unsigned int, 31u, false> > >, std::vector<seastar::shared_ptr<abstract_type const>, std::allocator<seastar::shared_ptr<abstract_type const> > >, bool>::_instances
0000000000000d0     56 TLS     LOCAL  HIDDEN        23 type_interning_helper<reversed_type_impl, seastar::shared_ptr<abstract_type const> >::_instances
000000000000108      4 TLS     LOCAL  HIDDEN        23 seastar::internal::this_shard_id_ptr()::g_this_shard_id
00000000000010c      4 TLS     LOCAL  HIDDEN        23 seastar::internal::current_scheduling_group_ptr()::sg
000000000000110      1 TLS     LOCAL  DEFAULT       23 __tls_guard
000000000000120   1056 TLS     LOCAL  HIDDEN        23 default_dirty_memory_manager
000000000000540      8 TLS     LOCAL  HIDDEN        23 guard variable for get_standard_migrator<blob_storage>()::instance
000000000000548     32 TLS     LOCAL  HIDDEN        23 get_standard_migrator<blob_storage>()::instance
000000000000568      1 TLS     LOCAL  DEFAULT       23 __tls_guard

Confirmed by gdb:

(gdb) p &'seastar::internal::current_scheduling_group_ptr()::sg'
$21 = (<thread local variable, no debug info> *) 0x7fc1f11e7c0c
(gdb) p &'seastar::internal::this_shard_id_ptr()::g_this_shard_id'
$22 = (<thread local variable, no debug info> *) 0x7fc1f11e7c08
(gdb) p &default_dirty_memory_manager
$23 = (dirty_memory_manager *) 0x7fc1f11e7c20
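
The gdb addresses can be cross-checked against the symbol-table offsets with a little arithmetic. The sketch below is only an illustration: the constants are copied from the output above, and `tls_base` is a hypothetical helper that computes address minus offset (it is not a Seastar or gdb facility, and the value it yields is not necessarily the real thread pointer).

```cpp
#include <cassert>
#include <cstdint>

// Offsets inside the TLS segment, copied from the symbol table above.
constexpr std::uint64_t off_shard_id = 0x108;  // g_this_shard_id
constexpr std::uint64_t off_sg       = 0x10c;  // current_scheduling_group_ptr()::sg
constexpr std::uint64_t off_dmm      = 0x120;  // default_dirty_memory_manager

// Runtime addresses printed by gdb for the crashing thread.
constexpr std::uint64_t addr_shard_id = 0x7fc1f11e7c08;
constexpr std::uint64_t addr_sg       = 0x7fc1f11e7c0c;
constexpr std::uint64_t addr_dmm      = 0x7fc1f11e7c20;

// Hypothetical helper: start of the TLS block implied by one (address, offset) pair.
constexpr std::uint64_t tls_base(std::uint64_t addr, std::uint64_t off) {
    return addr - off;
}

// All three variables must imply the same block start if the layouts agree.
static_assert(tls_base(addr_shard_id, off_shard_id) == tls_base(addr_sg, off_sg),
              "shard id and sg disagree on the TLS block start");
static_assert(tls_base(addr_sg, off_sg) == tls_base(addr_dmm, off_dmm),
              "sg and dirty memory manager disagree on the TLS block start");
// sg sits immediately after the 4-byte shard id.
static_assert(addr_sg - addr_shard_id == 4, "sg is not adjacent to the shard id");
```

Since the checks are `static_assert`s, the file compiling at all is the confirmation: the only variable directly in front of sg is the shard id.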

We write the shard id once on startup.

> It would be awesome if we could get this working here: https://rr-project.org/

I think it's unlikely to work well, it multiplexes all threads onto a single thread in order to prevent races from causing indeterminism (haven't checked, but it must). So it will only work on simple cases.

It looks unlikely that the TLS variables neighbouring sg will be overrun (you can check this_shard_id, but if that is overwritten, the universe will likely collapse into a black hole).

Let's see if the problematic tasks are always the same.

I tried to see if the stall detector could be the problem (since it operates asynchronously via a signal, it would leave no trace), but can't see a problem there.

> It would be awesome if we could get this working here: https://rr-project.org/

> I think it's unlikely to work well, it multiplexes all threads onto a single thread in order to prevent races from causing indeterminism (haven't checked, but it must). So it will only work on simple cases.

It doesn't work at all on AMD cpus. :(

> Let's see if the problematic tasks are always the same.

They are not, no two cores have the same task.

One solution would be to set up a scripted (Python) hardware watchpoint and record the instruction pointer of every access to it, but this requires running under GDB.

You could set up a hw watchpoint outside gdb. It will cause an exception that will create a core, but it requires some fiddling with assembly.

Actually, those instructions are privileged. But you can try using the ptrace API (warning: terrible) to trace yourself.

> You could set up a hw watchpoint outside gdb. It will cause an exception that will create a core, but it requires some fiddling with assembly.

But we can't use a trap: most accesses are legitimate, and we are looking for the supposed rogue access that's corrupting the value.

> Actually, those instructions are privileged. But you can try using the ptrace API (warning: terrible) to trace yourself.

Yes, I've been reading the ptrace manpage today. You can fork() and ask your parent to trace you.

Looks like it's doable, but note that you'll have trouble if the overwrite comes from outside the thread, since you only have four hardware breakpoints.

Another option: change current_scheduling_group_ptr to point at a page you allocate initially, and mprotect() it each time you set it from a known-good location (and immediately mprotect it back). This is easier, but will affect performance.
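
A minimal sketch of that mprotect() idea, assuming Linux. `sched_group` and `guarded_sg` are hypothetical names made up for this illustration; they are not Seastar types, and wiring such a thing into current_scheduling_group_ptr() is not shown.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical stand-in for seastar's scheduling_group; only the id matters here.
struct sched_group { unsigned id; };

// Keeps one sched_group on a private page that is read-only except inside set().
class guarded_sg {
    sched_group* _p = nullptr;
    std::size_t _len = 0;

    // Change the page protection, aborting if the syscall fails.
    void protect(int prot) {
        if (mprotect(_p, _len, prot) != 0) { std::abort(); }
    }
public:
    guarded_sg() {
        _len = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
        void* mem = mmap(nullptr, _len, PROT_READ | PROT_WRITE,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED) { std::abort(); }
        _p = new (mem) sched_group{0};
        protect(PROT_READ);
    }
    ~guarded_sg() { munmap(_p, _len); }
    guarded_sg(const guarded_sg&) = delete;
    guarded_sg& operator=(const guarded_sg&) = delete;

    // Reads are always allowed.
    sched_group get() const { return *_p; }

    // The only known-good write path: unprotect, store, protect again.
    void set(sched_group sg) {
        protect(PROT_READ | PROT_WRITE);
        *_p = sg;
        protect(PROT_READ);
    }

    // What a stray writer would see; storing through this pointer faults.
    sched_group* raw() { return _p; }
};
```

Any write that bypasses set() would raise SIGSEGV, so the resulting core would point at the offending store rather than at the later assertion. The price is two mprotect() system calls per scheduling group switch, which is the performance impact mentioned above.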

Inspecting core core.scylla.996.d0c9a6414e9543b8b1a49a9d47349546.1288.1599036217000000
$1 = 3
$2 = {_id = 0}
$3 = 5 '\005'
$4 = {_id = 5}
0x603000077944: 0x4b8eb8 <_ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIJEEEZN7service13storage_proxy15mutate_internalIN5boost14iterator_rangeIN9__gnu_cxx17__normal_iteratorIP8mutationSt6vectorISB_SaISB_EEEEEEEENS_6futureIJEEET_N2db17consistency_levelEbN7tracing15trace_state_ptrE14service_permitSt8optionalINSt6chrono10time_pointINS_12lowres_clockENSR_8durationIlSt5ratioILl1ELl1000EEEEEEENS_13lw_shared_ptrIN3cdc24operation_result_trackerEEEEUlSJ_E0_ZZNSJ_17then_wrapped_nrvoISJ_S14_EENS_8futurizeISK_E4typeEOT0_ENKUlvE_clEvEUlRS3_RS14_ONS_12future_stateIJEEEE_JEEE+16>

Inspecting core core.scylla.996.c607b82f1e554fc3ae6b99bbe1535351.8155.1599036405000000
$1 = 1
$2 = {_id = 0}
$3 = 5 '\005'
$4 = {_id = 5}
0x601000077944: 0x5ab360 <_ZTVN7seastar12continuationINS_8internal22promise_base_with_typeIJSt7variantIJN5utils4UUIDEN7service5paxos7promiseEEEEEEZZNS_3rpc11send_helperIN4netw10serializerENSD_14messaging_verbES9_JRKN5query12read_commandERK13partition_keyRS5_RbRNSG_16digest_algorithmESt8optionalIN7tracing10trace_infoEEEEEDaT0_NSB_9signatureIFT1_DpT2_EEEEN7shelper4sendERNSB_6clientESR_INSt6chrono10time_pointINS_12lowres_clockENS15_8durationIlSt5ratioILl1ELl1000EEEEEEEPNSB_11cancellableESJ_SM_SN_SO_SQ_RKSU_EUlT_E_ZZNS_6futureIJSt5tupleIJNS1K_IJEEENS1K_IJS9_EEEEEEE14then_impl_nrvoIS1J_S1N_EESV_OS1I_ENKUlvE_clEvEUlRSA_RS1J_ONS_12future_stateIJS1O_EEEE_JS1O_EEE+16>

Inspecting core core.scylla.996.1ceb4383a0b4486792e06cce7f58e440.8003.1599036403000000
$1 = 0
$2 = {_id = 0}
$3 = 5 '\005'
$4 = {_id = 5}
0x600000367944: 0x7379b0 <_ZTVN7seastar11lambda_taskIZNS_15execution_stage5flushEvEUlvE_EE+16>

Inspecting core core.scylla.996.cf887550cc7841128fc6e56345b99f61.24506.1599036832000000
$1 = 4
$2 = {_id = 0}
$3 = 5 '\005'
$4 = {_id = 5}
0x604000077944: 0x46a4f0 <_ZTVN7seastar8internal24when_all_state_componentINS_6futureIJbEEEEE+16>

Variables:

$1: shard id
$2: seastar::internal::current_scheduling_group_ptr()::sg
$3: tq._id
$4: _current_task_sg
$5: vtable of current task

So the overwrite is always with sg 0.

Got it:

reactor::run_some_tasks()
-> reactor::reset_preemption_monitor()
-> reactor_backend_aio::reset_preemption_monitor()
-> reactor_backend_aio::service_preempting_io()
-> reactor_backend_aio::hrtimer_aio_completion::complete_with()
-> reactor::service_highres_timer()
-> reactor::complete_timers()
-> assign to current scheduling group

An RPM with the fix has been uploaded; download it with:

$ gsutil cp gs://upload.scylladb.com/bdenes/7117/scylla-server-4.2.rc3-0.20200903.1ecc284ca.x86_64.rpm .

Patch applied:
diff --git a/src/core/reactor.cc b/src/core/reactor.cc
index d4f9bffa..6c5150ac 100644
--- a/src/core/reactor.cc
+++ b/src/core/reactor.cc
@@ -1206,6 +1206,7 @@ void reactor::complete_timers(T& timers, E& expired_timers, EnableFunc&& enable_
     for (auto& t : expired_timers) {
         t._expired = true;
     }
+    const auto prev_sg = current_scheduling_group();
     while (!expired_timers.empty()) {
         auto t = &*expired_timers.begin();
         expired_timers.pop_front();
@@ -1223,7 +1224,7 @@ void reactor::complete_timers(T& timers, E& expired_timers, EnableFunc&& enable_
             }
         }
     }
-    *internal::current_scheduling_group_ptr() = default_scheduling_group();
+    *internal::current_scheduling_group_ptr() = prev_sg;
     enable_fn();
 }
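
To see why restoring the previous group matters, here is a toy model of the interaction. It is a sketch under loud assumptions: `sched_group`, `timer` and `mini_reactor` are hypothetical stand-ins, not Seastar code, and the model services timers after every task, whereas the real reactor only does so on the preemption path in the call chain above.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a scheduling group; only the id matters.
struct sched_group {
    unsigned id;
    bool operator==(const sched_group& o) const { return id == o.id; }
};

struct timer {
    sched_group sg;                  // group the timer callback must run in
    std::function<void()> callback;
};

struct mini_reactor {
    sched_group current{};           // models *current_scheduling_group_ptr()
    std::vector<timer> expired;      // timers waiting to be completed

    // Models complete_timers(). With restore_previous == false it behaves like
    // the pre-patch code and always leaves the default group (id 0) behind.
    void complete_timers(bool restore_previous) {
        const sched_group prev = current;
        for (auto& t : expired) {
            current = t.sg;          // callback runs in the timer's own group
            if (t.callback) { t.callback(); }
        }
        expired.clear();
        current = restore_previous ? prev : sched_group{0};
    }

    // Models the task loop: the queue's group is set once, then tasks run and
    // timers may be serviced in between. Returns false at the point where the
    // debug assertion current_scheduling_group() == scheduling_group(tq._id)
    // would fire.
    bool run_tasks(unsigned queue_id,
                   const std::vector<std::function<void()>>& tasks,
                   bool restore_previous) {
        current = sched_group{queue_id};
        for (auto& task : tasks) {
            if (!(current == sched_group{queue_id})) { return false; }
            task();
            complete_timers(restore_previous);
        }
        return true;
    }
};
```

With `restore_previous == false`, the second task of a queue with id 5 finds the current group set to 0, which matches the `$2 = {_id = 0}` and `$3 = 5` values seen in every core. With `restore_previous == true`, the invariant holds for the whole loop.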

@aleksbykov please check that it solves the problem.

I left all the asserts in.

The patch looks good. Later, we should remove timer processing from the task processing loop, but that is much more involved.

Do send the patch, I think we can assume it will fix the problem, and we can validate in parallel.

btw it's not good that we entered the state with lots of tasks where we disable preemption.

Patch on the seastar list: [PATCH v1] core/reactor: complete_timers(): restore previous scheduling group

@denesb the job failed after the soft reboot.
The node was rebooted with the command:

2020-09-03T15:20:18+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !NOTICE  | sudo:  centos : TTY=unknown ; PWD=/home/centos ; USER=root ; COMMAND=/sbin/reboot

After the node and Scylla came back up:

2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 0] gossip - No gossip backlog; proceeding
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 0] init - allow replaying hints
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 0] init - Launching generate_mv_updates for non system tables
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 0] init - starting the view builder
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 1] compaction - Compacting [/var/lib/scylla/data/system/truncated-38c19fd0fb863310a4b70d0cc66628aa/mc-41-big-Data.db:level=0, /var/lib/scylla/data/system/truncated-38c19fd0fb863310a4b70d0cc66628aa/mc-25-big-Data.db:level=0, ]
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 0] init - starting native transport
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 0] cql_server_controller - Starting listening for CQL clients on 10.0.3.226:9042 (unencrypted)
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 0] thrift_controller - Thrift server listening on 10.0.3.226:9160 ...
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 0] init - serving
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 0] init - Scylla version 4.2.rc3-0.20200827.48d79a1d9 initialization completed.
2020-09-03T15:22:36+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 1] compaction - Compacted 2 sstables to [/var/lib/scylla/data/system/truncated-38c19fd0fb863310a4b70d0cc66628aa/mc-49-big-Data.db:level=0, ]. 10kB to 5kB (~51% of original) in 116ms = 47kB/s. ~256 total partitions merged to 1.
2020-09-03T15:22:39+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | sshd[1450]: Accepted publickey for centos from 10.0.2.243 port 41374 ssh2: RSA SHA256:g7OmDbWC8qLUZHxjMohQe3LsQHfcOJbG1ZHqodB6smc

About 3 minutes later the following errors appeared, and a coredump was triggered and kept being triggered:

2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !WARNING | scylla: [shard 5] storage_proxy - Failed to apply mutation from 10.0.2.83#5: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !WARNING | scylla: [shard 5] storage_proxy - Failed to apply mutation from 10.0.2.83#5: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: [shard 5] rpc - client 10.0.2.83:63893 msg_id 93025:  exception "Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE." in no_wait handler ignored
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !WARNING | scylla: [shard 5] storage_proxy - Failed to apply mutation from 10.0.2.83#5: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !WARNING | scylla: [shard 5] storage_proxy - Failed to apply mutation from 10.0.2.83#5: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !WARNING | scylla: [shard 5] storage_proxy - Failed to apply mutation from 10.0.2.83#5: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !WARNING | scylla: [shard 5] storage_proxy - Failed to apply mutation from 10.0.2.83#5: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !WARNING | scylla: [shard 5] storage_proxy - Failed to apply mutation from 10.0.2.83#5: exceptions::mutation_write_timeout_exception (Operation timed out for system.paxos - received only 0 responses from 1 CL=ONE.)
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: scylla: /ScyllaDB/scylla2/seastar/src/core/reactor.cc:2144: void seastar::reactor::run_tasks(seastar::reactor::task_queue&): Assertion `current_scheduling_group() == scheduling_group(tq._id)' failed.
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: Aborting on shard 5.
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: Backtrace:
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x0000000002ec2492
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x0000000002e66860
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x0000000002e66b05
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x0000000002e66b50
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x00007ff1908eaa8f
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x000000000003c9e4
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000025894
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000025768
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000034e75
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x0000000002e63c6b
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x0000000002e63f7e
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x0000000002e9b1ed
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x0000000002eaabfa
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: 0x0000000002e2ec7d
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: /opt/scylladb/libreloc/libpthread.so.0+0x0000000000009431
2020-09-03T15:25:35+00:00  reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 !INFO    | scylla: /opt/scylladb/libreloc/libc.so.6+0x0000000000101912



2020-09-03 15:25:35.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 [13.48.24.28 | 10.0.3.226] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.a67bee8886f94a1482583767929e7c93.1272.1599146735000000/core.scylla.996.a67bee8886f94a1482583767929e7c93.1272.1599146735000000.gz
backtrace=           PID: 1272 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Thu 2020-09-03 15:25:35 UTC (2min 16s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: a67bee8886f94a1482583767929e7c93
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2
      Coredump: /var/lib/systemd/coredump/core.scylla.996.a67bee8886f94a1482583767929e7c93.1272.1599146735000000
       Message: Process 1272 (scylla) of user 996 dumped core.

                Stack trace of thread 1287:
                #0  0x00007ff18fe9d9e5 raise (libc.so.6)
                #1  0x00007ff18fe8694d abort (libc.so.6)
                #2  0x00007ff18fe86769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007ff18fe95e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007ff1908df432 start_thread (libpthread.so.0)
                #10 0x00007ff18ff62913 __clone (libc.so.6)

                Stack trace of thread 1296:
                #0  0x00007ff1908e99ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007ff1908df432 start_thread (libpthread.so.0)
                #5  0x00007ff18ff62913 __clone (libc.so.6)

                Stack trace of thread 1292:
                #0  0x00007ff1908e99ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007ff1908df432 start_thread (libpthread.so.0)
                #5  0x00007ff18ff62913 __clone (libc.so.6)

                Stack trace of thread 1290:
                #0  0x00007ff1908e99ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007ff1908df432 start_thread (libpthread.so.0)
                #5  0x00007ff18ff62913 __clone (libc.so.6)

                Stack trace of thread 1288:
                #0  0x00007ff18ff5d37d syscall (libc.so.6)
                #1  0x00000000030f9739 _ZN7seastar8internal9io_submitEmlPPNS0_9linux_abi4iocbE (scylla)
                #2  0x00000000030f63e8 _ZN7seastar19aio_storage_context11submit_workEv (scylla)
                #3  0x00000000030f6680 _ZN7seastar19reactor_backend_aio18kernel_submit_workEv (scylla)
                #4  0x0000000002e5861d _ZNSt17_Function_handlerIFbvEZN7seastar7reactor3runEvEUlvE4_E9_M_invokeERKSt9_Any_data (scylla)
                #5  0x0000000002e9b221 _ZN7seastar7reactor3runEv (scylla)
                #6  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #7  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #8  0x00007ff1908df432 start_thread (libpthread.so.0)
                #9  0x00007ff18ff62913 __clone (libc.so.6)

                Stack trace of thread 1297:
                #0  0x00007ff1908e99ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007ff1908df432 start_thread (libpthread.so.0)
                #5  0x00007ff18ff62913 __clone (libc.so.6)

                Stack trace of thread 1286:
                #0  0x0000000002e58664 _ZNSt17_Function_handlerIFbvEZN7seastar7reactor3runEvEUlvE4_E9_M_invokeERKSt9_Any_data (scylla)
                #1  0x0000000002e9b221 _ZN7seastar7reactor3runEv (scylla)
                #2  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007ff1908df432 start_thread

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.a67bee8886f94a1482583767929e7c93.1272.1599146735000000/core.scylla.996.a67bee8886f94a1482583767929e7c93.1272.1599146735000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.a67bee8886f94a1482583767929e7c93.1272.1599146735000000.gz

2020-09-03 15:28:31.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 [13.48.24.28 | 10.0.3.226] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.a67bee8886f94a1482583767929e7c93.3388.1599146911000000/core.scylla.996.a67bee8886f94a1482583767929e7c93.3388.1599146911000000.gz
backtrace=           PID: 3388 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Thu 2020-09-03 15:28:31 UTC (2min 29s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: a67bee8886f94a1482583767929e7c93
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2
      Coredump: /var/lib/systemd/coredump/core.scylla.996.a67bee8886f94a1482583767929e7c93.3388.1599146911000000
       Message: Process 3388 (scylla) of user 996 dumped core.

                Stack trace of thread 3391:
                #0  0x00007fb5388359e5 raise (libc.so.6)
                #1  0x00007fb53881e94d abort (libc.so.6)
                #2  0x00007fb53881e769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007fb53882de76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007fb539277432 start_thread (libpthread.so.0)
                #10 0x00007fb5388fa913 __clone (libc.so.6)

                Stack trace of thread 3400:
                #0  0x00007fb5392819ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fb539277432 start_thread (libpthread.so.0)
                #5  0x00007fb5388fa913 __clone (libc.so.6)

                Stack trace of thread 3403:
                #0  0x00007fb5392819ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fb539277432 start_thread (libpthread.so.0)
                #5  0x00007fb5388fa913 __clone (libc.so.6)

                Stack trace of thread 3401:
                #0  0x00007fb5392819ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007fb539277432 start_thread (libpthread.so.0)
                #5  0x00007fb5388fa913 __clone (libc.so.6)

                Stack trace of thread 3394:
                #0  0x000000000269a6ee _ZN13unsigned_vint11deserializeESt17basic_string_viewIaSt11char_traitsIaEE (scylla)
                #1  0x00000000012a7bae _ZN8sstables27index_consume_entry_contextINS_14index_consumerEE13process_stateERN7seastar16temporary_bufferIcEE (scylla)
                #2  0x00000000012a9865 _ZN7seastar8futurizeINS_6futureIJNS_10bool_classINS_18stop_iteration_tagEEEEEEE6invokeIRZNS_12input_streamIcE7consumeISt17reference_wrapperIN8sstables27index_consume_entry_contextINSC_14index_consumerEEEEEENS1_IJEEEOT_EUlvE_JEEES5_SJ_DpOT0_ (scylla)
                #3  0x00000000012a9f73 _ZN7seastar8internal8repeaterIZNS_12input_streamIcE7consumeISt17reference_wrapperIN8sstables27index_consume_entry_contextINS6_14index_consumerEEEEEENS_6futureIJEEEOT_EUlvE_E15run_and_disposeEv (scylla)
                #4  0x0000000002e63bd6 _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007fb539277432 start_thread (libpthread.so.0)
                #10 0x00007fb5388fa913 __clone (libc.so.6)

                Stack trace of thread 3395:
                #0  0x00000000012a76c9 _ZN8sstables27index_consume_entry_contextINS_14index_consumerEE13process_stateERN7seastar16temporary_bufferIcEE (scylla)
                #1  0x00000000012a9865 _ZN7seastar8futurizeINS_6futureIJNS_10bool_classINS_18stop_iteration_tagEEEEEEE6invokeIRZNS_12input_streamIcE7consumeISt17reference_wrapperIN8sstables27index_consume_entry_contextINSC_14index_consumerEEEEEENS1_IJEEEOT_EUlvE_JEEES5_SJ_DpOT0_ (scylla)
                #2  0x00000000012a9f73 _

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.a67bee8886f94a1482583767929e7c93.3388.1599146911000000/core.scylla.996.a67bee8886f94a1482583767929e7c93.3388.1599146911000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.a67bee8886f94a1482583767929e7c93.3388.1599146911000000.gz

2020-09-03 15:31:38.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 [13.48.24.28 | 10.0.3.226] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.a67bee8886f94a1482583767929e7c93.4734.1599147098000000/core.scylla.996.a67bee8886f94a1482583767929e7c93.4734.1599147098000000.gz
backtrace=           PID: 4734 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Thu 2020-09-03 15:31:38 UTC (2min 41s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: a67bee8886f94a1482583767929e7c93
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2
      Coredump: /var/lib/systemd/coredump/core.scylla.996.a67bee8886f94a1482583767929e7c93.4734.1599147098000000
       Message: Process 4734 (scylla) of user 996 dumped core.

                Stack trace of thread 4738:
                #0  0x00007f9710f349e5 raise (libc.so.6)
                #1  0x00007f9710f1d94d abort (libc.so.6)
                #2  0x00007f9710f1d769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f9710f2ce76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007f9711976432 start_thread (libpthread.so.0)
                #10 0x00007f9710ff9913 __clone (libc.so.6)

                Stack trace of thread 4782:
                #0  0x00007f97119809ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f9711976432 start_thread (libpthread.so.0)
                #5  0x00007f9710ff9913 __clone (libc.so.6)

                Stack trace of thread 4780:
                #0  0x00007f97119809ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f9711976432 start_thread (libpthread.so.0)
                #5  0x00007f9710ff9913 __clone (libc.so.6)

                Stack trace of thread 4777:
                #0  0x00007f97119809ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f9711976432 start_thread (libpthread.so.0)
                #5  0x00007f9710ff9913 __clone (libc.so.6)

                Stack trace of thread 4739:
                #0  0x00007f9711187a9e _ZSt11_Hash_bytesPKvmm (libstdc++.so.6)
                #1  0x0000000001a45ce3 _ZNSt8__detail9_Map_baseIN7seastar13basic_sstringIcjLj15ELb1EEESt4pairIKS3_N7service19storage_proxy_stats11split_stats13stats_counterEESaISA_ENS_10_Select1stESt8equal_toIS3_ESt4hashIS3_ENS_18_Mod_range_hashingENS_20_Default_ranged_hashENS_20_Prime_rehash_policyENS_17_Hashtable_traitsILb1ELb0ELb1EEELb1EEixERS5_ (scylla)
                #2  0x00000000019b5c5e _ZN7service13storage_proxy22send_to_live_endpointsEmNSt6chrono10time_pointIN7seastar12lowres_clockENS1_8durationIlSt5ratioILl1ELl1000EEEEEE (scylla)
                #3  0x00000000019b8eaf _ZN7seastar8futurizeINS_6futureIJEEEE6invokeIRPFS2_RSt6vectorIN7service13storage_proxy23unique_response_handlerESaIS8_EEOZNS7_12mutate_beginESA_N2db17consistency_levelEN7tracing15trace_state_ptrESt8optionalINSt6chrono10time_pointINS_12lowres_clockENSH_8durationIlSt5ratioILl1ELl1000EEEEEEEEUlRS8_E_EJSB_SR_EEES2_OT_DpOT0_.constprop.0.isra.0 (scylla)
                #4  0x0000000001a87880 _ZN7service13storage_proxy15mutate_internalISt5arrayISt5tupleIJN7seastar13lw_shared_ptrINS_5paxos8proposalEEENS5_IK6schemaEENS4_10shared_ptrINS_22paxos_response_handlerEEEN3dht5tokenEEELm1EEEENS4_6futureIJEEET_N2db17consistency_levelEbN7tracing15trace_state_ptrE14service_permitSt8optionalINSt6chrono10time_pointINS4_12lowres_clockENSS_8durationIlSt5ratioILl1ELl1000EEEEEEENS5_IN3cdc24operation_result_trackerEEE (scylla)
                #5  0x00000000019da85b _ZN7service22paxos_response_handler14learn_decisionEN7seastar13lw_shared_ptrINS_5paxos8proposalEEEb (scylla)
                #6  0x00000000019e14c2 _ZZZZZZZN7service13storage_proxy3casEN7seastar13lw_shared_ptrIK6schemaEENS1_10shared_ptrINS_11cas_requestEEENS2_IN5query12read_commandEEEOSt6vectorI20nonwrapping_intervalIN3dht13ring_positi

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.a67bee8886f94a1482583767929e7c93.4734.1599147098000000/core.scylla.996.a67bee8886f94a1482583767929e7c93.4734.1599147098000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.a67bee8886f94a1482583767929e7c93.4734.1599147098000000.gz

2020-09-03 17:17:08.000: (CoreDumpEvent Severity.ERROR): node=Node reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2 [13.48.24.28 | 10.0.3.226] (seed: False)
corefile_url=
https://storage.cloud.google.com/upload.scylladb.com/core.scylla.996.a67bee8886f94a1482583767929e7c93.12161.1599153428000000/core.scylla.996.a67bee8886f94a1482583767929e7c93.12161.1599153428000000.gz
backtrace=           PID: 12161 (scylla)
           UID: 996 (scylla)
           GID: 1001 (scylla)
        Signal: 6 (ABRT)
     Timestamp: Thu 2020-09-03 17:17:08 UTC (2min 52s ago)
  Command Line: /usr/bin/scylla --blocked-reactor-notify-ms 500 --abort-on-lsa-bad-alloc 1 --abort-on-seastar-bad-alloc --abort-on-internal-error 1 --abort-on-ebadf 1 --enable-sstable-key-validation 1 --log-to-syslog 1 --log-to-stdout 0 --default-log-level info --network-stack posix --io-properties-file=/etc/scylla.d/io_properties.yaml --cpuset 0-7 --lock-memory=1
    Executable: /opt/scylladb/libexec/scylla
 Control Group: /
       Boot ID: a67bee8886f94a1482583767929e7c93
    Machine ID: df877a200226bc47d06f26dae0736ec9
      Hostname: reproduce-7117-longevity-lwt-24h-re-db-node-bf54572f-2
      Coredump: /var/lib/systemd/coredump/core.scylla.996.a67bee8886f94a1482583767929e7c93.12161.1599153428000000
       Message: Process 12161 (scylla) of user 996 dumped core.

                Stack trace of thread 12169:
                #0  0x00007f252172d9e5 raise (libc.so.6)
                #1  0x00007f252171694d abort (libc.so.6)
                #2  0x00007f2521716769 __assert_fail_base.cold (libc.so.6)
                #3  0x00007f2521725e76 __assert_fail (libc.so.6)
                #4  0x0000000002e63c6c _ZN7seastar7reactor9run_tasksERNS0_10task_queueE (scylla)
                #5  0x0000000002e63f7f _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #6  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #7  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #8  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #9  0x00007f252216f432 start_thread (libpthread.so.0)
                #10 0x00007f25217f2913 __clone (libc.so.6)

                Stack trace of thread 12171:
                #0  0x00007f25221799ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f252216f432 start_thread (libpthread.so.0)
                #5  0x00007f25217f2913 __clone (libc.so.6)

                Stack trace of thread 12177:
                #0  0x00007f25221799ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f252216f432 start_thread (libpthread.so.0)
                #5  0x00007f25217f2913 __clone (libc.so.6)

                Stack trace of thread 12175:
                #0  0x00007f25221799ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f252216f432 start_thread (libpthread.so.0)
                #5  0x00007f25217f2913 __clone (libc.so.6)

                Stack trace of thread 12174:
                #0  0x00007f25221799ac read (libpthread.so.0)
                #1  0x00000000030f2a77 _ZN7seastar11thread_pool4workENS_13basic_sstringIcjLj15ELb1EEE (scylla)
                #2  0x00000000030f2cd8 _ZNSt17_Function_handlerIFvvEZN7seastar11thread_poolC4EPNS1_7reactorENS1_13basic_sstringIcjLj15ELb1EEEEUlvE_E9_M_invokeERKSt9_Any_data (scylla)
                #3  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #4  0x00007f252216f432 start_thread (libpthread.so.0)
                #5  0x00007f25217f2913 __clone (libc.so.6)

                Stack trace of thread 12163:
                #0  0x00007f25221799ac read (libpthread.so.0)
                #1  0x00000000030f6890 _ZN7seastar25task_quota_aio_completion13complete_withEl (scylla)
                #2  0x00000000030f3933 _ZN7seastar19reactor_backend_aio24reset_preemption_monitorEv (scylla)
                #3  0x0000000002e63e1e _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #4  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #5  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #6  0x0000000002e2ec7e _ZN7seastar12posix_thread13start_routineEPv (scylla)
                #7  0x00007f252216f432 start_thread (libpthread.so.0)
                #8  0x00007f25217f2913 __clone (libc.so.6)

                Stack trace of thread 12167:
                #0  0x00007f25221799ac read (libpthread.so.0)
                #1  0x00000000030f6890 _ZN7seastar25task_quota_aio_completion13complete_withEl (scylla)
                #2  0x00000000030f3933 _ZN7seastar19reactor_backend_aio24reset_preemption_monitorEv (scylla)
                #3  0x0000000002e63e1e _ZN7seastar7reactor14run_some_tasksEv.part.0 (scylla)
                #4  0x0000000002e9b1ee _ZN7seastar7reactor3runEv (scylla)
                #5  0x0000000002eaabfb _ZZN7seastar3smp9configureEN5boost15program_options13variables_mapENS_14reactor_configEENKUlvE1_clEv (scylla)
                #6  0x0000000002e2ec7e _ZN7seas

download_instructions=
gsutil cp gs://upload.scylladb.com/core.scylla.996.a67bee8886f94a1482583767929e7c93.12161.1599153428000000/core.scylla.996.a67bee8886f94a1482583767929e7c93.12161.1599153428000000.gz .
gunzip /var/lib/systemd/coredump/core.scylla.996.a67bee8886f94a1482583767929e7c93.12161.1599153428000000.gz
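
When matching core files to the events above, it can help to split the file name into its parts. The following is a minimal sketch, assuming the names follow the `core.<comm>.<uid>.<boot-id>.<pid>.<timestamp>` layout suggested by the UID, Boot ID, PID and Timestamp fields printed with each dump, and that the last field is microseconds since the Unix epoch. The helper name `parse_core_name` is made up for illustration.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class CoreName(NamedTuple):
    comm: str
    uid: int
    boot_id: str
    pid: int
    when: datetime


def parse_core_name(name: str) -> CoreName:
    """Split a core file name of the assumed form
    core.<comm>.<uid>.<boot-id>.<pid>.<timestamp-in-microseconds>."""
    parts = name.split(".")
    if len(parts) < 6 or parts[0] != "core":
        raise ValueError(f"unexpected core file name: {name!r}")
    # The last four fields are fixed; anything between "core" and them
    # is the command name (which could itself contain dots).
    uid, boot_id, pid, stamp = parts[-4:]
    comm = ".".join(parts[1:-4])
    when = datetime.fromtimestamp(int(stamp) / 1_000_000, tz=timezone.utc)
    return CoreName(comm, int(uid), boot_id, int(pid), when)


if __name__ == "__main__":
    info = parse_core_name(
        "core.scylla.996.a67bee8886f94a1482583767929e7c93.1272.1599146735000000"
    )
    print(info.pid, info.when.isoformat())
```

For the first core listed above this yields PID 1272 and 2020-09-03 15:25:35 UTC, which agrees with the `PID` and `Timestamp` fields in that report.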

@aleksbykov I checked two cores and they are both generated by the previous version that doesn't have the fix:

$ eu-unstrip -n --core=core.scylla.996.a67bee8886f94a1482583767929e7c93.1272.1599146735000000
0x200000+0x3599000 1a255ac77250f25a819655dd7ebe3a585b3a56da@0x20058c - - /opt/scylladb/libexec/scylla
$ eu-unstrip -n --core=./core.scylla.996.a67bee8886f94a1482583767929e7c93.3388.1599146911000000
0x200000+0x3599000 1a255ac77250f25a819655dd7ebe3a585b3a56da@0x20058c - - /opt/scylladb/libexec/scylla

$ eu-unstrip -n --exec=./scylla-1ecc284ca # latest version, with the fix
0x200000+0x35983d0 139e80ab70f9294fb594a452621758adaf08242e@0x20058c ./scylla-1ecc284ca . 
$ eu-unstrip -n --exec=./scylla-eb863d01a # previous version, without the fix
0x200000+0x3598490 1a255ac77250f25a819655dd7ebe3a585b3a56da@0x20058c ./scylla-eb863d01a . 

Either I packaged the wrong version, or the test was run with the previous package. Can you please check that the version of scylla is 4.2.rc3-0.20200903.1ecc284ca?
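
The build-id comparison above can also be scripted. This is a minimal sketch, assuming each `eu-unstrip -n` output line has the build-id as its second whitespace-separated field in the form `BUILDID@ADDRESS`, as in the lines quoted above. The helper names `build_id` and `matching_binaries` are made up for illustration; the input lines are copied from this comment.

```python
# Output line from the first core checked above (both cores gave the same line).
CORE_LINE = (
    "0x200000+0x3599000 1a255ac77250f25a819655dd7ebe3a585b3a56da@0x20058c"
    " - - /opt/scylladb/libexec/scylla"
)

# Output lines for the two candidate executables.
EXEC_LINES = {
    "scylla-1ecc284ca": (
        "0x200000+0x35983d0 139e80ab70f9294fb594a452621758adaf08242e@0x20058c"
        " ./scylla-1ecc284ca . "
    ),
    "scylla-eb863d01a": (
        "0x200000+0x3598490 1a255ac77250f25a819655dd7ebe3a585b3a56da@0x20058c"
        " ./scylla-eb863d01a . "
    ),
}


def build_id(line: str) -> str:
    """Return the hex build-id from one `eu-unstrip -n` output line."""
    fields = line.split()
    # Field 0 is START+SIZE, field 1 is BUILDID@ADDRESS.
    return fields[1].split("@", 1)[0]


def matching_binaries(core_line: str, exec_lines: dict) -> list:
    """Names of the executables whose build-id equals the core's build-id."""
    wanted = build_id(core_line)
    return [name for name, line in exec_lines.items() if build_id(line) == wanted]


if __name__ == "__main__":
    print(matching_binaries(CORE_LINE, EXEC_LINES))
```

With the lines above, only `scylla-eb863d01a` (the build without the fix) matches the core's build-id.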

@denesb the following rpm was installed:

< t:2020-09-03 13:37:38,700 f:base.py         l:187  c:RemoteCmdRunner      p:DEBUG > scylla-server.x86_64               4.2.rc3-0.20200901.eb863d01a  installed      

This is the previous one, the new one is here.

@denesb see the SCYLLA_BUILD field in SCYLLA-VERSION-GEN for a place to stick more info about a build.

I thought I could omit this step if I make a commit each time and then flush all SCYLLA- files, ensuring that the commit hash as well as the date is different in each binary (and hence RPM).

It just makes it more explicit. The date/time and commit hash are sufficient for uniqueness, the build field just stands out more (including the fact that it's outside a formal branch). So it's a little bit more convenient when sharing builds.

Ok, will set it in the future.

@denesb am I correct that only 4.2 is vulnerable to this crash? I'll backport it to the other versions, because it has other bad side effects, but I'll open a new issue for those side effects if the crash doesn't apply.

@denesb The run with the latest fix (gs://upload.scylladb.com/bdenes/7117/scylla-server-4.2.rc3-0.20200903.1ecc284ca.x86_64.rpm) passed.

> @denesb am I correct that only 4.2 is vulnerable to this crash? I'll backport it to the other versions, because it has other bad side effects, but I'll open a new issue for those side effects if the crash doesn't apply.

Yes, only 4.2 selects the semaphore based on the scheduling group and has the semaphore validation.

Botond Dénes notifications@github.com writes:

It would be awesome if we could get this working here: https://rr-project.org/

I don't think it supports AIO. Should be possible to add if we always
use a syscall by forcing linux-aio.cc:usable to return false like we do
for valgrind.

Cheers,
Rafael
