Installation details
Scylla version (or git commit hash): 4.4.0
Cluster size: 9
OS (RHEL/CentOS/Ubuntu/AWS AMI): Debian
We have a test deployment using 4.4.0 with TWCS for a fairly large table (1 hour time windows). We just upgraded from 4.3.y (this is a separate deployment and we transitioned from LCS at the same time, so it's a bit difficult to pinpoint whether this is new with 4.4 or a TWCS vs. LCS issue). We're seeing a high occurrence of backtraces like this:
[shard 0] storage_proxy - Exception when communicating with <another_node_ip>, to read from <table>: std::runtime_error (reader was forwarded before returning partition start Backtrace: 0x3c7720e
0x3c77680
0x3c77a08
0x38b8b52
0x38b89f6
0x38b8910
0x38b8794
0x38b868f
0x13e34d2
0x13c5326
0x13c2fde
0x1175e36
0x1177735
0x38ed6cf
0x38ee8b7
0x388c785
0x388bba6
0xdfa15c
/opt/scylladb/libreloc/libc.so.6+0x281e1
0xdf704d
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda()#1}, seastar::future>({lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda()#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::internal::do_until_state<cache::cache_flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#2}, cache::cache_flat_mutation_reader::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#3}>
--------
seastar::continuation<seastar::internal::promise_base_with_type<seastar::optimized_optional<mutation_fragment> >, flat_mutation_reader::impl::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda()#1}, flat_mutation_reader::impl::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}<seastar::optimized_optional<mutation_fragment> > >({lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<seastar::optimized_optional<mutation_fragment> >&&, {lambda()#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<mutation_reader_merger::needs_merge_tag> >, mutation_reader_merger::prepare_one(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, mutation_reader_merger::reader_and_last_fragment_kind, seastar::bool_class<mutation_reader_merger::reader_galloping_tag>)::$_3, seastar::future<seastar::optimized_optional<mutation_fragment> >::then_impl_nrvo<mutation_reader_merger::prepare_one(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, mutation_reader_merger::reader_and_last_fragment_kind, seastar::bool_class<mutation_reader_merger::reader_galloping_tag>)::$_3, seastar::future<seastar::bool_class<mutation_reader_merger::needs_merge_tag> > >(mutation_reader_merger::prepare_one(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, mutation_reader_merger::reader_and_last_fragment_kind, seastar::bool_class<mutation_reader_merger::reader_galloping_tag>)::$_3&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<mutation_reader_merger::needs_merge_tag> >&&, mutation_reader_merger::prepare_one(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, mutation_reader_merger::reader_and_last_fragment_kind, seastar::bool_class<mutation_reader_merger::reader_galloping_tag>)::$_3&, seastar::future_state<seastar::optimized_optional<mutation_fragment> >&&)#1}, seastar::optimized_optional<mutation_fragment> >
--------
N7seastar12continuationINS_8internal22promise_base_with_typeIvEEZNS_6futureINS_10bool_classIN22mutation_reader_merger15needs_merge_tagEEEE14discard_resultEvEUlDpOT_E_ZNS9_14then_impl_nrvoISD_NS4_IvEEEET0_OT_EUlOS3_RSD_ONS_12future_stateIS8_EEE_S8_EE
--------
seastar::parallel_for_each_state
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, mutation_reader_merger::prepare_next(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::$_1, seastar::future<void>::then_impl_nrvo<mutation_reader_merger::prepare_next(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::$_1, seastar::future<void> >(mutation_reader_merger::prepare_next(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::$_1&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, mutation_reader_merger::prepare_next(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::$_1&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<boost::iterator_range<mutation_fragment*> >, mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::$_6, seastar::future<void>::then_impl_nrvo<mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::$_6, seastar::future<boost::iterator_range<mutation_fragment*> > >(mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::$_6&&)::{lambda(seastar::internal::promise_base_with_type<boost::iterator_range<mutation_fragment*> >&&, mutation_reader_merger::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::$_6&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, mutation_fragment_merger<mutation_reader_merger>::fetch(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda(boost::iterator_range<mutation_fragment*>)#1}, seastar::future<boost::iterator_range<mutation_fragment*> >::then_impl_nrvo<{lambda(boost::iterator_range<mutation_fragment*>)#1}, mutation_fragment_merger<mutation_reader_merger>::fetch(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda(boost::iterator_range<mutation_fragment*>)#1}<void> >({lambda(boost::iterator_range<mutation_fragment*>)#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(boost::iterator_range<mutation_fragment*>)#1}&, seastar::future_state<boost::iterator_range<mutation_fragment*> >&&)#1}, boost::iterator_range<mutation_fragment*> >
--------
seastar::continuation<seastar::internal::promise_base_with_type<seastar::optimized_optional<mutation_fragment> >, mutation_fragment_merger<mutation_reader_merger>::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda()#1}, mutation_fragment_merger<mutation_reader_merger>::operator()(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}<seastar::optimized_optional<mutation_fragment> > >({lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<seastar::optimized_optional<mutation_fragment> >&&, {lambda()#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >, merging_reader<mutation_reader_merger>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}::operator()() const::{lambda(seastar::optimized_optional<mutation_fragment>)#1}, seastar::future<mutation_fragment>::then_impl_nrvo<seastar::optimized_optional<mutation_fragment>, {lambda(seastar::optimized_optional<mutation_fragment>)#1}<seastar::bool_class<seastar::stop_iteration_tag> > >(seastar::optimized_optional<mutation_fragment>&&)::{lambda(seastar::internal::promise_base_with_type<seastar::bool_class<seastar::stop_iteration_tag> >&&, seastar::optimized_optional<mutation_fragment>&, seastar::future_state<mutation_fragment>&&)#1}, mutation_fragment>
--------
seastar::internal::repeater<merging_reader<mutation_reader_merger>::fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}>
--------
seastar::continuation<seastar::internal::promise_base_with_type<mutation_fragment*>, flat_mutation_reader::peek(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda()#1}, flat_mutation_reader::peek(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}<mutation_fragment*> >({lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<mutation_fragment*>&&, {lambda()#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
--------
seastar::continuation<seastar::internal::promise_base_with_type<std::tuple<std::optional<clustering_key_prefix> > >, query::consume_page<(emit_only_live_rows)1, query_result_builder>(flat_mutation_reader&, seastar::lw_shared_ptr<compact_mutation_state<(emit_only_live_rows)1, (compact_for_sstables)0> >, query::partition_slice const&, query_result_builder&&, unsigned long, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, query::max_result_size)::{lambda(mutation_fragment*)#1}, seastar::future<mutation_fragment*>::then_impl_nrvo<{lambda(mutation_fragment*)#1}, query::consume_page<(emit_only_live_rows)1, query_result_builder>(flat_mutation_reader&, seastar::lw_shared_ptr<compact_mutation_state<(emit_only_live_rows)1, (compact_for_sstables)0> >, query::partition_slice const&, query_result_builder&&, unsigned long, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, query::max_result_size)::{lambda(mutation_fragment*)#1}<std::tuple<std::optional<clustering_key_prefix> > > >({lambda(mutation_fragment*)#1}&&)::{lambda(seastar::internal::promise_base_with_type<std::tuple<std::optional<clustering_key_prefix> > >&&, {lambda(mutation_fragment*)#1}&, seastar::future_state<mutation_fragment*>&&)#1}, mutation_fragment*>
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, query::querier<(emit_only_live_rows)1>::consume_page<query_result_builder>(query_result_builder&&, unsigned long, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, query::max_result_size)::{lambda(auto:1&&)#1}, seastar::future<std::tuple<std::optional<clustering_key_prefix> > >::then_impl_nrvo<{lambda(auto:1&&)#1}, query::querier<(emit_only_live_rows)1>::consume_page<query_result_builder>(query_result_builder&&, unsigned long, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, query::max_result_size)::{lambda(auto:1&&)#1}<void> >(query_result_builder&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda(auto:1&&)#1}&, seastar::future_state<std::optional<clustering_key_prefix> >&&)#1}, std::optional<clustering_key_prefix> >
--------
seastar::continuation<seastar::internal::promise_base_with_type<void>, data_query(seastar::lw_shared_ptr<schema const>, mutation_source const&, nonwrapping_interval<dht::ring_position> const&, query::partition_slice const&, unsigned long, unsigned int, std::chrono::time_point<gc_clock, std::chrono::duration<long, std::ratio<1l, 1l> > >, query::result::builder&, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >, query::query_class_config, tracing::trace_state_ptr, query::querier_cache_context)::$_25::operator()(query::querier<(emit_only_live_rows)1>&)::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda()#1}, seastar::future>({lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda()#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>
)
How should we go about debugging this further?
Thanks!
@denesb / @bhalevy is this related to the md format?
@slivne no, this is related to @kbr-'s TWCS improvements, @kbr- please advise.
For sure we need the backtrace to be resolved to get started on the investigation. @lseelenbinder can you please resolve the backtrace for us? See here for instructions.
Here's the backtrace @denesb:
/opt/scylladb/libexec/scylla: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /opt/scylladb/libreloc/ld.so, for GNU/Linux 3.2.0, BuildID[sha1]=71a06d8290443f3abdd4f67911ab332db44d9a51, stripped
0x3c77680
0x3c77a08
0x38b8b52
0x38b89f6
0x38b8910
0x38b8794
0x38b868f
0x13e34d2
0x13c5326
0x13c2fde
0x1175e36
0x1177735
0x38ed6cf
0x38ee8b7
0x388c785
0x388bba6
0xdfa15c
/opt/scylladb/libreloc/libc.so.6+0x281e1
0xdf704d
[Backtrace #0]
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > std::operator+<char, std::char_traits<char>, std::allocator<char> >(char, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) at ??:?
std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf() at ??:?
std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf() at ??:?
std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf() at ??:?
std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf() at ??:?
std::__cxx11::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf() at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::reserve(unsigned long) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::append(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) at ??:?
std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::~basic_string() at ??:?
6[Backtrace #1]
?? ??:0
?? ??:0
Not sure how helpful that is? 😄
Ahh, I remember we had a long discussion with @tgrabiec and @haaawk about whether readers can be position-forwarded when not "inside partition" (i.e. before returning partition_start / after returning partition_end), and we agreed that no, they cannot be, resulting in PR #7679. I left a check in clustering_order_reader_merger that tries to catch when this assumption is broken, and now it caught cache_flat_mutation_reader red-handed :(
Need to investigate in what circumstances exactly cache_flat_mutation_reader breaks this assumption. I recall analyzing it and not finding a possible codepath, must have missed something (or there was a recent regression?).
Another possibility would be clustering_order_reader_merger not updating _partition_start_fetched correctly, but from a quick analysis of the code I don't think that's possible. More likely the assumption was truly broken.
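To make the assumption concrete, here is a toy model of the check in Python. It is only a sketch, not Scylla's C++ code: `ToyReader`, `next_fragment` and the fragment tuples are hypothetical names for illustration. Only the flag name `_partition_start_fetched` and the error text come from this thread, and only the "before partition_start" half of the rule is modeled.

```python
class ToyReader:
    """Toy stand-in for a reader over a single partition (hypothetical, not Scylla's API)."""

    def __init__(self, rows):
        self._rows = sorted(rows)              # clustering positions, ascending
        self._pos = 0                          # index of the next row to emit
        self._partition_start_fetched = False  # set once partition_start was returned

    def next_fragment(self):
        """Return the next fragment; the first one is always partition_start."""
        if not self._partition_start_fetched:
            self._partition_start_fetched = True
            return ("partition_start", None)
        if self._pos < len(self._rows):
            row = self._rows[self._pos]
            self._pos += 1
            return ("row", row)
        return ("partition_end", None)

    def fast_forward_to(self, start):
        """Skip rows positioned before `start`; only legal once inside the partition."""
        if not self._partition_start_fetched:
            # The assumption under discussion: forwarding before partition_start is a bug.
            raise RuntimeError("reader was forwarded before returning partition start")
        while self._pos < len(self._rows) and self._rows[self._pos] < start:
            self._pos += 1
```

In this model, calling `fast_forward_to` on a fresh `ToyReader` raises immediately, while calling it after the first `next_fragment()` simply skips ahead.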
One more detail on our side. The partition / clustering keys aren't increasing timestamps. We're doing a lot of TTL expiration with lots of writes/reads, so we were hitting up against limitations / performance issues with LCS (lots of write and read amplification due to excessive compactions), so we're experimenting with TWCS since it _is_ all the same TTL.
@lseelenbinder could you show the definition of this table (describe table ...)? (can rename the columns if needed). May turn out to be useful.
Also, was the transition from LCS done using alter table?
describe table:
CREATE TABLE test.test (
a int,
b int,
c int,
d int,
e int,
f tinyint,
g timestamp,
h int,
data blob,
hash uuid,
PRIMARY KEY ((a, b, c, d), e, f)
) WITH CLUSTERING ORDER BY (e ASC, f ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}
AND comment = ''
AND compaction = {'class': 'TimeWindowCompactionStrategy', 'compaction_window_size': '1', 'compaction_window_unit': 'HOURS'}
AND compression = {'chunk_length_in_kb': '32', 'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.0
AND default_time_to_live = 0
AND gc_grace_seconds = 14400
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';
They were created this way, no alter table… since we're testing on a new deployment.
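For readers less familiar with the compaction options above, this is a rough Python sketch of what a 1-hour window means, assuming TWCS assigns data to a window by flooring its write timestamp to the window size. `WINDOW_SECONDS` and `window_start` are illustrative names, not Scylla identifiers.

```python
from datetime import datetime, timezone

# compaction_window_size = 1, compaction_window_unit = HOURS
WINDOW_SECONDS = 1 * 3600


def window_start(write_time: datetime) -> datetime:
    """Return the start of the time window that contains write_time (assumed flooring rule)."""
    ts = int(write_time.timestamp())
    return datetime.fromtimestamp(ts - ts % WINDOW_SECONDS, tz=timezone.utc)
```

Under that assumption, two writes made at 10:05 and 10:59 UTC fall into the same window, and a write at 11:00 UTC starts the next one.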
@lseelenbinder would it be possible for you to determine what query exactly causes the error to appear? Can it be reproduced with a single query at a time, or do you have to perform many concurrent queries? Are there writes also running in the meantime? Any ongoing compactions?
BTW. managed to resolve the backtrace using binary downloaded from Scylla homepage:
[Backtrace #0]
seastar::current_tasktrace() at ./build/release/seastar/./seastar/src/util/backtrace.cc:135
seastar::current_backtrace() at ./build/release/seastar/./seastar/src/util/backtrace.cc:168
backtraced<seastar::internal::backtraced<std::runtime_error> &> at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:183
std::__exception_ptr::exception_ptr std::make_exception_ptr<seastar::internal::backtraced<std::runtime_error> >(seastar::internal::backtraced<std::runtime_error>) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/exception_ptr.h:195
std::__exception_ptr::exception_ptr seastar::make_backtraced_exception_ptr<std::runtime_error, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:211
void seastar::throw_with_backtrace<std::runtime_error, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) at ./build/release/seastar/./seastar/include/seastar/util/backtrace.hh:227
seastar::on_internal_error(seastar::logger&, std::basic_string_view<char, std::char_traits<char> >) at ./build/release/seastar/./seastar/src/core/on_internal_error.cc:41
clustering_order_reader_merger::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ./mutation_reader.cc:2659
(inlined by) mutation_fragment_merger<clustering_order_reader_merger>::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ./mutation_reader.cc:143
(inlined by) merging_reader<clustering_order_reader_merger>::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ./mutation_reader.cc:638
flat_mutation_reader::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ././flat_mutation_reader.hh:448
(inlined by) operator() at ./mutation_reader.cc:773
(inlined by) decltype(auto) restricting_mutation_reader::with_reader<restricting_mutation_reader::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda(flat_mutation_reader&)#1}>(restricting_mutation_reader::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda(flat_mutation_reader&)#1}, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ./mutation_reader.cc:718
restricting_mutation_reader::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ./mutation_reader.cc:772
flat_mutation_reader::fast_forward_to(position_range, std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ././flat_mutation_reader.hh:448
(inlined by) cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >) at ././cache_flat_mutation_reader.hh:269
operator() at ././cache_flat_mutation_reader.hh:262
(inlined by) seastar::future<void> std::__invoke_impl<seastar::future<void>, cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&>(std::__invoke_other, cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:60
(inlined by) std::__invoke_result<cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&>::type std::__invoke<cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&>(std::__invoke_result&&, (cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&)...) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/bits/invoke.h:95
(inlined by) std::invoke_result<cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&>::type std::invoke<cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&>(std::invoke_result&&, (cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&)...) at /usr/lib/gcc/x86_64-redhat-linux/10/../../../../include/c++/10/functional:88
(inlined by) auto seastar::internal::future_invoke<cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&, seastar::internal::monostate>(cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}&, seastar::internal::monostate&&) at ././seastar/include/seastar/core/future.hh:1209
(inlined by) operator() at ././seastar/include/seastar/core/future.hh:1582
(inlined by) _ZN7seastar8futurizeINS_6futureIvEEE22satisfy_with_result_ofIZZNS2_14then_impl_nrvoIZN5cache26cache_flat_mutation_reader14do_fill_bufferENSt6chrono10time_pointINS_12lowres_clockENS8_8durationIlSt5ratioILl1ELl1000EEEEEEEUlvE_S2_EET0_OT_ENKUlONS_8internal22promise_base_with_typeIvEERSG_ONS_12future_stateINSK_9monostateEEEE_clESN_SO_SS_EUlvE_EEvSN_SJ_ at ././seastar/include/seastar/core/future.hh:2120
(inlined by) operator() at ././seastar/include/seastar/core/future.hh:1575
(inlined by) seastar::continuation<seastar::internal::promise_base_with_type<void>, cache::cache_flat_mutation_reader::do_fill_buffer(std::chrono::time_point<seastar::lowres_clock, std::chrono::duration<long, std::ratio<1l, 1000l> > >)::{lambda()#1}, seastar::future<void>::then_impl_nrvo<{lambda()#1}, seastar::future>({lambda()#1}&&)::{lambda(seastar::internal::promise_base_with_type<void>&&, {lambda()#1}&, seastar::future_state<seastar::internal::monostate>&&)#1}, void>::run_and_dispose() at ././seastar/include/seastar/core/future.hh:767
seastar::reactor::run_tasks(seastar::reactor::task_queue&) at ./build/release/seastar/./seastar/src/core/reactor.cc:2220
(inlined by) seastar::reactor::run_some_tasks() at ./build/release/seastar/./seastar/src/core/reactor.cc:2629
seastar::reactor::run() at ./build/release/seastar/./seastar/src/core/reactor.cc:2788
seastar::app_template::run_deprecated(int, char**, std::function<void ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:207
seastar::app_template::run(int, char**, std::function<seastar::future<int> ()>&&) at ./build/release/seastar/./seastar/src/core/app-template.cc:115
@kbr-
> would it be possible for you to determine what query exactly causes the error to appear?
We pretty much run two queries.
SELECT * FROM … WHERE … LIMIT 1 based on the partition key (it will exist and be unique or not exist).
We always have concurrency, and significant levels of it. Minimally around 200-300qps of SELECT and 100qps of INSERT across the cluster.
I didn't notice any compactions and the logged exceptions certainly appear without compaction activity (though it's entirely possible the exception occurs on one node and another node has a running compaction). I did notice after another look through the logs that the exceptions tend to cluster for a few seconds at a time and then disappear for up to a minute or two (sometimes even longer), and then return.
Thanks for the info.
> A simple SELECT * FROM … WHERE … LIMIT 1 based on the partition key (it will exist and be unique or not exist).
Out of curiosity: why the LIMIT 1 if you specify the exact key (so there can be at most one row anyway)?
> Out of curiosity: why the LIMIT 1 if you specify the exact key (so there can be at most one row anyway)?
No good reason! I think it was written as an overly cautious implementation initially and could be dispensed with now. Could this cause issues (performance or otherwise)?
> Could this cause issues (performance or otherwise)?
Should not cause correctness issues (if there was any difference between this query and a version without LIMIT 1, that would be a bug); maybe some minor performance hit since potentially additional code must be executed.
Wait, @lseelenbinder . I just noticed you said "partition key", not "primary key" (easy to confuse visually...)
If you really meant "partition key" (i.e. without the clustering key) then yes, LIMIT 1 does matter here. But if you specify the entire primary key (both partition and clustering parts), then no.
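A small Python toy model of the difference, assuming the schema above with partition key (a, b, c, d) and clustering key (e, f). `ToyTable` and its methods are hypothetical and only mimic the row-selection semantics, not how Scylla executes queries.

```python
from collections import defaultdict


class ToyTable:
    """Rows grouped by partition key and ordered by clustering key (toy model)."""

    def __init__(self):
        # partition key -> {clustering key: row}
        self._partitions = defaultdict(dict)

    def insert(self, pk, ck, row):
        self._partitions[pk][ck] = row

    def select(self, pk, ck=None, limit=None):
        """Select by partition key alone, or by the full primary key when ck is given."""
        part = self._partitions.get(pk, {})
        if ck is None:
            # Partition key only: every row of the partition, in clustering order.
            rows = [part[k] for k in sorted(part)]
        else:
            # Full primary key: at most one row can match.
            rows = [part[ck]] if ck in part else []
        return rows if limit is None else rows[:limit]
```

With two rows in one partition, `select(pk)` returns both and `select(pk, limit=1)` returns only the first, so the limit changes the result. `select(pk, ck)` returns at most one row with or without `limit=1`.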
@kbr-, oops. That was a mistake on my end. We fully specify partition and clustering key, so it's the full primary key. Our application code actually discards all but the first row anyways, so it'd only ever be a bytes-over-the-wire optimization. 😄
@lseelenbinder the fixes were backported to the 4.4 branch, meaning that they will appear in the next 4.4 patch release.
Thanks @kbr- ! Is there a projected release date for the next patch release?
@slivne would probably know that. Shlomi, when could we expect the next 4.4 patch release, assuming everything goes smoothly (no problems during testing etc.)?
Also hitting this one, do we have an update on the timeline of the 4.4 patch release?
As far as I know, if nothing goes wrong, 4.4.2 should be released in ~2 weeks from now.
@lseelenbinder @forsberg I'm a bit late, but better late than never: 4.4.2 is out.
https://groups.google.com/g/scylladb-users/c/CHOY8HRUEGM
@kbr- Thanks for your work and the notification, already running it in my cluster! :+1:
Thanks @kbr-! 👍