Scylla: Useless linearization on large data during validation, of either type bytes or string-derived, potentially cause stalls due to reclaiming

Created on 1 Oct 2020  Â·  19Comments  Â·  Source: scylladb/scylla

The type bytes_type_impl doesn't need any validation, and so no linearization. But due to a regression introduced in commit b175657ee7936a7c5ec7b4d4bd2a93617370afb0, bytes_typoe_impl no longer overrides validate(const fragmented_temporary_buffer::view&, ...), Meaning linearization is performed on bytes type even though it needs no validation. Found out this while auditing code (comparing old versions with the current one)

Large data here means data > 128k, as that's the default fragment size used by fragmented_temporary_buffer. Data smaller than that will not need any linearization.

Also, string-derived types are performing linearization needlessly, given that validation checks individual bytes and that could be done on the fragmented buffer instead.

Linearization can try to allocate a large contiguous buffer to place large data, potentially triggering memory reclaiming and causing reactor stalls.

Follow a stall backtrace coming from that linearization on a string-derived type:

 (inlined by) _M_invoke at /opt/scylladb/include/c++/7/bits/std_function.h:302
std::function<seastar::memory::reclaiming_result ()>::operator()() const at /opt/scylladb/include/c++/7/bits/std_function.h:706
 (inlined by) seastar::memory::reclaimer::do_reclaim() at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.hh:124
 (inlined by) seastar::memory::cpu_pages::run_reclaimers(seastar::memory::reclaimer_scope) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:1013
 (inlined by) seastar::memory::cpu_pages::find_and_unlink_span_reclaiming(unsigned int) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:555
 (inlined by) seastar::memory::cpu_pages::allocate_large_and_trim(unsigned int) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:578
seastar::memory::cpu_pages::allocate_large(unsigned int) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:624
 (inlined by) seastar::memory::allocate_large(unsigned long) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:1194
 (inlined by) seastar::memory::allocate(unsigned long) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:1244
operator new(unsigned long) at /usr/src/debug/scylla-enterprise-2019.1.12/seastar/core/memory.cc:1647
__gnu_cxx::new_allocator<char>::allocate(unsigned long, void const*) at /opt/scylladb/include/c++/7/ext/new_allocator.h:111
 (inlined by) std::string::_Rep::_S_create(unsigned long, unsigned long, std::allocator<char> const&) at /opt/scylladb/include/c++/7/bits/basic_string.tcc:1057
 (inlined by) std::string::_Rep::_M_clone(std::allocator<char> const&, unsigned long) at /opt/scylladb/include/c++/7/bits/basic_string.tcc:1078
 (inlined by) std::string::reserve(unsigned long) at /opt/scylladb/include/c++/7/bits/basic_string.tcc:960
std::basic_string<char, std::char_traits<char>, std::allocator<char> > boost::locale::conv::utf_to_utf<char, signed char>(signed char const*, signed char const*, boost::locale::conv::method_type) at /opt/scylladb/include/boost/locale/encoding_utf.hpp:37
string_type_impl::validate(std::experimental::fundamentals_v1::basic_string_view<signed char, std::char_traits<signed char> >) const at /usr/src/debug/scylla-enterprise-2019.1.12/types.cc:333
abstract_type::validate(fragmented_temporary_buffer::view const&) const::{lambda(std::experimental::fundamentals_v1::basic_string_view<signed char, std::char_traits<signed char> >)#1}::operator()(std::experimental::fundamentals_v1::basic_string_view<signed char, std::char_traits<signed char> >) const at /usr/src/debug/scylla-enterprise-2019.1.12/./types.hh:470
 (inlined by) _Z15with_linearizedIN27fragmented_temporary_buffer4viewEZNK13abstract_type8validateERKS1_EUlNSt12experimental15fundamentals_v117basic_string_viewIaSt11char_traitsIaEEEE_EDcRKT_OT0_ at /usr/src/debug/scylla-enterprise-2019.1.12/utils/fragment_range.hh:137
abstract_type::validate(fragmented_temporary_buffer::view const&) const at /usr/src/debug/scylla-enterprise-2019.1.12/./types.hh:469
 (inlined by) cql3::constants::marker::bind_and_get(cql3::query_options const&) at /usr/src/debug/scylla-enterprise-2019.1.12/cql3/constants.hh:172
cql3::constants::setter::execute(mutation&, clustering_key_prefix const&, cql3::update_parameters const&) at /usr/src/debug/scylla-enterprise-2019.1.12/cql3/constants.hh:194
cql3::statements::update_statement::execute_operations_for_key(mutation&, clustering_key_prefix const&, cql3::update_parameters const&, std::optional<std::unordered_map<seastar::basic_sstring<char, unsigned int, 15u, true>, std::experimental::fundamentals_v1::optional<seastar::basic_sstring<signed char, unsigned int, 31u, false> >, std::hash<seastar::basic_sstring<char, unsigned int, 15u, true> >, std::equal_to<seastar::basic_sstring<char, unsigned int, 15u, true> >, std::allocator<std::pair<seastar::basic_sstring<char, unsigned int, 15u, true> const, std::experimental::fundamentals_v1::optional<seastar::basic_sstring<signed char, unsigned int, 31u, false> > > > > > const&) at /usr/src/debug/scylla-enterprise-2019.1.12/cql3/statements/update_statement.cc:112
bug latency performance

Most helpful comment

I think overriding validate() would be a step back here, instead I'm going to remove the eager linearization from:

512      virtual void validate(const fragmented_temporary_buffer::view& view, cql_serialization_format sf) const {
   1         with_linearized(view, [this, sf] (bytes_view bv) {
   2             validate(bv, sf);
   3         });
   4     }

and have individual validating methods in the validator visitor decide whether they need linearising or not.

All 19 comments

This following patch will restore the old behavior where linearization on bytes (blob) type is avoided:

diff --git a/concrete_types.hh b/concrete_types.hh
index c80e5dd08..c3bd920a1 100644
--- a/concrete_types.hh
+++ b/concrete_types.hh
@@ -119,6 +119,7 @@ struct utf8_type_impl final : public string_type_impl {

 struct bytes_type_impl final : public concrete_type<bytes> {
     bytes_type_impl();
+    virtual void validate(const fragmented_temporary_buffer::view&, cql_serialization_format) const override { }
 };


Good catch!

@raphaelsc is out this week @bhalevy can you please have someone send this fix (we have issues in the field because of that - especially when users have large blobs)

@denesb can you please look into this? thanks

@raphaelsc did you close the issue on purpose?
If so, please provide more information.

@raphaelsc did you close the issue on purpose?
If so, please provide more information.

No, by mistake. Sorry

I think overriding validate() would be a step back here, instead I'm going to remove the eager linearization from:

512      virtual void validate(const fragmented_temporary_buffer::view& view, cql_serialization_format sf) const {
   1         with_linearized(view, [this, sf] (bytes_view bv) {
   2             validate(bv, sf);
   3         });
   4     }

and have individual validating methods in the validator visitor decide whether they need linearising or not.

I think overriding validate() would be a step back here, instead I'm going to remove the eager linearization from:

512      virtual void validate(const fragmented_temporary_buffer::view& view, cql_serialization_format sf) const {
   1         with_linearized(view, [this, sf] (bytes_view bv) {
   2             validate(bv, sf);
   3         });
   4     }

and have individual validating methods in the validator visitor decide whether they need linearising or not.

LGTM. Thanks for working on this, @denesb!

@avikivity @slivne Please, consider backporting this to all supported releases.

@eliransin @eyalgutkind @amihayday FYI

Btw you're also interested in #7393, which was extracted out of this issue
into a new one

On Wed, Oct 28, 2020, 8:36 PM vladzcloudius notifications@github.com
wrote:

@avikivity https://github.com/avikivity @slivne
https://github.com/slivne Please, consider backporting this to all
supported releases.

@eliransin https://github.com/eliransin @eyalgutkind
https://github.com/eyalgutkind @amihayday https://github.com/amihayday
FYI

—
You are receiving this because you modified the open/close state.
Reply to this email directly, view it on GitHub
https://github.com/scylladb/scylla/issues/7318#issuecomment-718268506,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAKYA4YRLP4SRUC6BOGRUWTSNCTHLANCNFSM4SA7JUWA
.

Also, #7457 still needs to be fixed for this linearization issue to be really fixed. But backporting these available fixes will already help reducing the pressure on memory allocator. I plan to fix #7457 soon

This isn't really a regression, the original code also used bytes_view. So not backporting.

This isn't really a regression, the original code also used bytes_view. So not backporting.

@avikivity Don't you think that it's worth backporting even if it's not a regression considering the comment https://github.com/scylladb/scylla/issues/7318#issuecomment-718287982 ?

This isn't really a regression, the original code also used bytes_view. So not backporting.

Actually it's a regression. The original code also used bytes_view in the validator, but before devirtualizing the types, we had bytes_type_impl overriding virtual void validate(const fragmented_temporary_buffer::view&, cql_serialization_format) const override { }, so the validation function which receives a bytes_view was never called for bytes type, meaning no linearization was performed.

Given the latency impact on users produced by linearization on validation, I think the fix for this issue in addition to the one for https://github.com/scylladb/scylla/issues/7393 are worth of backport.

Ah, I looked at strings (which always need validation).

You're right that it is a regression. But won't we have to linearize in anyway, given that we'll convert it to bytes soon?

Ah, I looked at strings (which always need validation).

You're right that it is a regression. But won't we have to linearize in anyway, given that we'll convert it to bytes soon?

If I understand correctly, the blob will be kept fragmented (within managed_bytes) in memtable, but it will indeed be linearized as soon as it reaches the SSTable layer. So we need https://github.com/scylladb/scylla/issues/7457 fixed too (we need to extend its scope to writer too, not only parser)

Ah, I looked at strings (which always need validation).
You're right that it is a regression. But won't we have to linearize in anyway, given that we'll convert it to bytes soon?

If I understand correctly, the blob will be kept fragmented (within managed_bytes) in memtable, but it will indeed be linearized as soon as it reaches the SSTable layer. So we need #7457 fixed too (we need to extend its scope to writer too, not only parser)

FWIW, confirmed this with large mem allocation detector, a large blob is only linearized when memtable is flushed into a sstable. Backporting these available fixes will reduce the # of times we linearize a given large blob from 2 to 1, which alleviates the pain to some extent, but we need to fix #7457 for the problem to be completely fixed.

Backported to 4.1, 4.2 (4.3 has it already)

Was this page helpful?
0 / 5 - 0 ratings