Scylla: Tracing: Slow Query Logging causes the "cached" budget component to underflow

Created on 19 Oct 2016  ·  10 Comments  ·  Source: scylladb/scylla

Scylla version (or git commit hash): 54069162f545f57c7031973a2479eb356dca1a2a
Cluster size: 3
AWS AMI: c3.8xlarge

_Description_
1) Enable Slow Query Logging.
2) Run cassandra-stress: cassandra-stress read n=10000000 -node <address> -rate threads=500
3) Check the collectd tracing statistics with scyllatop "*trac*". Note that the "cached_records" counter has a huge value.

bug

All 10 comments

This happens because the cached component of the tracing budget goes "negative": we return more than we have consumed. Since the counter is unsigned, the result wraps around to a huge value.
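
To illustrate the wrap-around, here is a minimal standalone sketch. It is not Scylla's code: `cached_budget`, `consume`, `give_back` and `counter_after` are hypothetical names made up for this example, and the only assumption taken from the report is that the counter is an unsigned integer.

```cpp
#include <cstdint>

// Hypothetical stand-in for the "cached" part of a tracing budget.
// These names are illustrative only and are not Scylla's actual API.
struct cached_budget {
    std::uint64_t cached_records = 0;

    // Charge n records against the budget.
    void consume(std::uint64_t n) { cached_records += n; }

    // Give n records back. Nothing here prevents returning more than was consumed.
    void give_back(std::uint64_t n) { cached_records -= n; }
};

// Consume `consumed` records, return `returned` records, and report the counter.
inline std::uint64_t counter_after(std::uint64_t consumed, std::uint64_t returned) {
    cached_budget b;
    b.consume(consumed);
    b.give_back(returned);
    return b.cached_records;
}
```

With balanced accounting, `counter_after(3, 3)` is 0. Returning just one record too many, as in `counter_after(3, 4)`, wraps the unsigned counter around to `UINT64_MAX`, which is the kind of huge value a monitoring tool would then display.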

This doesn't happen when regular tracing is enabled, so there must be a logic error in the budget handling related to Slow Query Logging.

I'll keep digging.

The issue is caused by the fact that the trace_state migrates to the other shard without using global_trace_state_ptr.
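
One plausible way to picture this, as a simplified model rather than the actual Scylla/Seastar code: each shard keeps its own unsigned budget counter, and a trace state that is charged on one shard but released on another makes the second shard return records it never consumed. All names below (`shard_budgets`, `trace_state_model`, `add_record`, `flush`, `run`) are hypothetical, and the exact accounting inside Scylla may differ.

```cpp
#include <array>
#include <cstdint>

// Illustrative model only: two "shards", each with its own unsigned counter.
struct shard_budgets {
    std::array<std::uint64_t, 2> cached_records{};  // zero-initialized, one per shard
};

// Hypothetical trace state that does not remember which shard it was charged on.
struct trace_state_model {
    std::uint64_t records = 0;
};

// A trace record is created on `shard` and charged to that shard's budget.
inline void add_record(shard_budgets& b, trace_state_model& ts, unsigned shard) {
    ++ts.records;
    ++b.cached_records[shard];
}

// The records are returned to the budget of whichever shard runs the flush.
inline void flush(shard_budgets& b, trace_state_model& ts, unsigned shard) {
    b.cached_records[shard] -= ts.records;
    ts.records = 0;
}

// Charge one record on `charge_shard`, then flush on `flush_shard`.
inline shard_budgets run(unsigned charge_shard, unsigned flush_shard) {
    shard_budgets b;
    trace_state_model ts;
    add_record(b, ts, charge_shard);
    flush(b, ts, flush_shard);
    return b;
}
```

In this model `run(0, 0)` leaves both counters at 0, while `run(0, 1)` leaves shard 0 with a record that is never returned and wraps shard 1's counter to `UINT64_MAX`. Keeping a shard-aware handle on the trace state (the role the thread attributes to global_trace_state_ptr) is what would keep charge and return on the same shard.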

I'm now looking for the specific place in the code where this happens...

The offending trace point is:

tracing::trace(_trace_state, "Reading key {} from sstable {}", *_rp.key(), seastar::value_of([&sstable] { return sstable->get_filename(); }));

@duarten FYI ;)

Yikes! I thought that had been fixed with #1678 :/

Nope, I only fixed the places in storage_proxy that I knew about. If you know of any other place, please don't hesitate to share... ;)

The patch fixing THIS problem is on the list. I hope this is the last place like this... ;)

That seems to be the only missing one!

_Conclusion_
The issue affected not only Slow Query Logging but also regular Tracing, and it was sheer luck that it didn't crash like in #1678.
So I'd classify this issue as critical and suggest merging the fix into the scylla-1.4 branch.

Looking at https://github.com/scylladb/scylla/commit/46b86ff80126c72b22a13d4245f3e11ab869c6ba, the following places in storage_proxy need the global_trace_state_ptr:

  • storage_proxy::query_singular_local
  • storage_proxy::query_mutations_locally