Scylla: Read timeout should be propagated to replica side

Created on 2 Nov 2017  Â·  21Comments  Â·  Source: scylladb/scylla

Currently replica-side reads don't time out, reads only time out on coordinator side. If a read takes more than a timeout, and clients retry, reads may keep accumulating on replica side, making overload worse than it should be. We should propagate the timeout and expire reads, like we do for writes.

Field-Tier1 bug overload stability

All 21 comments

To make the scope more clear we already propagate timeout values to
replica side all is left is a hard part to actually honor the timeout
in databse::query().

--
Gleb.

@tgrabiec isn't this is handled by the per-request timeout patchset ? Am I missing something ?

@glommer The work which has already been implemented for this will expire reads on replica side if they are blocked on sstable read concurrency limiter. They still won't expire after admitted.

On Sun, Apr 29, 2018 at 10:26:28AM -0700, Tomasz Grabiec wrote:

@glommer The work which has already been implemented for this will expire reads on replica side if they are blocked on sstable read concurrency limiter. They still won't expire after admitted.

What expiration points do we still have that do not honor the timeout?

--
Gleb.

What do you mean by "expiration point"? Actual reading from sstables doesn't respect the timeout.

On Mon, Apr 30, 2018 at 12:32:21AM -0700, Tomasz Grabiec wrote:

What do you mean by "expiration point"? Actual reading from sstables doesn't respect the timeout.

I mean points where a read can stuck without timeout. What is "actual
reading"? A read submitted to IO queue? Actually sstable read may
require more than one IO, right? So the code that tracks how long the
overall process takes is missing?

--
Gleb.

2018-04-30 9:38 GMT+02:00 Gleb Natapov notifications@github.com:

On Mon, Apr 30, 2018 at 12:32:21AM -0700, Tomasz Grabiec wrote:

What do you mean by "expiration point"? Actual reading from sstables
doesn't respect the timeout.

I mean points where a read can stuck without timeout. What is "actual
reading"? A read submitted to IO queue? Actually sstable read may
require more than one IO, right? So the code that tracks how long the
overall process takes is missing?

Yes, a read may do many IOs, on many files. Some of them in parallel, some
of them sequentually. We will not timeout at any point once the read is
admitted to start reading from sstables, past the concurrency limiting
queue.

We also saw reads queueing inside the data_query_stage execution stage when CPU is overloaded. That queue doesn't time out.

We could wrap file with something that times out after a certain point. But it wouldn't play well with #1865 caches.

I don't think this can help much. At any point in time, most reads in a loaded system are queued, not running, so this will affect only a small number of queries.

I think a better way would be setting a timer in the same layer that is doing the querier caching, then pass down a cancellation handle all the way down to seastar I/O. The timer, when triggered, cancels the read, including all outstanding I/O requests. If the page was filled and the reader is saved we kill the timer.

There is some more discussion about this on the duplicate issue on #7340 (which I'm going to close as a duplicate) , the highlights of it is:

  1. We will need I/O cancellation support in (seasrar scylladb/seastar#811)
  2. It can create a live lock where reads accumulate to such a long queue (on the read concurrency semaphore) that all of the
    reads will become stale at the moment of execution.

@vladzcloudius FYI

@amihayday @slivne @denesb @eliransin @eyalgutkind
Please, let's give it the highers priority: this issue is hurting us a lot all over the place.

@amihayday @slivne @denesb @eliransin @eyalgutkind
Please, let's give it the highers priority: this issue is hurting us a lot all over the place.

@eliransin @slivne ping

@bhalevy the work you and @xemul are doing will cover this - right ?

@slivne no, not yet.

I'll rephrase

I know that we are not done - in the sense that you still have some patches - but what I meant as following your changes - will this be closed or are we missing something in addition to that.

I'll rephrase

I know that we are not done - in the sense that you still have some patches - but what I meant as following your changes - will this be closed or are we missing something in addition to that.

I was under the assumption that timeouts were not conveyed to the replica side but apparently they are (https://github.com/scylladb/scylla/issues/2927#issuecomment-341376890), and all was left was to actually honor them.
Said that, Pavel's fix for https://github.com/scylladb/seastar/issues/859, with my project to close flat_mutation_reader:s will do that and it indeed seems like there's nothing more missing.

@denesb, @tgrabiec please confirm.

I didn't see the code in your close series which actually takes action when a reader is timed out. Your series lays the groundwork for doing so, yes, but is not enough on its own.
We will still need to glue all this together. Set up a timeout timer and when this timer fires we cancel all I/O and propagate up an exception terminating any in-progress operation on the reader, after which we close and destroy them.

Well, the timeouts on r_c_s should already work, no? As well as the timeout parameter propagated to in-progress queries?

The timeout on the r_c_s does already work, but after passing the semaphore
we hav no timeout at all effectively.

On Thu, Feb 25, 2021, 18:29 Avi Kivity notifications@github.com wrote:

Well, the timeouts on r_c_s should already work, no? As well as the
timeout parameter propagated to in-progress queries?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/scylladb/scylla/issues/2927#issuecomment-786030487,
or unsubscribe
https://github.com/notifications/unsubscribe-auth/AAKTFWINTWBXBJWOIYPKSUDTAZ3IBANCNFSM4EB5FNJQ
.

Was this page helpful?
0 / 5 - 0 ratings