With the new auto-IO throttling in Lucene's ConcurrentMergeScheduler as of Lucene 5.0 (ES 2.0), forced merges (optimize) are never throttled by default.
I think this is fair: the user asked for the index to merge down to 1 (default) segment so it should happen as quickly as possible. And users generally should force merge only when the index is otherwise idle ... 5.0 has been out for a while now and I haven't heard user complaints about this.
CMS has a setting to change the default MB/sec for forced merges, but we haven't exposed it in ES, and this is a behavior change from before (in ES 1.x), when all merge IO (natural and forced) was held to the shared default of 20 MB/sec.
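For concreteness, a minimal sketch (plain Lucene, not ES code) of the Lucene 5.x ConcurrentMergeScheduler knobs being discussed; ES wires this scheduler up internally and doesn't expose the forced-merge rate:

```java
// Minimal sketch: the CMS knobs in question, as exposed by Lucene 5.x.
import org.apache.lucene.index.ConcurrentMergeScheduler;

public class CmsThrottleSketch {
    public static void main(String[] args) {
        ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();
        cms.enableAutoIOThrottle();        // natural merges: adaptive MB/sec (on by default)
        cms.setForceMergeMBPerSec(20.0);   // forced merges: unlimited unless capped like this
    }
}
```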
Note that users can still set the old IO store throttling (indices/index.store.throttle.max_bytes_per_sec) and it will still apply to all merges.
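For reference, a hedged sketch of applying that legacy store throttle from the Java client; the exact Settings builder method names vary by ES version, so treat this as illustrative only:

```java
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.Settings;

public class LegacyStoreThrottle {
    // Illustrative only: sets the node-level store throttle, which still caps all merge IO.
    public static void applyStoreThrottle(Client client) {
        client.admin().cluster().prepareUpdateSettings()
            .setTransientSettings(Settings.builder()
                .put("indices.store.throttle.max_bytes_per_sec", "20mb")
                .build())
            .get();
    }
}
```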
So we could do nothing here, and if that turns out to be a problem, users can still use the old way (but we need to re-document this if so).
Or we could open up an MB/sec setting for forced merges ... I don't really like that: ES has too many settings, and "MB/sec" is too raw/low-level a knob for a user who's kicking off a forced merge to necessarily know the right value.
Or we could hardwire the force merge MB/sec to be a function of the current natural merge MB/sec auto IO throttle, maybe just ==.
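A sketch of that third option, assuming direct access to the scheduler (in ES this would live inside the internal merge scheduler wrapper):

```java
import org.apache.lucene.index.ConcurrentMergeScheduler;

public class HardwiredForceMergeRate {
    // "maybe just ==": forced merges inherit whatever rate the auto IO throttle
    // has currently settled on for natural merges.
    static void mirrorNaturalRate(ConcurrentMergeScheduler cms) {
        cms.setForceMergeMBPerSec(cms.getIORateLimitMBPerSec());
    }
}
```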
I think this is fair: the user asked for the index to merge down to 1 (default) segment so it should happen as quickly as possible.
+1, they asked to do this.
I'm fine with the default being no throttle, but an option to throttle might still be useful for folks serving queries from one index while force merging another. Or maybe they are force merging yesterday's index but still want to keep their indexing rate high for today's data.
Nik
+1 to not throttle explicit merges. Like Mike said, we are already drowning under settings so I think it's important not to introduce a new one.
I understand the reasoning behind not adding a setting. I also agree that in some cases one might want to get the force merge done as soon as possible, as there is a human waiting on it. I think the most common one here is when people actually run the upgrade API. On the other hand, in many other cases operators do a maintenance job in the background while users of the cluster keep on searching. For example - and I believe that's practically the single case where we _recommend_ running a force merge - when one moves yesterday's data to cold nodes in the time-based data scenario. I think it would be a great shame if this stressed those nodes' IO/CPU.
A merge (forced or otherwise) should never kill your search application. Ideally, a forced merge run on a quiet box should be able to use all available resources, but when run on a busy box, it should know how to play nice with the other resource consumers.
That'd be ideal but, if i understand correctly, then the auto-IO throttling only works with indexing, not with search?
That'd be ideal but, if i understand correctly, then the auto-IO throttling only works with indexing, not with search?
Correct: it watches for natural merge "backlog" (more than one merge of nearly the same size wants to run). It doesn't look at any search metrics.
In the ideal world, ionice would be available from Java on all OS's per-thread, so we'd be able to just set low priority for natural merges, slightly higher priority for forced ones, and top priority for search IO ... but we are not there yet :)
Maybe we could add a boolean setting, throttle_force_merges or something, defaulting to false (or maybe a dynamic default based on IOUtils.spins), and if you change that to true, all forced merges would be throttled at the same rate as natural merges?
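A hedged sketch of what that could look like (throttle_force_merges is hypothetical, not an actual ES setting, and the wiring is illustrative):

```java
import java.io.IOException;
import java.nio.file.Path;

import org.apache.lucene.index.ConcurrentMergeScheduler;
import org.apache.lucene.util.IOUtils;

public class ThrottleForceMergesSketch {
    // Hypothetical: if throttle_force_merges is unset, default it from IOUtils.spins
    // (spinning disk -> throttle); when true, forced merges run at the natural merge rate.
    static void configure(ConcurrentMergeScheduler cms, Path indexPath,
                          Boolean throttleForceMerges) throws IOException {
        boolean throttle = throttleForceMerges != null
            ? throttleForceMerges
            : IOUtils.spins(indexPath);
        if (throttle) {
            cms.setForceMergeMBPerSec(cms.getIORateLimitMBPerSec());
        } else {
            cms.setForceMergeMBPerSec(Double.POSITIVE_INFINITY); // current behavior: unthrottled
        }
    }
}
```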
I'd be happy with a parameter that defaulted to not throttling. That is a breaking change from 1.0 and needs docs and stuff, but it's cool.
We do tell people not to do forced merges on hot boxes (and I wasn't aware that they were throttled in current versions). I'd be tempted to leave out the setting for now. It can always be added if users ask for it.
I'd be tempted to leave out the setting for now. It can always be added if users ask for it.
OK, let's just leave it as is for now (forced merges not throttled) and revisit this if it proves to be a problem.
I would very much like the ability to throttle forced merges.
I just want to merge indices that aren't going to receive any more writes down to 1 segment - I don't care how long that takes. It is lower priority than natural merges, even.
Moving shards around is a hassle, and has significant costs associated with it (extra nodes, storage, data transfer).
At the moment, we pick a quiet(er) time of day, disable all alerting for half an hour (!), hit optimize and hope for the best, which is obviously a terrible solution.
^ + 1
I also would love to be able to throttle this: we have a number of small clusters, so we do not have separate nodes for old data. As there is also no way to see or cancel ongoing merges, we are pretty blind here: start off an optimize and hope it will not kill search/index operations...
I know it is documented this way, but it is still not ideal, as we also have older indices that we would like to force-merge to one segment and we do not care how long that takes; for us this is even lower priority than natural merges.