Elasticsearch: Disk watermark validation rejects apparently valid sequence of updates

Created on 19 Jan 2018 · 13 comments · Source: elastic/elasticsearch

$ elasticsearch-6.1.1/bin/elasticsearch --version
Java HotSpot(TM) 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
Version: 6.1.1, Build: bd92e7f/2017-12-17T20:23:25.338Z, JVM: 9.0.1
$ java -version
java version "9.0.1"
Java(TM) SE Runtime Environment (build 9.0.1+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.1+11, mixed mode)
$ uname -a
Darwin Davids-MacBook-Pro.local 17.3.0 Darwin Kernel Version 17.3.0: Thu Nov  9 18:09:22 PST 2017; root:xnu-4570.31.3~1/RELEASE_X86_64 x86_64

On a fresh one-node cluster, the first of these operations succeeds but the second fails:

PUT _cluster/settings
{"transient":{"cluster.routing.allocation.disk.watermark.flood_stage":"99%"}}

PUT _cluster/settings
{"transient":{"cluster.routing.allocation.disk.watermark.high":"97%"}}

The result is:

{
  "error": {
    "root_cause": [
      {
        "type": "illegal_argument_exception",
        "reason": "high disk watermark [97%] more than flood stage disk watermark [95%]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "high disk watermark [97%] more than flood stage disk watermark [95%]"
  },
  "status": 400
}

This is surprising because the flood stage watermark was set to 99% but the error message claims it is still 95%. As expected, GET _cluster/settings yields:

{
  "persistent": {},
  "transient": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "flood_stage": "99%"
            }
          }
        }
      }
    }
  }
}

Setting both watermarks together succeeds:

PUT _cluster/settings
{
  "transient": {
    "cluster.routing.allocation.disk.watermark.flood_stage": "99%",
    "cluster.routing.allocation.disk.watermark.high": "97%"
  }
}

I see the same effect when replacing transient with persistent throughout.
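The behaviour in this report can be modelled outside Elasticsearch. Below is a minimal Python sketch (hypothetical, not the actual implementation) of a validator that merges each update onto the built-in defaults (low=85%, high=90%, flood_stage=95%) instead of onto the previously applied settings:

```python
# Hypothetical model of the buggy validation: each update is checked
# against the built-in defaults, not the previously applied settings.
DEFAULTS = {"low": 85, "high": 90, "flood_stage": 95}

def validate_update(update):
    merged = {**DEFAULTS, **update}  # previously applied values are lost
    if merged["low"] > merged["high"]:
        raise ValueError(f"low disk watermark [{merged['low']}%] more than "
                         f"high disk watermark [{merged['high']}%]")
    if merged["high"] > merged["flood_stage"]:
        raise ValueError(f"high disk watermark [{merged['high']}%] more than "
                         f"flood stage disk watermark [{merged['flood_stage']}%]")
    return merged

validate_update({"flood_stage": 99})              # first update: OK
try:
    validate_update({"high": 97})                 # second update: fails
except ValueError as e:
    print(e)  # high disk watermark [97%] more than flood stage disk watermark [95%]
validate_update({"flood_stage": 99, "high": 97})  # combined update: OK
```

Because the second update is merged onto the defaults, the validator compares 97% against the default flood stage of 95% rather than the previously applied 99%, reproducing the error above; setting both values in one request sidesteps the problem.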

Labels: :Core/Infra/Settings, >bug, help wanted, v6.1.1

All 13 comments

This also happens with high versus low, and is still present in v6.2.3:

GET _cluster/settings
{
  "persistent": {
    "cluster": {
      "routing": {
        "allocation": {
          "disk": {
            "watermark": {
              "low": "50%"
            }
          }
        }
      }
    }
  }
}
```
PUT _cluster/settings
{
  "persistent": {
    "cluster.routing.allocation.disk.watermark.high": "66%"
  }
}
```

Result:

```
{
  "error": {
    "root_cause": [
      {
        "type": "remote_transport_exception",
        "reason": "[tw-r03-s01.master-elasticsearch-logging-staging.service.chcg01.consul][172.23.73.194:9300][cluster:admin/settings/update]"
      }
    ],
    "type": "illegal_argument_exception",
    "reason": "low disk watermark [85%] more than high disk watermark [66%]"
  },
  "status": 400
}
```

I can confirm this is still present in 6.3.2. Thanks to @Xylakant for pointing me here.

Looks interesting, I'd like to work on it if no one has already started to do so.

Here is a complete cURL recreation script; see the Gist here.

I found and reproduced the issue on the master branch.

When setting only the high value, it is validated against the other settings (i.e. low and flood_stage) read not from the cluster state but from their default values. The actual cluster state is enclosed in the target parameter at line 719 below, but it isn't passed to the validate API.

To me, the real issue here is that we have dependencies between three parameters (low, high and flood_stage), and the actual runtime cluster values aren't passed in to ensure these dependencies are met.

https://github.com/elastic/elasticsearch/blob/c74c46edc3c583d09e88ac5d56352fdccdedf5ab/server/src/main/java/org/elasticsearch/common/settings/AbstractScopedSettings.java#L702-L719

@jasontedor @rjernst @s1monw

I've reworked my comment on this issue to make it easier to read. Could you please have a look at it and tell me what you think? Thanks a lot.

Hi @cbismuth, you can reproduce this in a unit test as follows:

diff --git a/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java b/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java
index 342fcea7dde..b99c13e2d99 100644
--- a/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java
+++ b/server/src/test/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettingsTests.java
@@ -203,4 +203,10 @@ public class DiskThresholdSettingsTests extends ESTestCase {
         assertThat(cause, hasToString(containsString("low disk watermark [85%] more than high disk watermark [75%]")));
     }

+    public void testSequenceOfUpdates() {
+        final ClusterSettings clusterSettings = new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS);
+        new DiskThresholdSettings(Settings.EMPTY, clusterSettings); // this has the effect of registering the settings updater
+        clusterSettings.applySettings(Settings.builder().put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_DISK_FLOOD_STAGE_WATERMARK_SETTING.getKey(), "99%").build());
+        clusterSettings.applySettings(Settings.builder().put(DiskThresholdSettings.CLUSTER_ROUTING_ALLOCATION_HIGH_DISK_WATERMARK_SETTING.getKey(), "97%").build());
+    }
 }

I'm not sure what you mean by "these dependencies aren't expressed in the code" - e.g. the LowDiskWatermarkValidator declares its dependency on the high and flood_stage settings here:

https://github.com/elastic/elasticsearch/blob/31aabe4bf9ea73cbb1c21322d9e9aa2a578b41a0/server/src/main/java/org/elasticsearch/cluster/routing/allocation/DiskThresholdSettings.java#L105-L110

Sorry, I'll reword. You're right @DaveCTurner, static dependencies are expressed in the code. But when triggering validation, runtime cluster state values (containing 99% as flood_stage) aren't passed to the validator. Therefore, the validator validates input against default values.

In my previous comment, the _runtime cluster state values_ are enclosed in the target parameter of the AbstractScopedSettings#updateSettings API. And as you can see in the snippet I referenced, this parameter isn't passed at line 719 to the AbstractScopedSettings#validate API.

Is it clearer?
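Concretely, the fix amounts to merging the incoming update onto the current applied state before running the cross-setting checks. A minimal Python sketch of that idea (illustrative only, not the actual AbstractScopedSettings code):

```python
# Illustrative model (not Elasticsearch code): validate an update against
# the current applied cluster state instead of only the defaults.
DEFAULTS = {"low": 85, "high": 90, "flood_stage": 95}

def validate_with_state(update, current):
    # Precedence: defaults < current applied state < incoming update.
    merged = {**DEFAULTS, **current, **update}
    if merged["low"] > merged["high"]:
        raise ValueError(f"low disk watermark [{merged['low']}%] more than "
                         f"high disk watermark [{merged['high']}%]")
    if merged["high"] > merged["flood_stage"]:
        raise ValueError(f"high disk watermark [{merged['high']}%] more than "
                         f"flood stage disk watermark [{merged['flood_stage']}%]")
    return merged

state = {}
state = validate_with_state({"flood_stage": 99}, state)  # OK
state = validate_with_state({"high": 97}, state)         # OK: state carries flood_stage=99
```

With an empty current state the second call would still fail against the 95% default, which matches the buggy behaviour; threading the applied state through makes the sequence of updates succeed.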

> Therefore, the validator validates input against default values.

Right, that makes sense. I tried changing AbstractScopedSettings in a few sensible-looking ways and all sorts of other tests broke, but I did not put much time into working out whether it was because the tests or the change were wrong.

Yes, this API has a lot of usages. Thank you for the unit test, I'll play with it :+1:

I had a look at the available validation APIs in the AbstractScopedSettings class, and I've opened PR https://github.com/elastic/elasticsearch/pull/34184 to suggest an enriched version of the current validation.

No tests appear to be broken on my laptop.

There may be a small overhead, but I'll let you give me your opinion on that.

The reported issue is fixed by these changes; here is a sample output when high is higher than flood_stage:

{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]"
      }
    ],
    "type" : "illegal_argument_exception",
    "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]",
    "suppressed" : [
      {
        "type" : "illegal_argument_exception",
        "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]"
      },
      {
        "type" : "illegal_argument_exception",
        "reason" : "high disk watermark [98%] more than flood stage disk watermark [97%]"
      }
    ]
  },
  "status" : 400
}

@DaveCTurner I've added the unit test you suggested.
I had to modify it slightly by reusing the settings returned from the AbstractScopedSettings#applySettings API; otherwise we couldn't have kept track of the updated settings in a unit test.

Does it make sense to you or did I miss something?

Closed by #34184.
