Elasticsearch: Repository settings are not validated

Created on 12 Dec 2017 · 10 comments · Source: elastic/elasticsearch

Creating a repository consumes known settings, but does not error on invalid settings. This is especially a problem starting in 6.0, as a number of repository settings, including endpoint, were deprecated in 5.x and removed in 6.0. This is confusing for a user who had a working repo: the setting is no longer valid, yet they get no error message.
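As a hedged illustration of the report (repository name, bucket, and endpoint are hypothetical), on 6.0 a registration like the following produces no validation error for the removed endpoint setting; the setting is silently dropped:

PUT _snapshot/my_backup
{
  "type": "s3",
  "settings": {
    "bucket": "my-bucket",
    "endpoint": "http://minio.example.internal:9000"
  }
}

Instead of a validation error, the repository falls back to the default AWS endpoint, so the problem only surfaces later as a verification or snapshot failure against <bucket>.s3.amazonaws.com.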

:Distributed/Snapshot/Restore >bug help wanted

All 10 comments

We ran into this in the context of ECE 1.1.1 and ES6. Because ECE cannot update the keystore in-place, we use -Des.allow_insecure_settings=true to allow us to put the access and secret key in the repository settings (as opposed to the client settings).

However, when customers attempt to use an S3 work-alike repository, they need to set the endpoint, which is ignored when we set it in the repository settings (if the customer adds a snapshot repo after the ES6 cluster has been created). When attempting to validate the repository, ES6 does a HEAD request against the AWS endpoint <bucket>.s3.amazonaws.com (or the regional equivalent for buckets in other regions). That request fails with a 403, because the user actually wants to hit the endpoint given in the repository settings.

For other deployment scenarios (setting snapshot repo when creating the cluster), this isn't an issue since we configure the endpoint in the keystore.
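For context, a sketch of the workaround described above (repository name, credentials, and endpoint are placeholders): with es.allow_insecure_settings enabled, the credentials and endpoint are supplied directly as repository settings:

PUT _snapshot/s3_repo
{
  "type": "s3",
  "settings": {
    "bucket": "snapshots",
    "access_key": "AKIAXXXXXXXXXXXXXXXX",
    "secret_key": "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX",
    "endpoint": "http://minio.example.internal:9000"
  }
}

On 6.0 the access_key and secret_key are picked up, but the endpoint repository setting is ignored, so repository verification still goes to <bucket>.s3.amazonaws.com, which is the failure mode described above.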

We are running ECE 1.1.1 and are facing the issue described above when migrating a cluster (from ES5.6.4 to ES6.0.0) or when creating a new ES6.0.0 cluster. Our assumption is that the ES5.6.4 cluster can be created because that version only deprecates these secure settings rather than removing (or ignoring) them.

Based on the information provided by @stu-elastic this shouldn't be an issue for a new ES6 cluster, but we are facing it here as well. We believe snapshots fail because of the HEAD request against the AWS endpoint <bucket>.s3.amazonaws.com, since we are running Minio (an S3 work-alike repository) internally.

We are also observing that if the snapshot repository cannot be connected to (or found), the cluster cannot be stopped unless snapshots are disabled before stopping the cluster.

Instead of manually updating the items in the keystore after cluster creation, is there a way to at least use the configuration to create the keystore when a new cluster is created?

I have added our ES6 cluster configuration below (with some information obfuscated), as well as a screenshot from the startup of a new ES6 cluster in ECE 1.1.1, for review.

{
  "type": "s3",
  "settings": {
    "access_key": "XXXXXXXXXXXXXXXXXXXXXXX",
    "secret_key": "+XXXXXXXXXXXXXXXXXXXXX/XXXXXXXXX",
    "bucket": "snapshots",
    "region": "us-east-local",
    "endpoint": "http://172.19.xx.xxx:9000",
    "protocol": "http"
  }
}

[Screenshot: ECE 1.1.1, snapshot failure on ES6 against the AWS endpoint]

Is there a fix for this issue? And can anyone point me towards where to set allow_insecure_settings to true?

@emillykkejensen This issue is not yet fixed. I've marked it as adoptme, as I don't know of anyone working on this at the moment.

Regarding allow_insecure_settings, please ask questions like this on our forum. We use GitHub for feature requests and confirmed bug reports. That said, since the setting was mentioned above, I will respond here: you should look at the JVM options documentation.
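For what it's worth, a minimal sketch (assuming the default packaging layout): the system property can be set by adding a line to config/jvm.options, for example:

-Des.allow_insecure_settings=true

It should also be possible to pass it via the ES_JAVA_OPTS environment variable when starting the node.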

@rjernst sorry for polluting here - have asked on the forum - thanks :)

+1

Apart from having a consistent experience in Elasticsearch where we now validate all cluster and index level settings, here is another strong case for implementing validation of snapshot repository settings:

https://www.elastic.co/guide/en/elasticsearch/reference/6.2/modules-snapshots.html#_repositories

The readonly setting is a repository setting that does not conform to the usual naming convention of repository settings. Whether it is fs or aws (e.g. https://www.elastic.co/guide/en/elasticsearch/plugins/current/repository-s3-repository.html), repository-level settings are otherwise always named with words separated by underscores, e.g. buffer_size, chunk_size, storage_class, max_restore_bytes_per_sec, etc.

Because readonly doesn't use the underscore convention, we have repeatedly seen users specify the readonly setting as read_only (incorrect), because they are used to all the other repository-level settings having underscores between words.

The issue here is that when read_only is specified, snapshot/restore ignores it without throwing an error, due to the lack of validation. As a result (this tends to happen when performing or testing upgrades where clusters of different versions connect to the same snapshot repository location), a later-versioned ES cluster can end up upgrading the repository's index generation files, after which the earlier ES cluster can no longer read the older snapshots (e.g. the 2.x <-> 5.x snapshot metadata file changes). At that point it is already too late: even if the user goes back and updates the repository definition to use the correct readonly setting, the files within the repository must be modified to manually roll back the upgrade of the index generation files done by the later version, before the earlier cluster can read the older snapshots again.

Having validation of the settings would help here: users would go and double-check why read_only doesn't work (i.e. it should have been readonly).
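To make the failure mode concrete, here is a sketch (repository name and location are hypothetical). The misspelled setting is silently ignored and the repository is registered writable:

PUT _snapshot/shared_backups
{
  "type": "fs",
  "settings": {
    "location": "/mnt/shared/backups",
    "read_only": true
  }
}

whereas the intended definition uses the correctly named setting:

PUT _snapshot/shared_backups
{
  "type": "fs",
  "settings": {
    "location": "/mnt/shared/backups",
    "readonly": true
  }
}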

Same here with ES 6.4.0: the endpoint setting is simply not used, and the following host name is used instead:
Caused by: java.net.UnknownHostException: test.s3.amazonaws.com

Despite the following:

PUT _snapshot/S3_Test
{
  "type": "s3",
  "settings": {
    "bucket": "test",
    "endpoint": "MyFQDN:8082",
    "protocol": "https"
  }
}

I also tested with some other syntaxes for the endpoint like "endpoint": "https://MyFQDN:8082"

We are using a local "StorageGRID" S3-compatible appliance.

Yeah, I am seeing the same behavior with the ES 6.2.1 S3 plugin, where the endpoint parameter is ignored. I am using an S3 work-alike repository internally and I get:

Caused by: java.net.UnknownHostException: mybucket.s3.amazonaws.com

This plugin was working for me back in version 2.7 using the internally hosted s3 repo we have.

You need to add something like this:
s3.client.default.endpoint: "appliance.test.local"
to the Elasticsearch config (elasticsearch.yml) and remove the unused endpoint setting from the repository definition.

Also good to know: path-style access has been removed and only host-style access now works.
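For reference, a minimal sketch of the working setup described above (host name, port, and repository name are placeholders): the endpoint and protocol are configured per client in elasticsearch.yml,

s3.client.default.endpoint: "appliance.test.local:8082"
s3.client.default.protocol: "https"

and the repository definition then only names the bucket:

PUT _snapshot/S3_Test
{
  "type": "s3",
  "settings": {
    "bucket": "test"
  }
}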

That worked, thank you! Strange that the endpoint has to be in elasticsearch.yml and is ignored in the repository document pushed to ES.
