Elasticsearch: S3 Repo plugin "path_style_access" option insufficient alone

Created on 17 Apr 2020 · 3 Comments · Source: elastic/elasticsearch

Elasticsearch version (bin/elasticsearch --version):
Version: 7.6.0, Build: default/tar/7f634e9f44834fbc12724506cc1da681b0c3b1e3/2020-02-06T00:09:00.449973Z, JVM: 13.0.2

Plugins installed: [S3 repository plugin]

JVM version (java -version):
java version "1.8.0_231"
Java(TM) SE Runtime Environment (build 1.8.0_231-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.231-b11, mixed mode)

OS version (uname -a if on a Unix-like system):
Mac OS 10.14.6
18.7.0 Darwin Kernel Version 18.7.0: Thu Jan 23 06:52:12 PST 2020; root:xnu-4903.278.25~1/RELEASE_X86_64 x86_64

Description of the problem including expected versus actual behavior:
When creating an S3 repository with PUT _snapshot/my_s3_repository and the "path_style_access" option, this option and the "endpoint" option must both be specified in the same location. Otherwise "path_style_access" does not take effect, and ES uses the default bucket.endpoint DNS scheme instead of the expected endpoint/bucket path-style scheme.

These settings can be specified either in the elasticsearch.yml file or in the API call. Any options in the API call _should_ override what's in the yml. If I specify one of "endpoint" or "path_style_access" in the yml and the other in the API call, path-style access is not used. If both are specified in the yml, or both in the API call, the access style changes as expected.
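To make the two addressing schemes concrete, here is a minimal Python sketch of how the bucket URL differs between them. The function name `s3_bucket_url` and the `protocol` default are illustrative assumptions, not part of the plugin.

```python
def s3_bucket_url(endpoint: str, bucket: str, path_style_access: bool,
                  protocol: str = "https") -> str:
    """Build the base URL for a bucket under either addressing scheme.

    Hypothetical helper for illustration only; it is not plugin code.
    """
    if path_style_access:
        # Path style: the bucket is the first path segment (endpoint/bucket).
        return f"{protocol}://{endpoint}/{bucket}"
    # Default DNS style: the bucket is a subdomain (bucket.endpoint).
    return f"{protocol}://{bucket}.{endpoint}"


print(s3_bucket_url("your_endpoint", "your_bucket", path_style_access=True))
print(s3_bucket_url("your_endpoint", "your_bucket", path_style_access=False))
```

With an endpoint that has no wildcard DNS entry, only the first form resolves, which is why the DNS-style fallback ends in an unknown host error.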

The documentation at https://www.elastic.co/guide/en/elasticsearch/plugins/master/repository-s3-client.html does not mention this requirement.

Steps to reproduce:


  1. Set up ES with S3 Repository Plugin
  2. Add s3.client.CLIENT_NAME.endpoint: your_endpoint to elasticsearch.yml
  3. Run the following API call to add an S3 repository:
PUT /_snapshot/your_repository
{
  "type": "s3",
  "settings": {
    "bucket": "your_bucket",
    "path_style_access": "true",
    "client": "CLIENT_NAME"
  }
}

Expected results
ES attempts to connect to the bucket at your_endpoint/your_bucket.
Actual results
ES attempts to connect to your_bucket.your_endpoint and throws an error because this host does not exist.
Error message:

{
  "error": {
    "root_cause": [
      {
        "type": "repository_verification_exception",
        "reason": "[your_bucket] path is not accessible on master node"
      }
    ],
    "type": "repository_verification_exception",
    "reason": "[your_bucket] path is not accessible on master node",
    "caused_by": {
      "type": "i_o_exception",
      "reason": "Unable to upload object [tests-ngfk4rH8TWeQSysVW4oSbA/master.dat] using a single upload",
      "caused_by": {
        "type": "sdk_client_exception",
        "reason": "Unable to execute HTTP request: your_bucket.your_endpoint",
        "caused_by": {
          "type": "unknown_host_exception",
          "reason": "your_bucket.your_endpoint"
        }
      }
    }
  },
  "status": 500
}
  4. Run the following API call instead, which also sets "endpoint" in the repository settings:
PUT /_snapshot/your_repository
{
  "type": "s3",
  "settings": {
    "bucket": "your_bucket",
    "path_style_access": "true",
    "endpoint": "your_endpoint",
    "client": "CLIENT_NAME"
  }
}

Results
ES correctly attempts to connect to your_endpoint/your_bucket

Note that when testing, your_endpoint and your_bucket don't have to actually exist. The error message shows the host ES attempted to connect to, so you can see which form it is using.
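As a rough aid for that check, the following Python sketch walks the nested "caused_by" chain of an error response and reports which form was attempted. The helper names are hypothetical, and the path-style branch assumes the unresolved host would be the bare endpoint, which the report does not show.

```python
def innermost_cause(error_response: dict) -> dict:
    """Follow the nested "caused_by" chain down to the deepest cause."""
    cause = error_response["error"]
    while "caused_by" in cause:
        cause = cause["caused_by"]
    return cause


def attempted_style(error_response: dict, bucket: str, endpoint: str) -> str:
    """Classify the host ES tried to resolve (hypothetical helper)."""
    cause = innermost_cause(error_response)
    if cause.get("type") != "unknown_host_exception":
        return "unknown"
    host = cause.get("reason", "")
    if host == f"{bucket}.{endpoint}":
        return "dns-style"
    if host == endpoint:
        # Assumption: with path-style access the failing host is the endpoint itself.
        return "path-style"
    return "unknown"


# Trimmed copy of the error response from the report above.
response = {
    "error": {
        "type": "repository_verification_exception",
        "caused_by": {
            "type": "i_o_exception",
            "caused_by": {
                "type": "sdk_client_exception",
                "caused_by": {
                    "type": "unknown_host_exception",
                    "reason": "your_bucket.your_endpoint",
                },
            },
        },
    },
    "status": 500,
}

print(attempted_style(response, "your_bucket", "your_endpoint"))
```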

:Distributed/Snapshot/Restore >bug

All 3 comments

There's some interplay between the endpoint yml setting and the bucket repository setting when the path_style_access is true which I don't completely follow, but I believe this is a bug.

Hi @original-brownbear :) Can you please take a look at this?

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

I opened https://github.com/elastic/elasticsearch/pull/55439 for this one. We have a bug here: if the path style access setting is the only setting in the repository settings that overrides the client settings, it isn't picked up. Overriding any other setting as well, as in the OP, is a valid workaround.
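The behaviour described in this comment can be illustrated with a toy model in Python. This is not the Elasticsearch code and the real mechanism may differ; `ClientSettings`, `refine_buggy`, and `refine_fixed` are hypothetical names, and the sketch only shows one way an override check that forgets path_style_access would produce the reported symptoms.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ClientSettings:
    """Toy stand-in for client settings loaded from elasticsearch.yml."""
    endpoint: str = ""
    path_style_access: bool = False


def _merge(client: ClientSettings, repo: dict) -> ClientSettings:
    # Repository settings win over client settings where present.
    return replace(
        client,
        endpoint=repo.get("endpoint", client.endpoint),
        path_style_access=repo.get("path_style_access", client.path_style_access),
    )


def refine_buggy(client: ClientSettings, repo: dict) -> ClientSettings:
    # Hypothetical bug: path_style_access is not considered when deciding
    # whether the repository settings override anything.
    if "endpoint" not in repo:
        return client
    return _merge(client, repo)


def refine_fixed(client: ClientSettings, repo: dict) -> ClientSettings:
    # Every overridable setting takes part in the check.
    if not (repo.keys() & {"endpoint", "path_style_access"}):
        return client
    return _merge(client, repo)


client = ClientSettings(endpoint="your_endpoint")
only_path_style = {"path_style_access": True}
with_endpoint = {"path_style_access": True, "endpoint": "your_endpoint"}

print(refine_buggy(client, only_path_style).path_style_access)  # override lost
print(refine_buggy(client, with_endpoint).path_style_access)    # workaround
print(refine_fixed(client, only_path_style).path_style_access)  # fixed check
```

In this model, adding "endpoint" to the repository settings forces the merge to run, which mirrors the workaround from the OP.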
