Elasticsearch version (bin/elasticsearch --version):
Version: 7.6.0, Build: default/tar/7f634e9f44834fbc12724506cc1da681b0c3b1e3/2020-02-06T00:09:00.449973Z, JVM: 13.0.2
Plugins installed: [S3 repository plugin]
JVM version (java -version):
java version "1.8.0_231"
Java(TM) SE Runtime Environment (build 1.8.0_231-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.231-b11, mixed mode)
OS version (uname -a if on a Unix-like system):
Mac OS 10.14.6
18.7.0 Darwin Kernel Version 18.7.0: Thu Jan 23 06:52:12 PST 2020; root:xnu-4903.278.25~1/RELEASE_X86_64 x86_64
Description of the problem including expected versus actual behavior:
When creating an S3 repository with PUT _snapshot/my_s3_repository and the "path_style_access" option, this option and the "endpoint" option must both be specified in the same location. Otherwise "path_style_access" does not take effect, and ES uses the default bucket.endpoint DNS scheme instead of the expected endpoint/bucket path style access scheme.
These settings can be specified either in the elasticsearch.yml file or in the API call. Any options in the API call _should_ override what's in the yml. If I specify one of "endpoint" or "path_style_access" in the yml and the other in the API call, path style access is not used. If both are specified in the yml, or both in the API call, the access style changes as expected.
The documentation at https://www.elastic.co/guide/en/elasticsearch/plugins/master/repository-s3-client.html does not mention this requirement.
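To make the two addressing schemes concrete, here is a minimal Python sketch of how the target URL differs between them. The function name `s3_object_url`, the `https` default, and the object key are assumptions made for illustration; this is not how the S3 client or the plugin builds URLs internally.

```python
def s3_object_url(endpoint: str, bucket: str, key: str,
                  path_style: bool, scheme: str = "https") -> str:
    """Illustrative only: build the URL for an object under each addressing scheme."""
    if path_style:
        # Path style: the bucket is the first path segment after the endpoint.
        return f"{scheme}://{endpoint}/{bucket}/{key}"
    # Default (virtual-hosted) style: the bucket is a DNS label in front of the endpoint.
    return f"{scheme}://{bucket}.{endpoint}/{key}"


if __name__ == "__main__":
    print(s3_object_url("your_endpoint", "your_bucket", "master.dat", path_style=True))
    print(s3_object_url("your_endpoint", "your_bucket", "master.dat", path_style=False))
```

With the default scheme the bucket name becomes part of the host name, which is why a missing "path_style_access" shows up as a DNS lookup for your_bucket.your_endpoint.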
Steps to reproduce:
Add s3.client.CLIENT_NAME.endpoint: your_endpoint to elasticsearch.yml, then run:
PUT /_snapshot/your_repository
{
"type": "s3",
"settings": {
"bucket": "your_bucket",
"path_style_access": "true",
"client": "CLIENT_NAME"
}
}
Expected results
ES attempts to connect to your_endpoint/your_bucket.
Actual results
ES attempts to connect to your_bucket.your_endpoint and throws an error because this host does not exist.
Error message:
{
"error": {
"root_cause": [
{
"type": "repository_verification_exception",
"reason": "[your_bucket] path is not accessible on master node"
}
],
"type": "repository_verification_exception",
"reason": "[your_bucket] path is not accessible on master node",
"caused_by": {
"type": "i_o_exception",
"reason": "Unable to upload object [tests-ngfk4rH8TWeQSysVW4oSbA/master.dat] using a single upload",
"caused_by": {
"type": "sdk_client_exception",
"reason": "Unable to execute HTTP request: your_bucket.your_endpoint",
"caused_by": {
"type": "unknown_host_exception",
"reason": "your_bucket.your_endpoint"
}
}
}
},
"status": 500
}
With both settings specified in the API call instead:
PUT /_snapshot/your_repository
{
"type": "s3",
"settings": {
"bucket": "your_bucket",
"path_style_access": "true",
"endpoint": "your_endpoint",
"client": "CLIENT_NAME"
}
}
Results
ES correctly attempts to connect to your_endpoint/your_bucket
Note that when testing, your_endpoint and your_bucket don't have to actually exist. The error messaging will tell you what URL ES attempted to connect to, so you can see which form it is hitting.
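When checking which form ES used, the host it tried to reach sits at the bottom of the nested "caused_by" chain in the error response. The following sketch walks that chain; the helper name `innermost_cause` is hypothetical, and the sample dictionary is a trimmed copy of the error message shown above.

```python
def innermost_cause(response: dict) -> dict:
    """Follow nested "caused_by" entries and return the deepest one."""
    cause = response["error"]
    while "caused_by" in cause:
        cause = cause["caused_by"]
    return cause


# Trimmed copy of the error response from this report.
sample = {
    "error": {
        "type": "repository_verification_exception",
        "reason": "[your_bucket] path is not accessible on master node",
        "caused_by": {
            "type": "i_o_exception",
            "caused_by": {
                "type": "sdk_client_exception",
                "reason": "Unable to execute HTTP request: your_bucket.your_endpoint",
                "caused_by": {
                    "type": "unknown_host_exception",
                    "reason": "your_bucket.your_endpoint",
                },
            },
        },
    },
    "status": 500,
}

if __name__ == "__main__":
    cause = innermost_cause(sample)
    print(cause["type"], cause["reason"])
```

A reason of your_bucket.your_endpoint indicates the default DNS scheme was used, so "path_style_access" did not take effect.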
Provide logs (if relevant):
There's some interplay between the endpoint yml setting and the bucket repository setting when the path_style_access is true which I don't completely follow, but I believe this is a bug.
Hi @original-brownbear :) Can you please take a look at this?
Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)
I opened https://github.com/elastic/elasticsearch/pull/55439 for this one. We have a bug here: if the path style access setting is the only client setting overridden in the repository settings, the override is not picked up. Overriding any other setting alongside it, e.g. the endpoint as in the OP, is a valid workaround.
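One way such a symptom can arise is a "nothing changed" shortcut that compares some overridden fields but not others. The sketch below is a hypothetical Python model of that pattern, not the Elasticsearch source or the change made in the pull request; `ClientSettings`, `refine_buggy`, and `refine_fixed` are invented names, and only two settings are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientSettings:
    # Hypothetical stand-in for client settings read from elasticsearch.yml.
    endpoint: str = ""
    path_style_access: bool = False


def refine_buggy(client: ClientSettings, overrides: dict) -> ClientSettings:
    new_endpoint = overrides.get("endpoint", client.endpoint)
    new_path_style = overrides.get("path_style_access", client.path_style_access)
    # Modelled bug: the shortcut ignores path_style_access, so an override of
    # that setting alone is treated as "no change" and silently dropped.
    if new_endpoint == client.endpoint:
        return client
    return ClientSettings(new_endpoint, new_path_style)


def refine_fixed(client: ClientSettings, overrides: dict) -> ClientSettings:
    new_endpoint = overrides.get("endpoint", client.endpoint)
    new_path_style = overrides.get("path_style_access", client.path_style_access)
    # Every overridable field takes part in the comparison.
    if new_endpoint == client.endpoint and new_path_style == client.path_style_access:
        return client
    return ClientSettings(new_endpoint, new_path_style)


if __name__ == "__main__":
    from_yml = ClientSettings(endpoint="your_endpoint")
    print(refine_buggy(from_yml, {"path_style_access": True}))
    print(refine_fixed(from_yml, {"path_style_access": True}))
```

In this model, passing the endpoint in the repository settings as well makes the comparison fail and the new settings get built, which matches the workaround described in the OP.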