Elasticsearch version (bin/elasticsearch --version): 6.1.1
Plugins installed: N/A
JVM version (java -version): openjdk version "1.8.0_151"
OS version (uname -a if on a Unix-like system): 4.13.16-100.fc25.x86_64
Description of the problem including expected versus actual behavior:
The method search of the RestHighLevelClient throws a checked IOException which needs to be caught and handled by the caller. However, an invalid query or any problem along the way can also trigger a runtime exception ElasticsearchException.
Is this intentional or a bug? What do the different exceptions mean and how should they be handled? Should I simply catch both and handle them both as a failure in my application? That seems rather ugly and arbitrary having to define 2 different exceptions around every call to the RestHighLevelClient.
To further complicate things, a ResponseException is a subclass of IOException. When caused by an invalid exception, this exception is caught somewhere deep inside the ES client code, and seemingly converted to an unchecked, runtime ElasticsearchException which is then raised up further.
The documentation, which is usually very clear about the RestHighLevelClient methods, makes no mention about proper exception handling. Could this be added at some point?
Steps to reproduce:
Take this snippet without the client creation boilerplate. client is a RestHighLevelClient. Make sure index is a pre-existing index with a numeric typed field name. (In my case 'example' and 'salary', respectively)
SearchSourceBuilder source = new SearchSourceBuilder();
// Field is a numeric type, value is Long.MAX_VALUE + 1
source.query(QueryBuilders.rangeQuery(field).from("0").to("9223372036854775808"));
try {
client.search(new SearchRequest(new String[] { index }, source));
} catch (IOException e) {
e.printStackTrace();
}
This throws an unchecked runtime ElasticsearchException. It feels like it is completely useless to catch the IOException, since that never triggers.
Consider the following console log output:
Exception in thread "main" ElasticsearchStatusException[Elasticsearch exception [type=search_phase_execution_exception, reason=all shards failed]]
at org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177)
at org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:573)
at org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:549)
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:456)
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:429)
at org.elasticsearch.client.RestHighLevelClient.search(RestHighLevelClient.java:368)
at ESException.main(ESException.java:21)
Suppressed: org.elasticsearch.client.ResponseException: method [GET], host [http://localhost:9200], URI [/example/_search?typed_keys=true&ignore_unavailable=false&expand_wildcards=open&allow_no_indices=true&search_type=query_then_fetch&batched_reduce_size=512], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"query_shard_exception","reason":"failed to create query: {\n \"range\" : {\n \"salary\" : {\n \"from\" : \"0\",\n \"to\" : \"9223372036854775808\",\n \"include_lower\" : true,\n \"include_upper\" : true,\n \"boost\" : 1.0\n }\n }\n}","index_uuid":"XgTO5t6OQNKmL_b4dnphyA","index":"example"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"example","node":"nopt0UPpQWuew42N0ygbNg","reason":{"type":"query_shard_exception","reason":"failed to create query: {\n \"range\" : {\n \"salary\" : {\n \"from\" : \"0\",\n \"to\" : \"9223372036854775808\",\n \"include_lower\" : true,\n \"include_upper\" : true,\n \"boost\" : 1.0\n }\n }\n}","index_uuid":"XgTO5t6OQNKmL_b4dnphyA","index":"example","caused_by":{"type":"illegal_argument_exception","reason":"Value [9223372036854775808] is out of range for a long"}}}]},"status":400}
at org.elasticsearch.client.RestClient$1.completed(RestClient.java:357)
at org.elasticsearch.client.RestClient$1.completed(RestClient.java:346)
at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119)
at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436)
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326)
at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
at java.lang.Thread.run(Thread.java:748)
You can see the actual checked subclass of IOException, ResponseException is surpressed and instead a runtime ElasticsearchException is thrown.
Right now, it seems like the only way to handle exceptions properly, is very ugly, mixing both checked and unchecked exceptions together. Is there a better (intended?) way or is this a bug of the new client?
try {
client.search(new SearchRequest(new String[] { index }, source));
} catch (IOException | ElasticsearchException e) {
// handle your exception
}
cc @elastic/es-search-aggs
I ran into a similar behavior the exception reporting is inconsistent. I ended up putting in some logic to extract the details of the exceptions and still working to tweak this a bit:
public void handleException(ActionRequest actionRequest, Exception e) {
// Some exceptions are suppressed due to try-with-resources
for (Throwable t : e.getSuppressed()) {
if (t instanceof ResponseException) {
ResponseException responseException = (ResponseException) t;
try {
JsonNode exceptionNode =
objectMapper.readTree(responseException.getResponse().getEntity().getContent());
Optional<JsonNode> rootCause = Optional.ofNullable(exceptionNode.get("error"))
.map(errorNode -> errorNode.get("root_cause"));
Optional<JsonNode> causedBy = searchForEntity(exceptionNode, "caused_by");
String type = getNodeTextValue(causedBy, rootCause, "type");
String reason = getNodeTextValue(causedBy, rootCause, "reason");
LOG.error("Error executing action - cause type: {}, reason: {}, action: {}",
type, reason, actionRequest.getClass(), e);
throw new SearchException(responseException.getResponse().getStatusLine().getStatusCode(), t);
} catch (IOException ioe) {
LOG.warn("Unable to convert response exception content", ioe);
}
}
}
LOG.error("Error executing elasticsearch action {}", actionRequest.getClass(), e);
throw new SearchException(HttpStatus.INTERNAL_SERVER_ERROR.value(), e);
}
Pinging @elastic/es-core-infra
The method search of the RestHighLevelClient throws a checked IOException which needs to be caught and handled by the caller. However, an invalid query or any problem along the way can also trigger a runtime exception ElasticsearchException.
So, in theory, the IOException in general and the ResponseException in particular are meant to communicate that the remote Elasticsearch cluster failed. The ElasticsearchException is kind of a side effect of reusing the Request objects from the transport client which are in turn reused from the server code. I think it is fairly reasonable for us to throw it if you build the request in an invalid way. Or for us to throw some unchecked exception. It'd probably be better if it were more specific but ElasticsearchException is what we have right now.
If we're throwing ElasticsearchException when the server does something wrong that seems like something we should fix sooner rather than later.
@javanna, what do you think?
To clarify the behaviour: ResponseException is what the low-level client throws whenever a 4xx or 5xx error code is returned. In such cases, the high-level client also tries to parse the response body, as Elasticsearch returns the error details, into a generic ElasticsearchException and throws that one, hence suppresses the original ResponseException. This is done consciously, I do not see it as a bug but I am curious to hear what we could do differently.
Although ResponseException will always appear as suppressed and an ElasticsearchException will be thrown instead, IOException can still be thrown in case the request times out and similar cases where we don't have a response coming back from Elasticsearch.
We may also rarely throw IOException in case we fail parsing the response back in the high-level REST client code, which is to be considered a bug if it happens against the supported Elasticsearch versions.
@mveitas I don't follow why you need to parse the response body returned with the suppressed ResponseException, that is info that should already be returned with the ElasticsearchException, the idea is that the parsing is all done in the high-level REST client, that is why we suppress the low-level ResponseException and return a new exception in its place. Can you clarify?
on having to catch two different exceptions, I actually find this good, as they are thrown in two very different cases like I explained above. On ElasticsearchException being unchecked, we just went with what we had in the transport client while for communication errors we went with what the low-level REST client and underlying apache http client already throw.
@javanna The ResponseException is thrown whenever the status is not 2xx (from https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-low-usage-responses.html), as opposed to only for 4xx and 5xx as you have mentioned.
The reason why we need to parse the ResponseException is because that is the place where you get the status code (2xx, 3xx, 4xx, etc.), which is not passed on directly as a retrievable field in ElasticsearchException(?).
I need to read the status codes to understand when to retry the call (ElasticsearchException isn't categorised into Retryable or Non-Retryable exception). One recommendation I am following for retrying is from here: https://discuss.elastic.co/t/knowing-when-and-when-not-to-retry-a-request-based-on-elasticsearchexception-or-ioexception-with-the-resthighlevelclient/183779/2
Other than parsing the Low level client's ResponseException, I don't think there is any other way to read the status (error) codes?
The ResponseException is thrown whenever the status is not 2xx (from https://www.elastic.co/guide/en/elasticsearch/client/java-rest/current/java-rest-low-usage-responses.html), as opposed to only for 4xx and 5xx as you have mentioned.
That is what I meant, but for simplicity I only mentioned the most common categories of status codes: 2xx. 4xx and 5xx.
I see what you mean: ElasticsearchException allows you to retrieve the status but it's not the correct one. Though you should be able to retrieve the original ResponseException from the suppressed exceptions, without needing to parse any json?
@javanna Yes. That is what I meant! Also, is there any plan to add the same error codes to the HighLevelRestClient as well, to have the correctness?
Just to clarify, there exists a status() method in the ElasticsearchException. But, the status is set in code, rather than passing on the status from the underlying Low Level Client (?) (which is present in the suppressed ResponseException). So, this is not the exact status which ES is passing on.
For reference, this is the status method, where the status codes are implied through custom logic: https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/ElasticsearchException.java#L224
@javanna Do you think the actual status code should also be a part of the ElasticsearchException, for correctness? I think that there maybe be some reason for the current behaviour, which the original contributors must have thought about. But, I am not able to arrive at that reason! Would love to hear those reasons!
@r862 I don't follow, if a ResponseException is thrown you should be getting back an ElasticsearchStatusException, like in the stacktrace above, hence calling status() should return the proper status code. If that is not the case, would you mind opening a new issue and post more info about what happens in your case please. including your stacktrace? Thanks!
Most helpful comment
I ran into a similar behavior the exception reporting is inconsistent. I ended up putting in some logic to extract the details of the exceptions and still working to tweak this a bit: