Elasticsearch: Making RestHighLevelClient async requests abortable

Created on 24 Jul 2019  Â·  21Comments  Â·  Source: elastic/elasticsearch

Describe the feature:

I was looking at how the asynchronous requests work in the RestHighLevelClient and noticed that there currently is no way to abort a request. Apache HttpClient has an abort() method but the Apache request is burried in an InternalRequest object in the RestHighLevelClient and not exposed.

What I'm looking for is a way to use Kotlin's suspendCancellableCoRoutine (see here for an explanation) with the client in my kotlin client. See here for how that works currently (uses the non cancellable suspendCoRoutine).

The advantage of this would be that Kotlin cancels the other subtasks in a co-routine context when one of them throws an error. The advantage of this would be not creating more connections to a hanging Elasticsearch in case you are doing multiple asynchronous calls, freeing up file handles of already running connections that are no longer relevant, etc.

From my limited understanding of the code base, it seems the abort method could be exposed from the InternalRequest, which then in turn could be exposed to the ActionRequest. So the user could create a request and then abort it later. And additional change that would make sense is to add an onAbort to the ActionListener interface.

:CorFeatureJava High Level REST Client >enhancement

Most helpful comment

Heya I am looking at this, as it may also be useful for #43332 where we are exposing automatic cancellation of search tasks whenever the underlying connection gets closed. It would be nice to expose the ability to abort a request using the java client, though there's a couple of steps to get there if we want to expose this to both low-level and high-level clients. Will keep you posted.

All 21 comments

Pinging @elastic/es-core-features

I looked some more; the easiest way to do this would be to simply return the Optional<HttpRequestBase> instance from all of the *Async methods. These currently return void so this should not pose any compatibility issues since the user can safely ignore the return value unless they want to abort it for whatever reason. It would return Optional.empty in case the request never executed because of validation issues and the running request otherwise. The user can then decide to call abort on this if needed.

For the callbacks from the Async methods a few modifications would be needed:

In org.elasticsearch.client.RestClient#performRequestAsync(...) there's a FutureCallback with a cancelled method that simply calls org.elasticsearch.client.RestClient.FailureTrackingResponseListener#onDefinitiveFailure, which calls org.elasticsearch.client.ResponseListener#onFailure.

I don't think the cancelled method is ever called because the future is not used anywhere. But technically it should call abort on the request. Additionally, when abort is called there should be onAbort handlers in the listeners.

I think I could do a PR for this. However, there are a couple of hundred of these calls in the different high level clients. So before I do this, I would like some consensus on whether this is desirable or if there are any concerns.

I think you analysis is right, but I don't think it should return an Optional<HttpRequestBase>.
Looking to the code the rest client calls the apache http async client that has already a cancellation handle (that is what you mention that it's likely that is not actually cancelled).

   /**
     Initiates asynchronous HTTP request execution using the default
     * context.
     * <p>
     * The request producer passed to this method will be used to generate
     * a request message and stream out its content without buffering it
     * in memory. The response consumer passed to this method will be used
     * to process a response message without buffering its content in memory.
     *
     * @param <T> the result type of request execution.
     * @param requestProducer request producer callback.
     * @param responseConsumer response consumer callaback.
     * @param callback future callback.
     * @return future representing pending completion of the operation.
     */
    <T> Future<T> execute(
            HttpAsyncRequestProducer requestProducer,
            HttpAsyncResponseConsumer<T> responseConsumer,
            FutureCallback<T> callback);

but in the Rest client the Future is ignored and void is returned (that's what you pointed out).

So I also think this Future<T> should not be ignored: I think that Optional is not a good idea.
As the API is wrapping all Http types probably also this Future should be wrapped to some internal type.

So it could be something like:

1: expose the Future in the private method here:

https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClient.java#L290

private Future<HttpResponse> performRequestAsync(final NodeTuple<Iterator<Node>> nodeTuple final InternalRequest request, final FailureTrackingResponseListener listener) { ...

  1. have a cancellation token class such as (something like):
    static class CancellationToken {

        private final Future<HttpResponse> response;
        private final Exception cause;

        CancellationToken(Future<HttpResponse> response) {
            this.response = response;
            this.cause = null;
        }

        CancellationToken(Exception cause) {
            this.response = null;
            this.cause = cause;
        }

        public void cancel(boolean mayInterruptIfRunning) throws Exception {
            if (cause!= null){
               throw  RestClient.extractAndWrapCause(cause);
            }
            response.cancel(mayInterruptIfRunning);
        }
    }
  1. wrapping the Future in the Cancellation token

https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClient.java#L280

doing something like

  public CancellationToken performRequestAsync(Request request, ResponseListener responseListener) {
        CancellationToken token = null;
        try {
            FailureTrackingResponseListener failureTrackingResponseListener =
                  new FailureTrackingResponseListener(responseListener);
            InternalRequest internalRequest = new InternalRequest(request);
            Future<HttpResponse> response =  performRequestAsync(nextNodes(), internalRequest, failureTrackingResponseListener);
            token = new CancellationToken(response);
        } catch (Exception e) {
            token = new CancellationToken(e);
            responseListener.onFailure(e);
        }
        return token;
    }

What you think?

A Future could work. The key thing would be to call abort on the request so it releases the file handle and closes the socket client side.

The FailureTrackingResponseListener could take care of this from it's onCancel method but it would need the InternalRequest as a constructor arg.

The CancellationToken looks a bit like a CompletableFuture. Java 9 apparently introduced a CompletableFuture for this purpose: https://stackoverflow.com/questions/49432257/completablefuture-immediate-failure. Given that ES still needs to run on Java 8, it may be necessary to implement something similar to that and switch in the future.

The question is what that Future should expose because we don't want user's to actually interact with the response. One option could be to simplify your CancellationToken to take a CompletableFuture as the only constructor arg and keep that field private.

update it seems CompletableFuture was added in java 8 and has a completeExceptionally method.

Checking a bit the code I understood ( I think I did) that:

  1. the sync calls are actually async calls where it is done "get".
    https://github.com/elastic/elasticsearch/blob/master/client/rest/src/main/java/org/elasticsearch/client/RestClient.java#L214
    Ideally we could abort the call using the method reset on the inner HttpAsyncRequestProducer.
    Given the structure of the method that tries on multiple nodes it looks complicated this change.

  2. The async calls are easier: to abort the call we should cancel the Future: we cannot call abort on the wrapped HttpRequest because it does not support it (it is in the hierarchy used in the synchronous calls).
    So I would wrap the Future in a CancellationToken exposing the cancel.

I forked the repo and I did the change here, so that is veisible and can be evaluated

https://github.com/jesinity/elasticsearch/tree/rest-async-cancellation-token

Heya I am looking at this, as it may also be useful for #43332 where we are exposing automatic cancellation of search tasks whenever the underlying connection gets closed. It would be nice to expose the ability to abort a request using the java client, though there's a couple of steps to get there if we want to expose this to both low-level and high-level clients. Will keep you posted.

@jesinity agreed; this is only relevant for async calls. Calling abort on a synchronous connection would be less relevant; all that is needed there is proper cleanup, which is already taken care off.

I looked at your fork and it seems fine. I also looked at the code exposing the future, we would indeed need to expose the original apache httpclient request in some way in order to be able to cancel it. Instead of the CancelToken, I recommend using a CompletableFuture instead as this seems intended for this use case.

It would be great if we could make our ActionFuture extend CompletableFuture rather than Future, it would allow us not only to use the cancel method, but add composability for users of the HLRC.

I spent some time looking at this, and it was enough to figure that adding the ability to cancel requests is quite a change. I would strive for not breaking backwards compatibility, also exposing it to low-level REST client is one challenge, while exposing it to the high-level REST client is a different one. We have made a design choice when developing the low-level REST client, to not return a future in our async methods, as accepting the listener was preferred.

We've had both listener and future in our transport client api for a long time and many many people found that confusing, see #9201 for the discussion around this topic. Maybe for this change it would make sense to return a future, but I want to raise awareness around the reasons why we have what we have today.

As this is quite an involving change that will require discussions and multiple iterations, I leave this to the core/features team as I don't have bandwidth to work on it.

My original proposal was to simply return something on async methods in both the low level and highlevel rest client where it currently returns void. Whether that's a Future of something or s CancelToken does not matter, as long as it exposes a cancel method that calls the abort method on the underlying request. This is 100% backwards compatible because ignoring the return value is perfectly fine. Doing something with it is optional. It should also be possible to do this such that the end user is shielded from direct access to the apache request or other internal state, which I agree is something that is not desirable.

I agree that the call back architecture in place kind of conflicts with returning a Future. You could feasibly add a cancel method to that but that compromises the design of that in the sense that call back classes probably should not expose logic like that or hold state (like the original request). IMHO returning something where it currently returns void is probably the least invasive and most intuitive.

I actually started doing the PR for this last week before realizing this needs more discussion. IMHO the main thing to decide is what to return instead of void. A CompletableFuture of something (e.g. a boolean indicating success or failure, etc.) or a CancelToken like @jesinity proposed would both work. The important thing is to expose a cancel method that calls abort on the apache request so higher level frameworks integrating elasticsearch async stuff (Koltin with co-routines, rx java, etc.) can do the right thing instead of dealing with needlessly blocked threads.

heya, I had another look at this and chatted to @jimczi about it. We came up with a reasonably easy way to expose the ability to cancel requests in the low-level client. It is based on the cancellation token idea, though I did not go for wrapping the future as the future changes every time we retry a request. Instead, it is sufficient to call abort on the underlying http request, which we reuse throughout the different retries. The only caveat is that we need to make sure that when cancel is called between retries, we still act on it (if we don't do it specifically we may end up calling abort on a request that's about to be sent, which will have no effect).

Like @jillesvangurp mentioned, this is backwards compatible as we only change the return value of performAsyncRequest from void to Cancellable. It is also quite easy to expose in the high-level REST client as all we need to do is return a Cancellable anywhere we call performRequestAsync.

I would prefer not touching ActionListener and not having a specific callback for when a request is cancelled, instead onFailure will be called and a CancellationException will be raised through it.

Could you please have a look at the PR and let me know what you think? #45379

heya @jillesvangurp now that #45379 is in, would you like to open a PR to expose the feature to the high-level REST client? Otherwise I will get to it at some point ;)

@javanna Sure, I'll try to find some time to do this in the next few days. This should be straight forward but there quite a few async methods that need to be updated for this.

sounds good!

@javanna see #45688

I hope I did not miss any calls; there were a lot of them.

Actually did miss some and I broke a test that would have pointed that out to me (nice test!). All fixed now. gradle check is now happy which should mean all end points have been updated.

It would be nice to get this in 7.4.0. I'd like to integrate this into my kotlin client ASAP.

Heya, I have bad news. We found a pretty bad bug in the apache client which
causes the io reactor the crash when cancelling requests. I have been
looking into possible workarounds, but it is very likely that we will back
out the API change for the low-level client till an adequate fix is made.
Will keep you posted.

On Mon, Aug 19, 2019, 11:17 Jilles van Gurp notifications@github.com
wrote:

It would be nice to get this in 7.4.0. I'd like to integrate this into my
kotlin client ASAP.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
https://github.com/elastic/elasticsearch/issues/44802?email_source=notifications&email_token=AAGLHTHQH77YM7SA37BP333QFJQLJA5CNFSM4IGOY272YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4SH5HA#issuecomment-522485404,
or mute the thread
https://github.com/notifications/unsubscribe-auth/AAGLHTDSFDUIBIJAOH7Z2UDQFJQLJANCNFSM4IGOY27Q
.

OK, that's disappointing. Hopefully this can be resolved quickly on their side.

See #45577 and https://issues.apache.org/jira/browse/HTTPASYNC-152

@javanna one option here is that we simply comment out the call to request.abort for now and get all the rest in until things are fixed in httpclient. From an API point of view having cancel plumbing in place is probably a good thing and I'd hate to have to do that pr for the high level client again as doing that was very tedious.

Thanks for posting the links. I am testing a possible fix as we speak, but it's unlikely that it will make it for 7.4. I haven't yet backported #45379 to 7.x anyways, so there is nothing to back out of the 7.4 release at the moment. I would not be happy with leaving all the cancellation work in but having it no-op due to this bug. Maybe we can though get your PR in anyways on master and only postpone it's backport. Let's wait a couple of days and see how long it takes for a fix to get in and possibly get released.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

clintongormley picture clintongormley  Â·  3Comments

rpalsaxena picture rpalsaxena  Â·  3Comments

dadoonet picture dadoonet  Â·  3Comments

abtpst picture abtpst  Â·  3Comments

ttaranov picture ttaranov  Â·  3Comments