Graphql-ruby: First-class async support

Created on 23 Sep 2016  Â·  11Comments  Â·  Source: rmosolgo/graphql-ruby

It would be nice to provide first-class support to asynchronous resolvers. It's true, CRuby can't do too much in parallel, but what it _can_ do (I/O) might be worth supporting.

168 (DeferredExecution) may provide a platform for this: the execution strategy maintains a queue of "threads" and resolves them one at a time. Perhaps resolvers could signal to graphql-ruby that they're "working", and graphql-ruby could push the thread to the end of the queue. Then, when it's done with other things, it could revisit that thread and see if it's ready (and, if it's the only thread left, wait for it).

I'd like to support this behavior in a general way without requiring any dependencies. I know graphql-batch uses promise.rb, and personally, I'm interested in concurrent-ruby's Future.

I can imagine general support working like this:

  • You provide a GraphQL-friendly wrapper around the async "primitive" of your choice. Perhaps it's an object that responds to .ready?, .value and .wait.
  • You tell GraphQL that the resolve function will return an async response, eg

``` ruby
field :remote_call, RemoteThingType, async: true do
resolve -> (obj, args, ctx) { fetch_from_http(obj.thing_id) }
end

# or ...

field :remote_call, RemoteThingType do
resolve_async -> (obj, args, ctx) { fetch_from_http(obj.thing_id) }
end

# or ...

field :remote_call, RemoteThingType do
resolve -> (obj, args, ctx) { GraphQL.async(fetch_from_http(obj.thing_id)) }
end

# or ...

field :remote_call, RemoteThingType do
resolve GraphQL.async(-> (obj, args, ctx) { fetch_from_http(obj.thing_id) })
end
```

Another important element to this is that it should also support asynchronous connection resolvers. In that case, GraphQL "messes with" the resolver you give it, so we should be careful to support that case too.

What do you think, would this kind of support be valuable in the gem? Are there other factors that should be considered?

Most helpful comment

A big shortcoming in using DeferredExecution "out of the box" is that, if you have 3 busy threads in the queue, and you call "wait" on the first one in the queue, you might leave some of them sitting there, idle, if they actually finish before the one you're waiting for.

I think that could be addressed if the internals used promises, since each promise could be "thenned" to carry on with the rest of the fields.

All 11 comments

cc @dylanahsmith what I was referring to the other day.

Absolutely. I would even say that this is a high priority "Issue".

Please don't try to roll your own concurrency code though. I think that this is one of those areas where it is better to pull in a dependency. concurrent-ruby is a good choice.

:heart: I would love to get rid of the monkey patching in graphql-batch.

Perhaps it's an object that responds to .ready?, .value and .wait.

It would be much better if graphql-ruby didn't use something like .wait at all. I realize that graphql-batch calls .sync, but really that was because I didn't want to add more monkey patching for a purely async use case that we didn't need. However, being able to get a promise result from the query would be better, since it could support the event driven use case. This would also match the way graphql-js works.

the execution strategy maintains a queue of "threads" and resolves them one at a time. Perhaps resolvers could signal to graphql-ruby that they're "working", and graphql-ruby could push the thread to the end of the queue. Then, when it's done with other things, it could revisit that thread and see if it's ready (and, if it's the only thread left, wait for it).

By just providing support for promises on results, then graphql-ruby can just chain the promise result and not have to bake in how the promise is going to be resolved.

I'd like to support this behavior in a general way without requiring any dependencies. I know graphql-batch uses promise.rb, and personally, I'm interested in concurrent-ruby's Future.

I chose promise.rb because it wasn't tied to a threaded or event-driven solution, so we could leverage it for batch loading even in a single threaded implementation. However, I would be fine with duck typing to avoid a dependance on that and/or an adapter to make it work with whatever strategy is used in graphql-ruby.

don't try to roll your own concurrency code ... pull in a dependency

I definitely agree that I don't want a maintenance nightmare and I don't want to get in over my head!

@staugaard , are you familiar enough with concurrent-ruby to imagine how it would work for GraphQL users (for example, a little Ruby example, or a short description of the parts)? I'm curious what that would look like from our perspective.

(I'm not familiar enough with it to do that. My main experience with this stuff in the past was a basic Celluloid-backed approach that pushed everything into a pool, then waited for all of them to be done.)

support the event-driven use case

@dylanahsmith can you elaborate on that a bit?

Admittedly (perhaps embarrassingly), I'm only used to this flow:

  • Client sends HTTP request to Rails
  • Rails makes the client wait while it does stuff
  • Rails sends the response all at once†

In light of that, my main goal is to reduce the amount of time that we make the client wait. In my own case, memory, CPU and Unicorns are basically unlimited (at least they're not the limiting factor).

Can you help me understand where event-driven programming fits in? Is there another client-server flow that it supports, or some other benefit?


† Ok, there's also ActionController::Live for sending responses bit-by-bit, but DeferredExecution already supports that

A big shortcoming in using DeferredExecution "out of the box" is that, if you have 3 busy threads in the queue, and you call "wait" on the first one in the queue, you might leave some of them sitting there, idle, if they actually finish before the one you're waiting for.

I think that could be addressed if the internals used promises, since each promise could be "thenned" to carry on with the rest of the fields.

I noticed Sangria pushed new docs on "Deferred value resolution":

sangria-graphql.org/learn/#deferred-value-resolution

It made me realize that I'm getting two _distinct_ things mixed up in my head:

  • Specifying batched & nested data dependencies (which is what "Deferred value resolution" does, and I think it's also the main use case for graphql-batch)
  • Moving IO to a background thread (relevant for wrapping some high-latency external service, I'm not sure if existing tools support this)

Am I right to think that better support for _dependencies_ is more pressing than threaded value resolution?

Yup, we just care about batching, so that is graphql-batch's main use case. I haven't heard of anyone trying to use it with a parallel or concurrent executor.

The event-driven use case I was talking about would be about using eventmachine and not rails (definitely not unicorn), and would be about better utilizing the machine to provide more throughput. That could reduce latency by being able to processing multiple requests in parallel in the same process rather than having queuing delays from processing them sequentially. Again, no one has actually requested something like that, it would just be easy to support once promises are used internally and would avoid having to know how to wait for the promises/futures to resolve.

Also, instead of having to have the objects respond to certain methods, you could have an async adapter that provides the necessary interface with a default adapter that doesn't everything synchronously. A minimal API for the adapter would allow for chaining promises and rescuing errors. Using an adapter like this would avoid the need to wrap objects returned from resolve procs to conform to the interface you are proposing.

A good example of where it would make sense to use eventmachine is when writing a graphql server that acts a proxy to different backend or external services. In that case you don't want a process or thread tied up when waiting for a result to come back from a slow service, so making the whole query asynchronously would allow more requests to be processed in parallel from the same thread/process.

For example, graphql-js provides this type of asynchronous API by returning a promise as the result of the query. This wasn't the approach taken with graphql-batch mostly because the execution strategy API was expected to return a synchronous result.

A basic implementation of lazy resolution was implemented in #386 (it's based heavily on GraphQL::Batch::ExecutionStrategy). It could serve as a platform for a dead-simple batch loader or for graphql-batch

Still a few things to do before the next release:

  • Add instrumentation hooks for lazy resolution
  • Look for performance improvements (perhaps #395, something like it)

I'm not sure what to do with instrumentation. I'm going to punt. _One_ of the lazy resolves will do a bunch of work, but you can't tell which one, and you can't tell which fields "benefitted" from the work. I think the loaders themselves have to do report their timing etc. (I think this is the conclusion at https://github.com/Shopify/graphql-batch/issues/27 too.)

There are a few more issues to sort out in graphql-batch, but this basic API will be in 1.3.0.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

rmosolgo picture rmosolgo  Â·  4Comments

theodorton picture theodorton  Â·  3Comments

dsgoers picture dsgoers  Â·  3Comments

skanev picture skanev  Â·  3Comments

crice88 picture crice88  Â·  3Comments