When serializing a large response, the event loop can be blocked for a long time. Ideally, in cases like this the response could be constructed in an async-friendly manner so that other operations can continue.
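For illustration, here's a minimal sketch that makes the blocking visible (the payload shape is made up): an interval that should tick every 100ms goes silent while `JSON.stringify` runs.

```js
// Sketch: observe the event loop stalling during JSON.stringify.
// The payload here is hypothetical; real responses come from resolvers.
const big = {
  items: Array.from({ length: 500_000 }, (_, i) => ({ id: i, name: `item-${i}` })),
};

const interval = setInterval(() => console.log('tick', Date.now()), 100);

setTimeout(() => {
  const start = process.hrtime.bigint();
  const body = JSON.stringify(big);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`stringify produced ${body.length} bytes, blocking for ~${ms.toFixed(0)} ms`);
  // Let a couple of delayed ticks print (bunched up after the stall), then exit.
  setTimeout(() => clearInterval(interval), 300);
}, 500); // let a few ticks land first so the gap is visible
```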
In particular, when using Apollo Tracing, response payload sizes go up quite dramatically because the payload includes a ton of timing information for profiling purposes.
I'd be happy to re-open this issue with some additional details, but there's nothing super actionable in it at the moment (and it actually touches on two separate issues, so it might be good to break those out separately if this is still a concern).
One solution for this might be @defer support, which originally surfaced in #1287. I'll close this, but I'm happy to re-open if you can provide some additional details, particularly any which demonstrate what precisely is blocking. Thanks for opening this originally!
Hi @abernix
From time to time our query payloads were >1MB of JSON, which takes a fair amount of time to convert to a string using JSON.stringify and can block the process for quite a while. In those cases it would be nice if the code just walked the tree and generated and emitted the JSON in chunks.
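A minimal sketch of that idea (names are hypothetical, and it ignores `undefined`, `toJSON`, and cycles): walk the tree recursively, write pieces to any sink such as `res.write`, and periodically yield back to the event loop.

```js
// Sketch: emit JSON piece by piece instead of one giant JSON.stringify call.
// `write` is any sink, e.g. res.write.bind(res). Simplified: no undefined,
// toJSON, or cycle handling.
const yieldToLoop = () => new Promise((resolve) => setImmediate(resolve));

async function writeJsonInChunks(value, write, state = { count: 0 }) {
  // Yield every ~1000 emitted values so other work can run in between.
  if (++state.count % 1000 === 0) await yieldToLoop();

  if (Array.isArray(value)) {
    write('[');
    for (let i = 0; i < value.length; i++) {
      if (i > 0) write(',');
      await writeJsonInChunks(value[i], write, state);
    }
    write(']');
  } else if (value !== null && typeof value === 'object') {
    write('{');
    const keys = Object.keys(value);
    for (let i = 0; i < keys.length; i++) {
      if (i > 0) write(',');
      write(JSON.stringify(keys[i]) + ':');
      await writeJsonInChunks(value[keys[i]], write, state);
    }
    write('}');
  } else {
    write(JSON.stringify(value)); // primitives and null are cheap to stringify
  }
}
```

No single synchronous step dominates the loop this way; the trade-off is many more function calls and writes, so total CPU time goes up even though latency for other requests goes down.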
One could also imagine streaming out the JSON in chunks as it is resolved by the resolvers. So the initial JSON { can be written, then as fields are resolved their results can be written out in the order they arrive.
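Roughly like this (a hand-wavy sketch, not how graphql-js actually executes; `fieldPromises` is a hypothetical map of top-level field name to resolver promise):

```js
// Sketch: write the envelope, then each top-level field as its promise settles.
// Assumes all promises resolve; real code would need per-field error handling
// to populate the GraphQL "errors" array.
async function streamFieldsAsResolved(res, fieldPromises) {
  res.setHeader('Content-Type', 'application/json');
  res.write('{"data":{');
  let wroteFirst = false;
  await Promise.all(
    Object.entries(fieldPromises).map(([name, promise]) =>
      promise.then((value) => {
        // .then callbacks run one at a time on the loop, so this flag is safe.
        res.write(`${wroteFirst ? ',' : ''}${JSON.stringify(name)}:${JSON.stringify(value)}`);
        wroteFirst = true;
      })
    )
  );
  res.write('}}');
  res.end();
}
```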
However, there's a reasonable argument that this is too much complexity for the GraphQL stuff to handle. We can use some other transport for large chunks of data or use pagination. After all, when the client receives the giant payload it, too, will pause for a long time parsing it.
It seems better to find ways to avoid the large payloads altogether, which is what we have been doing.
So, this issue is not likely a big priority for you guys.
We are also suffering from this.
We have a use case where a user needs to export a lot of JSON data (many thousands of objects), served by the GraphQL API.
It is totally acceptable for this query to take a long time to complete, but we didn't expect this large response to block the entire event loop of our Node server.
Worse than blocking everything, it also blocks the container's health check endpoint (defined via the onHealthCheck option of Apollo Server's .applyMiddleware function). So while the container is handling this large response, the health check endpoint times out and the container gets restarted.
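For reference, our setup looks roughly like this (a sketch against apollo-server-express 2.x; the schema here is made up):

```js
const express = require('express');
const { ApolloServer, gql } = require('apollo-server-express');

// Hypothetical schema just to make the sketch self-contained.
const typeDefs = gql`type Query { hello: String }`;
const resolvers = { Query: { hello: () => 'world' } };

const server = new ApolloServer({ typeDefs, resolvers });
const app = express();

server.applyMiddleware({
  app,
  // Resolving means healthy, but this callback can't even run while the
  // event loop is pinned by a giant JSON.stringify, so the probe times out.
  onHealthCheck: () => Promise.resolve(),
});

app.listen(4000);
```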
I know I could work around this by raising the health check timeout, but the real culprit is the blocking serialization of large responses. If only we could stream the response (you said @defer could work, is that right, @abernix?).