We are planning to use Hasura GraphQL in a new project we have just started working on, and have been doing a POC for it. As part of the POC we noticed that when the number of users/sockets reaches around 200, the subscriptions start taking longer. Are there any runtime params I need to pass to improve this? Also, interested in knowing if there is a way to specify
POC :
Hasura: v1.0.0-beta.4, Unix Box A
DB: Unix Box B
Postgres DB - Just 1 table with 28K records
Updating timestamp for all rows every second
Only primary key index
Client: Windows Box C
Spring websocket client
Ramp-up period: add 1 client every 2 seconds
Each client has subscription with a different offset on table
ping time Box C -> Box A: 1 ms
ping time Box C -> Box B: 1 ms
ping time Box B -> Box A: 1 ms
Result:
Average time to receive an update: 2-6 seconds
@vikastomar5983 What's the subscription that you are using in the benchmark? And how are the arguments varied?
We are using the setup below to run the test (this is very similar to the one described here: https://github.com/hasura/graphql-engine/blob/master/architecture/live-queries.md#testing)
Starting the Hasura GraphQL engine with
docker run -d -p ****:**** -e HASURA_GRAPHQL_DATABASE_URL=postgres://***** -e HASURA_GRAPHQL_ENABLE_CONSOLE=true -e HASURA_GRAPHQL_LIVE_QUERIES_FALLBACK_REFETCH_INTERVAL=50 -e HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_REFETCH_INTERVAL=50 -e HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_BATCH_SIZE=1000 -e HASURA_GRAPHQL_PG_CONNECTIONS=200 -e HASURA_GRAPHQL_PG_STRIPES=3 hasura/graphql-engine:v1.0.0-beta.4
subscription { myTable(limit: 40, offset: **RANDOM_FOR_EVERY_SOCKET** ) { update_date_time } }
Hi @vikastomar5983 thanks for the additional context.
graphql-engine optimises subscriptions if you use variables in the subscriptions. You'll need to rewrite this subscription
subscription s {
myTable(limit: 40, offset: **RANDOM_FOR_EVERY_SOCKET** ) {
update_date_time
}
}
as follows:
subscription s($random_id: Int!) {
myTable(limit: 40, where: {id:{_gt: $random_id}}, order_by: {id: desc}) {
update_date_time
}
}
The changes are:
offset shouldn't be used for pagination. Instead, a where clause like the one above should be used.
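To make that concrete, here is a rough sketch (not the actual Spring client from the benchmark) of what the client side looks like when every socket starts the same query text and only the variable value differs, which is what lets graphql-engine multiplex all of the subscriptions into a single polling query. It assumes the `ws` npm package and the Apollo subscriptions-transport-ws protocol that graphql-engine speaks; the host, port, and ramp-up numbers are placeholders.

```typescript
import WebSocket from "ws";

// Same query text for every client; only the $random_id variable differs per socket.
const SUBSCRIPTION = `
  subscription s($random_id: Int!) {
    myTable(limit: 40, where: { id: { _gt: $random_id } }, order_by: { id: desc }) {
      update_date_time
    }
  }
`;

function startClient(randomId: number): void {
  // Host, port and path are placeholders for the graphql-engine endpoint on Box A.
  const socket = new WebSocket("ws://<hasura-host>:<port>/v1/graphql", "graphql-ws");

  socket.on("open", () => {
    socket.send(JSON.stringify({ type: "connection_init", payload: {} }));
  });

  socket.on("message", (raw) => {
    const msg = JSON.parse(raw.toString());
    if (msg.type === "connection_ack") {
      // Start the subscription, passing the per-client value as a variable
      // instead of interpolating it into the query string.
      socket.send(
        JSON.stringify({
          id: "1",
          type: "start",
          payload: { query: SUBSCRIPTION, variables: { random_id: randomId } },
        })
      );
    } else if (msg.type === "data") {
      console.log(`client ${randomId}: ${msg.payload.data.myTable.length} rows at ${Date.now()}`);
    }
  });
}

// Ramp-up similar to the benchmark: one new client every 2 seconds,
// each with a different id cut-off (numbers are illustrative).
for (let i = 0; i < 200; i++) {
  setTimeout(() => startClient(i * 40), i * 2000);
}
```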
About the runtime params:
- HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_REFETCH_INTERVAL=50: such a low value generates a lot of traffic to Postgres (the default is 1000).
- HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_BATCH_SIZE=1000: this is large. Use such a large value only if your Postgres instance is beefy. The default of 100 is good enough.
After the above rewrite of your subscription, start your benchmark with ..MULTIPLEXED_REFETCH_INTERVAL set to 1000 and lower it to 500, then to 200, and then maybe to 100 if you don't find the latencies acceptable.
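For example, a starting point for the benchmark could be the same docker command as above with the defaults for the refetch interval and batch size (port and connection string masked as before; the fallback interval is simply left at its default by omitting the env var):

```sh
docker run -d -p ****:**** \
  -e HASURA_GRAPHQL_DATABASE_URL=postgres://***** \
  -e HASURA_GRAPHQL_ENABLE_CONSOLE=true \
  -e HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_REFETCH_INTERVAL=1000 \
  -e HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_BATCH_SIZE=100 \
  -e HASURA_GRAPHQL_PG_CONNECTIONS=200 \
  -e HASURA_GRAPHQL_PG_STRIPES=3 \
  hasura/graphql-engine:v1.0.0-beta.4
```

Then lower HASURA_GRAPHQL_LIVE_QUERIES_MULTIPLEXED_REFETCH_INTERVAL to 500, 200, and 100 between runs and compare the observed latencies.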
Another script is run to check whether all the updates made to myTable were received as events over the websocket: around 2% of the updates are missing.
graphql-engine does not guarantee that all the updates are propagated. But setting the refetch interval to a lower value makes it less likely that events are missed.
Let us know how it goes after the above changes.
@vikastomar5983 is your problem solved? If so, we can close the issue ��
@marionschleifer With the above changes to the query the performance is much, much better. But we are still missing the in-between updates to the data when the updates are frequent.
@vikastomar5983 Did you find any better workaround? This is a serious blocker for us now.
@rrjanbiah not to step on the Hasura team's toes, but I worked around this sort of problem with great results. Worth noting that a lot of data already has a created_at column with millisecond accuracy.
Basically, my subscription is set to limit the results to 1, sorting by created_at, and only returns the record's id and created_at fields. The client-side code keeps track of the created_at of the last record it received, and, upon getting an update from that subscription, executes a query for all records between the previous created_at and the newly received created_at.
To help with the potential performance hit of rapidly firing off queries like this, I modified our Apollo client's websocket link to detect that specific query and route it over the websocket connection instead of its usual HTTP POST.
This is used for synchronizing points for drawing things in real time to multiple observing clients. Its performance is pretty darn good.
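A minimal sketch of this pattern (illustrative only, not the actual code from the comment above) could look like the following, assuming an Apollo Client instance and a hypothetical points table with a created_at column; adjust names to your own schema:

```typescript
import { ApolloClient, gql } from "@apollo/client";

// The subscription acts as a "ping": it only returns the newest record's id and created_at.
const LATEST_POINT = gql`
  subscription LatestPoint {
    points(order_by: { created_at: desc }, limit: 1) {
      id
      created_at
    }
  }
`;

// On every ping, fetch everything created since the last created_at we processed,
// so intermediate updates that the subscription skipped are not lost.
const POINTS_SINCE = gql`
  query PointsSince($after: timestamptz!, $upto: timestamptz!) {
    points(where: { created_at: { _gt: $after, _lte: $upto } }, order_by: { created_at: asc }) {
      id
      x
      y
      created_at
    }
  }
`;

export function syncPoints(client: ApolloClient<any>, onPoints: (rows: any[]) => void) {
  // created_at of the last record we have fully processed (placeholder initial value).
  let lastSeen = "1970-01-01T00:00:00Z";

  client.subscribe({ query: LATEST_POINT }).subscribe({
    next: async (result) => {
      const newest = result.data?.points?.[0];
      if (!newest || newest.created_at <= lastSeen) return; // nothing new
      const { data } = await client.query({
        query: POINTS_SINCE,
        variables: { after: lastSeen, upto: newest.created_at },
        fetchPolicy: "network-only",
      });
      lastSeen = newest.created_at;
      onPoints(data.points);
    },
  });
}
```

The comment's additional optimisation of detecting that specific query in the Apollo link chain and sending it over the existing websocket instead of an HTTP POST is omitted here for brevity.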
@brandonpapworth Thanks for sharing, much appreciated!
Basically, my subscription is set to limit the results to 1, sorting by created_at, and only returns the record's id and created_at fields. The client-side code keeps track of the created_at of the last record it received, and, upon getting an update from that subscription, executes a query for all records between the previous created_at and the newly received created_at.
I have the same issue (#3517) and this is also the workaround I am thinking about. When I get something from the subscription I only consider it as a "ping" and do an actual query to find the updates since the last record I received.
It may be worth putting such a function into an npm library of its own.