Graphql-engine: Event trigger webhooks are not run concurrently

Created on 24 Jun 2020 · 5 Comments · Source: hasura/graphql-engine

I have an insert event trigger on a table. It calls a webhook whose response takes 10 s. The webhooks are currently called sequentially, but I want them to be called concurrently. How can I do that?

I am not sure whether this behavior is a bug in Hasura/Heroku or a Hasura feature, or how to change it.

Steps to reproduce:

  1. Start the webhook https://github.com/MichalKalita/test-delayed-webhook/ or use the version deployed on Heroku, https://protected-castle-19380.herokuapp.com/. This webhook just waits 10 seconds and responds with status ok.
  2. Set up an insert event trigger that calls the webhook.
  3. Insert multiple rows to trigger the webhook.
  4. Look at the pending events in the Hasura console.
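
For anyone who wants to reproduce this without the linked repository, here is a minimal sketch of a delayed webhook. It is not the code from test-delayed-webhook; the names `make_handler` and `make_server`, the port, and the JSON body are assumptions chosen for illustration. It uses a threading server so that the webhook itself can handle overlapping requests and is not the bottleneck.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

DELAY_SECONDS = 10  # the delay described in step 1


def make_handler(delay):
    """Build a request handler that waits `delay` seconds before answering."""

    class DelayedHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read and discard the event payload sent by the caller.
            length = int(self.headers.get("Content-Length", 0))
            self.rfile.read(length)
            # Simulate a slow webhook.
            time.sleep(delay)
            body = json.dumps({"status": "ok"}).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            pass  # keep the console quiet

    return DelayedHandler


def make_server(host="0.0.0.0", port=8080, delay=DELAY_SECONDS):
    # ThreadingHTTPServer handles each request in its own thread,
    # so several slow requests can be in flight at once.
    return ThreadingHTTPServer((host, port), make_handler(delay))
```

Calling `make_server().serve_forever()` starts it; the event trigger would then point at whatever address makes this host reachable from the Hasura container.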

Actual behavior:
After 10 s, only one pending event is done. The webhooks are not running concurrently.

Expected behavior:
Run all webhooks at the same time. All should be done in 10 seconds.
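
To make the difference between the actual and expected timing concrete, here is a small sketch that compares sequential and concurrent delivery of the same calls. It only illustrates the timing and says nothing about how Hasura delivers events internally; `fake_webhook`, `deliver_sequentially`, and `deliver_concurrently` are hypothetical names, and a thread pool is just one way to get overlap.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_webhook(delay):
    """Stand-in for one webhook call that takes `delay` seconds."""
    time.sleep(delay)
    return "ok"


def deliver_sequentially(n_events, delay):
    """Call the webhook once per event, one after another."""
    start = time.monotonic()
    results = [fake_webhook(delay) for _ in range(n_events)]
    return results, time.monotonic() - start


def deliver_concurrently(n_events, delay):
    """Call the webhook for all events at once, one thread per event."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n_events) as pool:
        results = list(pool.map(fake_webhook, [delay] * n_events))
    return results, time.monotonic() - start
```

With a 10-second webhook and N inserted rows, the sequential version takes roughly N × 10 seconds (the behavior reported above), while the concurrent version takes roughly 10 seconds in total (the expected behavior).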

Can be reproduced on v1.3.0-beta.1, v1.3.0-beta.2

It cannot be reproduced on v1.2.2

server bug

All 5 comments

We're running into this issue as well on the latest builds (including v1.3.0-beta.4). It's crushing performance and it's a blocking issue for us, as our application counts on events to be processed in a timely manner. We weren't running into it before, so it seems like a regression.

Hasura is running in a docker container, so it would be surprising for this to be a platform issue, but let me know if you need more specifics about our environment.

@MichalKalita @mousetraps Thank you so much for reporting this regression.
This bug was introduced in v1.3.0-beta.1, as you folks point out, due to a faulty refactor of the code.
I have raised a PR which will fix this issue.

@codingkarthik I've seen the PR (https://github.com/hasura/graphql-engine/pull/5352) where you solve this regression. We're currently experiencing a bunch of webhook executions on production (on version 1.3.0), and I'm wondering if it is related to this.

We're seeing a lot of errors like this one and I'm wondering what could be causing them:

2020-11-14 22:00:49 UTC:xxxxx.compute-1.amazonaws.com(34640):xxx@xxx:[14246]:ERROR: could not serialize access due to concurrent update
2020-11-14 22:00:49 UTC:xxxxx.compute-1.amazonaws.com(34640):xxx@xxx:[14246]:STATEMENT:
UPDATE hdb_catalog.event_log
SET locked = 't'
WHERE id IN ( SELECT l.id
FROM hdb_catalog.event_log l
WHERE l.delivered = 'f' and l.error = 'f' and l.locked = 'f'
and (l.next_retry_at is NULL or l.next_retry_at <= now())
and l.archived = 'f'
ORDER BY created_at
LIMIT $1
FOR UPDATE SKIP LOCKED )
RETURNING id, schema_name, table_name, trigger_name, payload::json, tries, created_at

  1. Does Hasura have a limit on the number of webhook executions it can run concurrently?
  2. Since it seems to throw an error while locking a trigger, does that mean more than one Hasura instance is trying to lock the same webhook at the same time?

Note: Our Hasura setup is running multiple instances at the same time to provide HA.

@cusspvz The issue you're facing is a different one, which happens when you're running event triggers on multiple instances of Hasura. We have fixed this in v1.3.3, which was released yesterday. Do let us know if you continue to face the same issue with the newer version.

@codingkarthik I will update it and let you know. Thanks for that quick answer @codingkarthik !
