Lisk-sdk: Node should terminate when Postgres server stops

Created on 23 Mar 2018 · 5 Comments · Source: LiskHQ/lisk-sdk

Expected behavior

The node must terminate when it detects that the PSQL server has stopped.

Actual behavior

The node stays alive after PSQL stops, and recovers properly when the PSQL server restarts.
The node also keeps listening for API calls, such as posting a transaction. Fortunately, these produce a 409 Conflict error.

Steps to reproduce

bash lisk.sh stop_db

Which version(s) does this affect? (Environment, OS, etc...)

v1.0.0-alpha1

bug

All 5 comments

I believe the node instance should not die. Instead, it should switch to failover behaviour if any system-level service it depends on stops working. E.g. if Redis stops working, it should just stop using the cache. If Postgres stops, it should go into failover mode: it stops producing API responses, but it still accepts transactions and waits for Postgres to come back online.

This is not just my opinion; it is standard behavior I have found across many applications in different domains and
technologies. Applications don't die when a dependent service fails.

@nazarhussain makes a good point; we should put the Node.js app in some kind of suspended/failover state until Postgres comes back up. Otherwise, the node will keep rebooting while Postgres is down and clients/peers won't know what's happening.

The error event makes this possible. You can check the error code to see whether it represents a connectivity error, and then act on it:

  • You can kill the process right away
  • You can trigger internal reconnection attempts, and if unsuccessful, then kill the process.
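The error-code check described above might look roughly like the sketch below. It assumes the error object carries a string `code` property holding either a Node.js socket error code or a PostgreSQL SQLSTATE; the helper name `isConnectivityError` and the exact set of codes treated as connectivity failures are illustrative choices, not something taken from lisk-sdk.

```typescript
// Hypothetical helper, not part of lisk-sdk: decide whether an error
// looks like a lost database connection rather than a query failure.
interface DbError {
  code?: string; // assumed to hold a Node.js error code or a SQLSTATE
}

// Node.js socket-level error codes that indicate a broken connection.
const SOCKET_CODES = ["ECONNREFUSED", "ECONNRESET", "EPIPE", "ETIMEDOUT"];

// PostgreSQL SQLSTATEs raised when the server is shutting down or starting:
// admin_shutdown, crash_shutdown, cannot_connect_now.
const SHUTDOWN_CODES = ["57P01", "57P02", "57P03"];

function isConnectivityError(err: DbError): boolean {
  const code = err.code;
  if (code === undefined) {
    return false;
  }
  if (SOCKET_CODES.indexOf(code) !== -1 || SHUTDOWN_CODES.indexOf(code) !== -1) {
    return true;
  }
  // SQLSTATE class 08 is "connection exception" (e.g. 08006 connection_failure).
  return code.length === 5 && code.slice(0, 2) === "08";
}
```

An error handler could call this first, and only kill the process or start reconnecting when it returns true, so that ordinary query errors (for example a unique-constraint violation) are left alone.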

What strategy do we want? If we can formulate a clear strategy, I can implement it.

It would be nice to look into #1788 before this one.

"You can trigger internal reconnection attempts, and if unsuccessful, then kill the process"

I think that sounds like the best/simplest approach given our time constraints. Adding a failover state to answer HTTP requests with a 500 and freezing transaction processing is more work and will probably break many tests.
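A minimal sketch of the "retry, then kill the process" strategy, to make the proposal concrete. Every name here (`ReconnectOptions`, `reconnectOrGiveUp`, `probe`, `sleep`, `onGiveUp`) is hypothetical, and the probe, delay, and shutdown action are injected by the caller so that the sketch makes no assumption about how lisk-sdk talks to Postgres.

```typescript
// Hypothetical sketch, not lisk-sdk code: bounded reconnection attempts,
// followed by a give-up action when the database never comes back.
interface ReconnectOptions {
  maxAttempts: number;                  // how many probes before giving up
  delayMs: number;                      // pause between failed probes
  probe: () => Promise<boolean>;        // resolves true when the database answers
  sleep: (ms: number) => Promise<void>; // injected so callers control timing
  onGiveUp: () => void;                 // e.g. exit the process in a real node
}

async function reconnectOrGiveUp(opts: ReconnectOptions): Promise<boolean> {
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    let ok = false;
    try {
      ok = await opts.probe();
    } catch (e) {
      ok = false; // a throwing probe counts as a failed attempt
    }
    if (ok) {
      return true; // connection is back; the node can keep running
    }
    if (attempt < opts.maxAttempts) {
      await opts.sleep(opts.delayMs); // wait before the next attempt
    }
  }
  opts.onGiveUp(); // all attempts failed
  return false;
}
```

In a running node, `probe` could issue a trivial query, `sleep` could wrap `setTimeout`, and `onGiveUp` could call `process.exit(1)`. Keeping those as parameters also lets tests exercise the retry logic without a real database.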
