Lisk-sdk: Handle application and process restart

Created on 2 Apr 2019  路  3Comments  路  Source: LiskHQ/lisk-sdk

Expected behavior

Whenever a process is restarted it should be handled without failures. Also, throw an error for failing to handle the process.

Actual behavior

[inf] 2019-04-02 06:28:00 | Verify->verifyBlock succeeded for block 18359457909549872079 at height 7903574.
[inf] 2019-04-02 06:28:03 | Releasing enqueued broadcasts
[inf] 2019-04-02 06:28:03 | Queue empty
[inf] 2019-04-02 06:28:08 | Releasing enqueued broadcasts
[inf] 2019-04-02 06:28:08 | Queue empty
[inf] 2019-04-02 06:28:09 | Cleanup controller...
[inf] 2019-04-02 06:28:09 | Cleaning chain...
[inf] 2019-04-02 06:28:09 | Disconnecting
[inf] 2019-04-02 06:28:09 | Cleaning HTTP API...
[inf] 2019-04-02 06:28:09 | Disconnecting
[inf] 2019-04-02 06:28:37 | Starting the app - testnet-7000 - Lisk Framework(0.1.0)
[inf] 2019-04-02 06:28:37 | Initializing controller
[inf] 2019-04-02 06:28:37 | Loading controller
[ERR] 2019-04-02 06:28:37 | An instance of application "testnet-7000" is already running. You have to change application name to run another instance.
[inf] 2019-04-02 06:28:54 | Starting the app - testnet-7000 - Lisk Framework(0.1.0)
[inf] 2019-04-02 06:28:54 | Initializing controller
[inf] 2019-04-02 06:28:54 | Loading controller
[ERR] 2019-04-02 06:28:54 | An instance of application "testnet-7000" is already running. You have to change application name to run another instance.
[inf] 2019-04-02 06:29:12 | Starting the app - testnet-7000 - Lisk Framework(0.1.0)
[inf] 2019-04-02 06:29:12 | Initializing controller

Process failed and it wasn't handled properly.

Steps to reproduce

This happened on a upgraded delegate on testnet, cause of issue is known, as there were no logs

Which version(s) does this affect? (Environment, OS, etc...)

1.6.0, Docker

bug

Most helpful comment

The other workaround would be to add the pid tag in the log message to identify if log entries were done from different app instances. That will be a breaking change in logs, so to be done in major release.

All 3 comments

This feature is working fine as per expected the issue just with the file logging.

When we run the other instance of the app and see the console logs we clear and handled error response.

[inf] 2019-04-04 09:17:36 | Starting the app - devnet-4000 - Lisk Framework(0.1.0)
[inf] 2019-04-04 09:17:36 | Initializing controller
[inf] 2019-04-04 09:17:36 | Loading controller
[ERR] 2019-04-04 09:17:37 | An instance of application "devnet-4000" is already running. You have to change application name to run another instance.
[ERR] 2019-04-04 09:17:37 | App stopped with error - Duplicate app instance for "devnet-4000"
[inf] 2019-04-04 09:17:37 | Cleanup controller...
[ERR] 2019-04-04 09:17:37 | Caused error during cleanup - {}

But because each started instance of the app use the same log file, so messages from different instances mixed up.

[dbg] 2019-04-04 09:17:36 | Block already forged for the current slot
[inf] 2019-04-04 09:17:37 | Releasing enqueued broadcasts
[inf] 2019-04-04 09:17:37 | Queue empty
[ERR] 2019-04-04 09:17:37 | An instance of application "devnet-4000" is already running. You have to change application name to run another instance.
[dbg] 2019-04-04 09:17:37 | Broadhash consensus: 0 %
[dbg] 2019-04-04 09:17:37 | Transaction pool size: 0
[dbg] 2019-04-04 09:17:37 | Block already forged for the current slot

The only thing that need to be debugged is one missing error log.

https://github.com/LiskHQ/lisk/blob/f2fce700c164e915621071e7726e40eb3c460594/src/index.js#L45-L51

This error message is logged in console but missed in the file log we need to investigate it further.

The other workaround would be to add the pid tag in the log message to identify if log entries were done from different app instances. That will be a breaking change in logs, so to be done in major release.

Possible solution by adding a check to PID number which needs to be different when restarting:

    async _validatePidFile() {
        const pidPath = `${this.config.dirs.pids}/controller.pid`;
        const pidExists = await fs.pathExists(pidPath);
        if (pidExists) {
            const pid = parseInt(await fs.readFile(pidPath));
            const pidRunning = await isPidRunning(pid);

            this.logger.info(`Old PID: ${pid}`);
            this.logger.info(`Current PID: ${process.pid}`);

            if (pidRunning && pid !== process.pid) {
                this.logger.error(
                    `An instance of application "${
                        this.appLabel
                    }" is already running. You have to change application name to run another instance.`
                );
                throw new DuplicateAppInstanceError(this.appLabel, pidPath);
            }
        }
        await fs.writeFile(pidPath, process.pid);
    }
Was this page helpful?
0 / 5 - 0 ratings

Related issues

Nazgolze picture Nazgolze  路  3Comments

Isabello picture Isabello  路  4Comments

karek314 picture karek314  路  3Comments

yatki picture yatki  路  3Comments

ManuGowda picture ManuGowda  路  3Comments