No downtime
≈2% downtime
npm installnpm install -g artillerypm2 --wait-ready --restart-delay 1000 start index.js -i maxwhile true; do pm2 --wait-ready --restart-delay 1000 gracefulReload index; sleep 1; doneartillery quick --duration 60 --rate 10 -n 20 http://localhost:3000Around 2% of request sent by artillery fail with ECONNRESET or ETIMEDOUT, while 100% of request succeed while not graceful reloading.
OS : Mac OS X 10.11
node.js : v7.5.0
PM2 : 2.4.0
2017-02-21 14:47:57: pid=92140 msg=process killed
2017-02-21 14:47:57: Starting execution sequence in -cluster mode- for app name:index id:0
2017-02-21 14:47:57: pid=92141 msg=process killed
2017-02-21 14:47:57: Starting execution sequence in -cluster mode- for app name:index id:1
2017-02-21 14:47:57: App name:index id:1 online
2017-02-21 14:47:57: (node:83735) DeprecationWarning: worker.suicide is deprecated. Please use worker.exitedAfterDisconnect.
2017-02-21 14:47:57: Stopping app:index id:2
2017-02-21 14:47:57: App name:index id:0 online
2017-02-21 14:47:57: (node:83735) DeprecationWarning: worker.suicide is deprecated. Please use worker.exitedAfterDisconnect.
2017-02-21 14:47:57: Stopping app:index id:3
2017-02-21 14:47:57: App name:index id:2 disconnected
2017-02-21 14:47:57: App [index] with id [2] and pid [92144], exited with code [0] via signal [SIGINT]
2017-02-21 14:47:57: App name:index id:3 disconnected
2017-02-21 14:47:57: App [index] with id [3] and pid [92145], exited with code [0] via signal [SIGINT]
2017-02-21 14:47:57: pid=92144 msg=process killed
2017-02-21 14:47:57: Starting execution sequence in -cluster mode- for app name:index id:2
2017-02-21 14:47:57: pid=92145 msg=process killed
2017-02-21 14:47:57: Starting execution sequence in -cluster mode- for app name:index id:3
2017-02-21 14:47:57: App name:index id:2 online
2017-02-21 14:47:57: (node:83735) DeprecationWarning: worker.suicide is deprecated. Please use worker.exitedAfterDisconnect.
2017-02-21 14:47:57: Stopping app:index id:4
2017-02-21 14:47:57: App name:index id:3 online
2017-02-21 14:47:57: (node:83735) DeprecationWarning: worker.suicide is deprecated. Please use worker.exitedAfterDisconnect.
2017-02-21 14:47:57: Stopping app:index id:5
2017-02-21 14:47:57: App name:index id:4 disconnected
2017-02-21 14:47:57: App [index] with id [4] and pid [92148], exited with code [0] via signal [SIGINT]
2017-02-21 14:47:57: App name:index id:5 disconnected
2017-02-21 14:47:57: App [index] with id [5] and pid [92149], exited with code [0] via signal [SIGINT]
2017-02-21 14:47:57: pid=92148 msg=process killed
2017-02-21 14:47:57: Starting execution sequence in -cluster mode- for app name:index id:4
2017-02-21 14:47:57: pid=92149 msg=process killed
2017-02-21 14:47:57: Starting execution sequence in -cluster mode- for app name:index id:5
2017-02-21 14:47:57: App name:index id:4 online
2017-02-21 14:47:57: (node:83735) DeprecationWarning: worker.suicide is deprecated. Please use worker.exitedAfterDisconnect.
2017-02-21 14:47:57: Stopping app:index id:6
2017-02-21 14:47:57: App name:index id:5 online
2017-02-21 14:47:57: (node:83735) DeprecationWarning: worker.suicide is deprecated. Please use worker.exitedAfterDisconnect.
2017-02-21 14:47:57: Stopping app:index id:7
2017-02-21 14:47:57: App name:index id:6 disconnected
2017-02-21 14:47:57: App [index] with id [6] and pid [92152], exited with code [0] via signal [SIGINT]
2017-02-21 14:47:57: App name:index id:7 disconnected
2017-02-21 14:47:57: App [index] with id [7] and pid [92153], exited with code [0] via signal [SIGINT]
2017-02-21 14:47:57: pid=92152 msg=process killed
2017-02-21 14:47:57: Starting execution sequence in -cluster mode- for app name:index id:6
2017-02-21 14:47:57: pid=92153 msg=process killed
2017-02-21 14:47:57: Starting execution sequence in -cluster mode- for app name:index id:7
2017-02-21 14:47:57: App name:index id:6 online
2017-02-21 14:47:57: (node:83735) DeprecationWarning: worker.suicide is deprecated. Please use worker.exitedAfterDisconnect.
2017-02-21 14:47:57: App name:index id:7 online
2017-02-21 14:47:57: (node:83735) DeprecationWarning: worker.suicide is deprecated. Please use worker.exitedAfterDisconnect.
cc @vmarchaud
Something to start with is that you don't have the good exit strategy (see my code example in #2573), you need to listen for SIGINT to be able to ask to the http server to close the connections.
PS : Next time please make search before opening an issue please (it avoid spending time to re-answer to question that are already answered), thanks !
Ah good spot! I've added a listen on SIGINT and rerun commands, but I'm still seeing around 2% downtime on the example app.
This seems to be much higher (around 36% downtime) on a larger/slower apps that makes API calls and are running over HTTPS etc.
You need to take care of PM2_LISTEN_TIMEOUT for slower apps, please read the whole #2573 issue.
You will read that ECONRESET are inherent of the TCP stack implementation and we can't control them with nodejs. The main metric for downtine are the read/write socket error that must be at 0 with reloading.
@vmarchaud thanks for that! 👍 Setting PM2_GRACEFUL_TIMEOUT and PM2_KILL_TIMEOUT has brought down the chance of getting an error to 0.8%, which I think is like you say to do with the TCP stack.