Egg: Production deployment following the docs keeps failing with this error. How do I fix it?

Created on 22 Jan 2018  ·  44Comments  ·  Source: eggjs/egg

[egg-scripts] Wait Start: 1...
[egg-scripts] Wait Start: 2...
[egg-scripts] Wait Start: 3...
[egg-scripts] Wait Start: 4...
[egg-scripts] Wait Start: 5...
[egg-scripts] Wait Start: 6...
[egg-scripts] Wait Start: 7...
[egg-scripts] Wait Start: 8...
[egg-scripts] Wait Start: 9...
[egg-scripts] Wait Start: 10...
[egg-scripts] Wait Start: 11...
[egg-scripts] Wait Start: 12...
[egg-scripts] Wait Start: 13...
[egg-scripts] Wait Start: 14...
[egg-scripts] Wait Start: 15...
[egg-scripts] Wait Start: 16...
[egg-scripts] Wait Start: 17...
[egg-scripts] Wait Start: 18...
[egg-scripts] Wait Start: 19...
[egg-scripts] Wait Start: 20...
[egg-scripts] Wait Start: 21...
[egg-scripts] Wait Start: 22...
[egg-scripts] Wait Start: 23...
[egg-scripts] Wait Start: 24...
[egg-scripts] Wait Start: 25...
[egg-scripts] Wait Start: 26...
[egg-scripts] Wait Start: 27...
[egg-scripts] Wait Start: 28...
[egg-scripts] Wait Start: 29...
[egg-scripts] Wait Start: 30...
[egg-scripts] Wait Start: 31...
[egg-scripts] Wait Start: 32...
[egg-scripts] Wait Start: 33...
[egg-scripts] Wait Start: 34...
[egg-scripts] Wait Start: 35...
[egg-scripts] 2018-01-22 11:43:51,078 ERROR 838 [-/127.0.0.1/-/0ms GET /] nodejs.Error: [ClusterClient] leader does not be active in 30000ms on port:41217
[egg-scripts] at Function.waitFor (/website/production/source/node_modules/cluster-client/lib/server.js:239:15)
[egg-scripts] at waitFor.next ()
[egg-scripts] at onFulfilled (/website/production/source/node_modules/co/index.js:65:19)
[egg-scripts] at
[egg-scripts] at process._tickCallback (internal/process/next_tick.js:188:7)

question

All 44 comments

Startup timed out. Could a network problem be keeping a plugin (such as mysql), or your own beforeStart, from ever returning?

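The full error output is below. Running npm run dev locally works fine, but npm start fails every time on the production server. It is also strange that the error mentions port 44898: I listen on port 3000, and my firewall only lets the test port 3000 through.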

2018-01-22 11:10:08,950 ERROR 458 [-/127.0.0.1/-/1ms GET /] nodejs.Error: [ClusterClient] leader does not be active in 30000ms on port:44898
[egg-scripts] at Function.waitFor (/website/production/source/node_modules/cluster-client/lib/server.js:239:15)
[egg-scripts] at waitFor.next ()
[egg-scripts] at onFulfilled (/website/production/source/node_modules/co/index.js:65:19)
[egg-scripts] at
[egg-scripts] at process._tickCallback (internal/process/next_tick.js:188:7)
[egg-scripts]
[egg-scripts] pid: 458
[egg-scripts]
[egg-scripts] 2018-01-22 11:10:08,951 ERROR 458 nodejs.Error: [ClusterClient] leader does not be active in 30000ms on port:44898
[egg-scripts] at Function.waitFor (/website/production/source/node_modules/cluster-client/lib/server.js:239:15)
[egg-scripts] at waitFor.next ()
[egg-scripts] at onFulfilled (/website/production/source/node_modules/co/index.js:65:19)
[egg-scripts] at
[egg-scripts] at process._tickCallback (internal/process/next_tick.js:188:7)
[egg-scripts]
[egg-scripts] pid: 458
[egg-scripts]
[egg-scripts] 2018-01-22 11:10:08,951 ERROR 458 [app_worker] start error, exiting with code:1
[egg-scripts] 2018-01-22 11:10:08,951 ERROR 458 [app_worker] exit with code:1
[egg-scripts] 2018-01-22 11:10:08,956 ERROR 442 nodejs.AppWorkerDiedError: [master] app_worker#1:458 died (code: 1, signal: null, suicide: false, state: dead), current workers: []
[egg-scripts] at Master.onAppExit (/www/website/production/source/node_modules/egg-cluster/lib/master.js:384:21)
[egg-scripts] at emitOne (events.js:115:13)
[egg-scripts] at Master.emit (events.js:210:7)
[egg-scripts] at Messenger.sendToMaster (/website/production/source/node_modules/egg-cluster/lib/utils/messenger.js:122:17)
[egg-scripts] at Messenger.send (/website/production/source/node_modules/egg-cluster/lib/utils/messenger.js:87:12)
[egg-scripts] at EventEmitter.cluster.on (/website/production/source/node_modules/egg-cluster/lib/master.js:263:22)
[egg-scripts] at emitThree (events.js:140:20)
[egg-scripts] at EventEmitter.emit (events.js:216:7)
[egg-scripts] at ChildProcess.worker.process.once (internal/cluster/master.js:185:13)
[egg-scripts] at Object.onceWrapper (events.js:318:30)
[egg-scripts] name: 'AppWorkerDiedError'
[egg-scripts] pid: 442
[egg-scripts]
[egg-scripts] 2018-01-22 11:10:08,958 ERROR 442 [master] app_worker#1:458 start fail, exiting with code:1
[egg-scripts] 2018-01-22 11:10:08,959 ERROR 442 [master] exit with code:1
[egg-scripts] 2018-01-22 11:10:08,961 ERROR 448 [agent_worker] receive disconnect event on child_process fork mode, exiting with code:110
[egg-scripts] 2018-01-22 11:10:08,961 ERROR 448 [agent_worker] exit with code:110

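That timeout comes from ClusterClient's internal communication. My current guess is that a worker is doing something CPU-intensive during startup, which makes startup time out. Check your code, especially places like beforeStart.

Alternatively, provide a minimal reproducible repository. The error output so far is not enough to diagnose this.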

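I hit the same problem. I have not written any beforeStart code, and I also tried creating a brand-new project and running start directly, with the same result. My server is on Alibaba Cloud with a single-core CPU. Could the configuration be the problem?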

Try adding the --workers= option. On some cloud services Node has a bug in the CPU count it returns (cpu.length).
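To see what Node reports on a given machine, a quick check like the following can help. It is a minimal sketch: `pickWorkerCount` is a hypothetical helper written for this illustration, and it assumes the CPU count mentioned above is what `os.cpus().length` returns.

```javascript
const os = require('os');

// Hypothetical helper: choose a worker count, falling back to 1 when the
// reported CPU count is missing, zero, or otherwise unusable.
function pickWorkerCount(reportedCpus = os.cpus().length) {
  return Number.isInteger(reportedCpus) && reportedCpus > 0 ? reportedCpus : 1;
}

console.log(`os.cpus().length = ${os.cpus().length}`);
console.log(`workers to request = ${pickWorkerCount()}`);
```

The resulting number can then be passed explicitly, for example `egg-scripts start --workers=1 --daemon`, which is the command that appears in the log further down this thread.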

[egg-scripts] Wait Start: 35...
[egg-scripts] 2018-01-23 16:12:03,863 ERROR 3941 [-/127.0.0.1/-/0ms GET /] nodejs.Error: [ClusterClient] leader does not be active in 30000ms on port:45728
[egg-scripts] at Function.waitFor (/root/egg/node_modules/[email protected]@cluster-client/lib/server.js:239:15)
[egg-scripts] at waitFor.next ()
[egg-scripts] at onFulfilled (/root/egg/node_modules/[email protected]@co/index.js:65:19)
[egg-scripts] at
[egg-scripts] at process._tickCallback (internal/process/next_tick.js:160:7)
[egg-scripts]
[egg-scripts] pid: 3941
[egg-scripts] hostname: iZ2ze76p2wnibfx0j7dslmZ
[egg-scripts]
[egg-scripts] 2018-01-23 16:12:03,863 ERROR 3941 nodejs.Error: [ClusterClient] leader does not be active in 30000ms on port:45728
[egg-scripts] at Function.waitFor (/root/egg/node_modules/[email protected]@cluster-client/lib/server.js:239:15)
[egg-scripts] at waitFor.next ()
[egg-scripts] at onFulfilled (/root/egg/node_modules/[email protected]@co/index.js:65:19)
[egg-scripts] at
[egg-scripts] at process._tickCallback (internal/process/next_tick.js:160:7)
[egg-scripts]
[egg-scripts] pid: 3941
[egg-scripts] hostname: iZ2ze76p2wnibfx0j7dslmZ
[egg-scripts]
[egg-scripts] 2018-01-23 16:12:03,863 ERROR 3941 [app_worker] start error, exiting with code:1
[egg-scripts] 2018-01-23 16:12:03,863 ERROR 3941 [app_worker] exit with code:1
[egg-scripts] 2018-01-23 16:12:03,869 ERROR 3925 nodejs.AppWorkerDiedError: [master] app_worker#1:3941 died (code: 1, signal: null, suicide: false, state: dead), current workers: []
[egg-scripts] at Master.onAppExit (/root/egg/node_modules/[email protected]@egg-cluster/lib/master.js:384:21)
[egg-scripts] at Master.emit (events.js:159:13)
[egg-scripts] at Messenger.sendToMaster (/root/egg/node_modules/[email protected]@egg-cluster/lib/utils/messenger.js:122:17)
[egg-scripts] at Messenger.send (/root/egg/node_modules/[email protected]@egg-cluster/lib/utils/messenger.js:87:12)
[egg-scripts] at EventEmitter.cluster.on (/root/egg/node_modules/[email protected]@egg-cluster/lib/master.js:263:22)
[egg-scripts] at EventEmitter.emit (events.js:164:20)
[egg-scripts] at ChildProcess.worker.process.once (internal/cluster/master.js:185:13)
[egg-scripts] at Object.onceWrapper (events.js:254:19)
[egg-scripts] at ChildProcess.emit (events.js:159:13)
[egg-scripts] at Process.ChildProcess._handle.onexit (internal/child_process.js:209:12)
[egg-scripts] name: 'AppWorkerDiedError'
[egg-scripts] pid: 3925
[egg-scripts] hostname: iZ2ze76p2wnibfx0j7dslmZ
[egg-scripts]
[egg-scripts] 2018-01-23 16:12:03,870 ERROR 3925 [master] app_worker#1:3941 start fail, exiting with code:1
[egg-scripts] 2018-01-23 16:12:03,871 ERROR 3925 [master] exit with code:1
[egg-scripts] 2018-01-23 16:12:03,873 ERROR 3931 [agent_worker] receive disconnect event on child_process fork mode, exiting with code:110
[egg-scripts] 2018-01-23 16:12:03,873 ERROR 3931 [agent_worker] exit with code:110
[egg-scripts]
[egg-scripts] Start got error, see /root/logs/master-stderr.log
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] start: egg-scripts start --workers=1 --daemon
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR! /root/.npm/_logs/2018-01-23T08_12_05_860Z-debug.log
I just tested it.
Adding --workers=1 still does not work.
Any advice would be appreciated.
This is still a brand-new project with nothing added.

Delete your dependencies and reinstall them. They look quite old; don't lock the versions.

Can you reproduce it locally?

I just deleted node_modules and ran npm i again (the install reported no errors and no versions are locked), but the same error still appears.

Please provide a minimal repository that reproduces it locally. We can't guess the cause from a distance.

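A minimal reproducible repository? I installed and started the app directly on the Alibaba Cloud server by following the tutorial below, without touching the code, and I still get the same error. I can't find the cause. What does “ERROR 9248 nodejs.Error: [ClusterClient] leader does not be active in 30000ms on port:45188” mean, where does this port come from, and why is the port in the error different on every start?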

$ npm i egg-init -g
$ egg-init egg-example --type=simple
$ cd egg-example
$ npm install
$ npm start

Can you reproduce it locally?

Locally I am on Windows, where npm run dev starts normally. On the server, npm start fails.

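I haven't managed to set up a repository, but it is exactly as the poster above says: a new project created by following the tutorial, with nothing changed, produces this error.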

On my local Windows machine npm start works and the site is reachable. On Linux (Ubuntu 16) it fails.

I only initialized the project and did not add any code.

Could you provide an environment for me to look at?

The test server runs Ubuntu 16.04 64-bit.

@jy418 @zrxisme I'll look into this for you over the weekend. It ran fine for me on CentOS earlier; I'll check on Alibaba Cloud then.

I tried it as root on Alibaba Cloud Ubuntu and it works:

npm i egg-init -g
egg-init egg-example --type=simple
cd egg-example
npm install
npm start

@zrxisme Could you try running as root? I see your prompt is $. Please also post your Node version.

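My service listens on port 3000, so why does the error below mention port 45728, and why does that port keep changing? How can I make the deployed app open only the service port 3000, or keep port 45728 fixed? My iptables rules do not let port 45728 through, so startup fails. When I add that port to the iptables rules, the port in the error changes, so the app never starts. It only starts after I clear the iptables rules. Can anyone explain this? Any help is appreciated.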

[egg-scripts] 2018-01-23 16:12:03,863 ERROR 3941 [-/127.0.0.1/-/0ms GET /] nodejs.Error: [ClusterClient] leader does not be active in 30000ms on port:45728

@zrxisme I'll switch the system image on my Alibaba Cloud instance and take a look. Please also post your iptables rules.

My iptables rules only open port 3000, which eggjs listens on, and the server login port.

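I don't know why starting eggjs with npm start opens other ports besides the listening port 3000. Those ports are not allowed by the iptables rules, so startup fails. I don't know where they come from; I suspect a socket issue.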

@zrxisme Run lsof -i:<port> to see what is using the port.

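...The port is not even allowed by the iptables rules, so how could it be in use? Try it yourself: configure iptables to block every port except the listening one, then start egg and you will see the error.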

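I checked: the port in the error was not in use beforehand, so it should be the iptables configuration. Has anyone tried removing the iptables configuration to see whether startup succeeds?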

I tried it. After clearing the iptables rules, startup succeeds.

Hmm, that doesn't feel great. Running without iptables feels like being exposed.

Right, that doesn't fix the root cause. The extra-port problem is still unsolved.

By the way, has anyone tried starting with pm2 instead of egg-scripts?

I deployed with pm2 using an ecosystem.json, with the iptables rules still in place, and it would not start either.

I suspect the port problem comes from cluster-client, but I searched for a long time without finding a solution.

@JacksonTian: Could you provide an environment for me to look at?

https://github.com/eggjs/egg/issues/2004#issuecomment-359729397

@zrxisme

So far it looks like the iptables settings are what prevent it from running. My configuration is below; 8081 is the port egg listens on.
-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A OUTPUT -j ACCEPT
-A INPUT -p tcp --dport 443 -j ACCEPT
-A INPUT -p tcp --dport 80 -j ACCEPT
-A INPUT -p tcp --dport 8081 -j ACCEPT
-A INPUT -p tcp -m state --state NEW --dport 39999 -j ACCEPT
-A INPUT -p icmp -m icmp --icmp-type 8 -j ACCEPT
-A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables denied:" --log-level 7
-A INPUT -p tcp --dport 80 -i eth0 -m state --state NEW -m recent --set
-A INPUT -p tcp --dport 80 -i eth0 -m state --state NEW -m recent --update --seconds 60 --hitcount 150 -j DROP
-A INPUT -j REJECT
-A FORWARD -j REJECT
COMMIT

In theory, internal ports shouldn't be blocked, right? @JacksonTian

But the original poster confirmed that it starts once iptables is cleared...

@fenis Have you and the others tried it? Can you reproduce it?

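I haven't used iptables much, but I think the problem is that your policy does not allow loopback, and the INPUT part looks a bit redundant. Try the configuration below.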

-A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A OUTPUT -j ACCEPT
-A INPUT -i lo -j ACCEPT
-A INPUT -p tcp --dport 443 -j ACCEPT
-A INPUT -p tcp --dport 80 -j ACCEPT
-A INPUT -p tcp --dport 8081 -j ACCEPT
-A INPUT -p tcp -m state --state NEW --dport 39999 -j ACCEPT
-A INPUT -p icmp -m icmp --icmp-type 8 -j ACCEPT
-A INPUT -m limit --limit 5/min -j LOG --log-prefix "iptables denied:" --log-level 7
-A INPUT -p tcp --dport 80 -i eth0 -m state --state NEW -m recent --set
-A INPUT -p tcp --dport 80 -i eth0 -m state --state NEW -m recent --update --seconds 60 --hitcount 150 -j DROP
-A INPUT -j REJECT
-A FORWARD -j REJECT
COMMIT

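@zrxisme I can't reproduce this on Alibaba Cloud at the moment (ECS, Ubuntu 16, Node 8.9). As for the firewall, on Alibaba Cloud you can configure ports directly in the security group. I suggest opening a usable port range there.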

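So this looks like an iptables problem. You can investigate that side yourselves.

egg uses cluster-client for advanced inter-process communication, so it opens some ports, but those ports are only used for internal communication.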
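The behaviour described above can be checked without egg. The sketch below is not cluster-client's code. It assumes, based on the 127.0.0.1 in the error logs and the loopback fix suggested in this thread, that the internal connection is a plain TCP connection over the loopback interface. `checkLoopback` is a hypothetical helper that listens on a port chosen by the OS, which is why the number changes on every run, and then tries to connect to it from the same machine.

```javascript
const net = require('net');

// Resolves true if a TCP client can reach a server on 127.0.0.1 within
// timeoutMs, and false otherwise. Port 0 asks the OS for any free port.
function checkLoopback(timeoutMs = 3000) {
  return new Promise(resolve => {
    const server = net.createServer(socket => socket.end());
    let client;
    let settled = false;
    const finish = ok => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      if (client) client.destroy();
      server.close();
      resolve(ok);
    };
    const timer = setTimeout(() => finish(false), timeoutMs);
    server.on('error', () => finish(false));
    server.listen(0, '127.0.0.1', () => {
      const { port } = server.address();
      console.log(`listening on 127.0.0.1:${port}`);
      client = net.connect(port, '127.0.0.1');
      client.on('connect', () => finish(true));
      client.on('error', () => finish(false));
    });
  });
}

checkLoopback().then(ok => {
  console.log(ok ? 'loopback TCP works' : 'loopback TCP is blocked or timed out');
});
```

If this reports a failure while the firewall is active and success once it is disabled, the rules are most likely dropping loopback traffic. Opening individual ports cannot fix that, because a different port is picked each time.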

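How did the original poster solve this in the end? I've been learning egg recently and hit the same problem when deploying to a server. I've spent several days on it without figuring it out. @zrxisme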

@atian25 I've hit the same problem.

@akumaLin @imshenshen https://github.com/eggjs/egg/issues/2004#issuecomment-361878130 Have you tried the fix described there?

Try turning iptables off temporarily:

sudo service iptables stop

Add an iptables rule that allows the local loopback interface (that is, allows the machine to connect to itself):
-A INPUT -s 127.0.0.1 -d 127.0.0.1 -j ACCEPT
