"zigbee2mqtt" version 1.10.0 running on bare metal Linux. Communicating with an ESP12F running ESPEasy "network to serial device" to a physically connected "CC2530 + CC2591" running "CC2530_CC2591_DEFAULT_20190608".
Service had already 2 x Xiaomi Mijia Temp/Humidity sensors,one Mijia button, one Mijia door sensor, and one PHILIPS Hue Bulb paired and working for weeks.
Service seemed to lose connection with the adapter, hence stopped, and tried to start again:
error 2020-03-29 16:06:24: Adapter disconnected, stopping
info 2020-03-29 16:06:24: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
debug 2020-03-29 16:06:24: Loaded state from file /opt/zigbee2mqtt/data/state.json
info 2020-03-29 16:06:24: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
info 2020-03-29 16:06:24: Starting zigbee-herdsman...
After several (automatic) attempts at starting up, zigbee2mqtt came up again, but with no devices paired:
error 2020-03-29 16:09:44: Error while starting zigbee-herdsman
info 2020-03-29 16:09:45: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
debug 2020-03-29 16:09:45: Loaded state from file /opt/zigbee2mqtt/data/state.json
info 2020-03-29 16:09:45: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
info 2020-03-29 16:09:45: Starting zigbee-herdsman...
debug 2020-03-29 16:09:45: Using zigbee-herdsman with settings: '{"network":{"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11],"networkKey":"HIDDEN"},"databasePath":"/opt/zigbee2mqtt/data/database.db","data
info 2020-03-29 16:10:22: zigbee-herdsman started
info 2020-03-29 16:10:22: Coordinator firmware version: '{"type":"zStack12","meta":{"transportrev":2,"product":0,"majorrel":2,"minorrel":6,"maintrel":3,"revision":20190608}}'
debug 2020-03-29 16:10:22: Zigbee network parameters: {"panID":6754,"extendedPanID":"0xdddddddddddddddd","channel":11}
info 2020-03-29 16:10:22: Currently 0 devices are joined:
info 2020-03-29 16:10:22: Zigbee: disabling joining new devices.
info 2020-03-29 16:10:23: Connecting to MQTT server at mqtt://192.168.1.10
info 2020-03-29 16:10:24: Connected to MQTT server
Upon inspection of "stat.json", it contains the correct set of devices and last stat for each sensor and device. File is being recreated every 5 minutes, last time as I was writing this bug report:
2020-03-29 21:09:46.071087810
However, "database.db" is empty, in the sense that it only contains one line, for the coordinator, and even this one doesn't match the same line in the "database.db.backup" copy, which still has all devices. The differences in the "coordinator" line in the database file are restricted to "epList":
AFTER RESTART : "epList":[12,110,11,8,6,5,4,3,2,1]"
BEFORE RESTART : "epList":[1,2,3,4,5,6,8,11,12,110]"
Also, both the database.db and the backup copy have the same last modification time (output from "stat"):
File: database.db
Size: 1224 Blocks: 8 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 1048846 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-03-29 20:30:14.834817885 +0200
Modify: 2020-03-29 16:10:22.879043419 +0200
Change: 2020-03-29 16:10:22.879043419 +0200
Birth: -
File: database.db.backup
Size: 5550 Blocks: 16 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 1048582 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-03-29 20:30:14.834817885 +0200
Modify: 2020-03-29 16:10:13.050761854 +0200
Change: 2020-03-29 16:10:13.050761854 +0200
As a result, even if telemetry data from sensors continues to be accepted, can't action on the PHILIPS Hue, getting errors like the ones below:
debug 2020-03-29 20:26:50: Received MQTT message on 'zigbee2mqtt/0x001788010600ae79/set' with data '{"state": "ON"}'
info 2020-03-29 20:26:50: MQTT publish: topic 'zigbee2mqtt/bridge/log', payload '{"type":"entity_not_found","message":{"friendly_name":"0x001788010600ae79"}}'
error 2020-03-29 20:26:50: Entity '0x001788010600ae79' is unknown
Finally, configuration.yaml was automatically updated over time as devices were joined, and still has the list of three devices there. File has not changed since zigbee2mqtt went down as in the logs presented aboce:
devices:
'0x00158d0003747078':
friendly_name: '0x00158d0003747078'
'0x00158d0003cd76a7':
friendly_name: '0x00158d0003cd76a7'
'0x00158d00042522a8':
friendly_name: '0x00158d00042522a8'
'0x00158d000445180a':
friendly_name: '0x00158d000445180a'
'0x001788010600ae79':
friendly_name: '0x001788010600ae79'
# stat configuration.yaml
File: configuration.yaml
Size: 964 Blocks: 8 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 1049225 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-03-29 16:06:24.764221818 +0200
Modify: 2020-03-23 13:07:07.191662939 +0100
Change: 2020-03-23 13:07:07.195663056 +0100
Birth: -
zigbee2mqtt should have restarted with the persistent set of joined Zigbeee devices, instead of overwriting the database file, and failing to have any of them "loaded" on startup.
Gracefully stopping the service, replacing the "empty" database.db with the backed up one, and starting the service again, makes for everything to work normally again.
Note due to a number of reasons I get quite frequent "Adapter disconnected, stopping", and this is the first time joined devices were lost due to database.db being overwritten:
# grep "Adapter disconnected, stopping" /var/log/zigbee2mqtt.log | grep "^error 2020-03" | wc -l
8
So the best possible way to try and reproduce the problem described would be as per the description above.
zigbee2mqtt version : zigbee2mqtt version 1.10.0 (commit #8df6be8)
CC253X firmware version : CC2530_CC2591_DEFAULT_20190608
Correction to original post.
Despite I said "_even if telemetry data from sensors continues to be accepted_", this is incorrect, as I now realized, and during the time zigbee2mqtt was running with an "empty" database.db, no sensor data was accepted by the software and relayed over to MQTT for Home Assistant to "see" and later store in the backend (InfluDB if someone is curious).
Sorry for the confusion.
Please provide the herdsman debug logging when Zigbee2mqtt starts and the database is emptied. This will allow us to determine the cause of the database reset.
To enable herdsman debug logging, see https://www.zigbee2mqtt.io/information/debug.html#zigbee-herdsman-debug-logging
And so it happened again. There was this in the general logs which explains why zigbee2mqtt restarted:
info 2020-04-09 15:28:51: MQTT publish: topic 'zigbee2mqtt/0x001788010600ae79', payload '{"state":"OFF","linkquality":124,"brightness":254}'
debug 2020-04-09 15:28:54: Received Zigbee message from '0x001788010600ae79', type 'attributeReport', cluster 'genLevelCtrl', data '{"currentLevel":254}' from endpoint 11 with groupID 0
info 2020-04-09 15:28:54: MQTT publish: topic 'zigbee2mqtt/0x001788010600ae79', payload '{"state":"OFF","linkquality":124,"brightness":254}'
debug 2020-04-09 15:33:10: Saving state to file /opt/zigbee2mqtt/data/state.json
debug 2020-04-09 15:38:10: Saving state to file /opt/zigbee2mqtt/data/state.json
debug 2020-04-09 15:43:10: Saving state to file /opt/zigbee2mqtt/data/state.json
debug 2020-04-09 15:48:10: Saving state to file /opt/zigbee2mqtt/data/state.json
error 2020-04-09 15:49:51: Adapter disconnected, stopping
info 2020-04-09 15:49:51: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
debug 2020-04-09 15:49:51: Loaded state from file /opt/zigbee2mqtt/data/state.json
info 2020-04-09 15:49:51: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
info 2020-04-09 15:49:51: Starting zigbee-herdsman...
debug 2020-04-09 15:49:51: Using zigbee-herdsman with settings: '{"network":{"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11],"networkKey":"HIDDEN"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json","serialPort":{"baudRate":115200,"rtscts":false,"path":"tcp://192.168.1.119:1775"}}'
Took quite a few minutes for the service to get started, but it did in the end, with no devices whatsover:
info 2020-04-09 15:57:03: zigbee-herdsman started
info 2020-04-09 15:57:03: Coordinator firmware version: '{"type":"zStack12","meta":{"transportrev":2,"product":0,"majorrel":2,"minorrel":6,"maintrel":3,"revision":20190608}}'
debug 2020-04-09 15:57:04: Zigbee network parameters: {"panID":6754,"extendedPanID":"0xdddddddddddddddd","channel":11}
info 2020-04-09 15:57:04: Currently 0 devices are joined:
info 2020-04-09 15:57:04: Zigbee: disabling joining new devices.
info 2020-04-09 15:57:04: Connecting to MQTT server at mqtt://192.168.1.10
info 2020-04-09 15:57:04: Connected to MQTT server
info 2020-04-09 15:57:04: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload 'online'
info 2020-04-09 15:57:04: MQTT publish: topic 'zigbee2mqtt/bridge/config', payload '{"version":"1.10.0","commit":"8df6be8","coordinator":{"type":"zStack12","meta":{"transportrev":2,"product":0,"majorrel":2,"minorrel":6,"maintrel":3,"revision":20190608}},"log_level":"debug","permit_join":false}'
Again, the "database.db" had gone "empty" as when reporting this issue:
# pwd
/opt/zigbee2mqtt/data
# ls -l database*
-rw-r--r-- 1 root root 1224 Apr 9 15:57 database.db
-rw-r--r-- 1 root root 5550 Apr 9 15:56 database.db.backup
And this is what I saw in the logs for the extended herdsman debug logging:
Apr 9 15:49:51 automation-srv systemd[1]: zigbee2mqtt.service: Main process exited, code=exited, status=1/FAILURE
Apr 9 15:49:51 automation-srv systemd[1]: zigbee2mqtt.service: Failed with result 'exit-code'.
Apr 9 15:49:51 automation-srv systemd[1]: zigbee2mqtt.service: Service RestartSec=100ms expired, scheduling restart.
Apr 9 15:49:51 automation-srv systemd[1]: zigbee2mqtt.service: Scheduled restart job, restart counter is at 158.
Apr 9 15:49:51 automation-srv systemd[1]: Stopped zigbee2mqtt.
Apr 9 15:49:51 automation-srv systemd[1]: Started zigbee2mqtt.
Apr 9 15:49:51 automation-srv npm[15395]: > [email protected] start /opt/zigbee2mqtt
Apr 9 15:49:51 automation-srv npm[15395]: > node index.js
Apr 9 15:49:51 automation-srv npm[15395]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:49:51: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
Apr 9 15:49:51 automation-srv npm[15395]: #033[34mzigbee2mqtt:debug#033[39m 2020-04-09 15:49:51: Loaded state from file /opt/zigbee2mqtt/data/state.json
Apr 9 15:49:51 automation-srv npm[15395]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:49:51: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
Apr 9 15:49:51 automation-srv npm[15395]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:49:51: Starting zigbee-herdsman...
Apr 9 15:49:51 automation-srv npm[15395]: #033[34mzigbee2mqtt:debug#033[39m 2020-04-09 15:49:51: Using zigbee-herdsman with settings: '{"network":{"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11],"networkKey":"HIDDEN"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json","serialPort":{"baudRate":115200,"rtscts":false,"path":"tcp://192.168.1.119:1775"}}'
Apr 9 15:49:52 automation-srv npm[15395]: 2020-04-09T13:49:52.004Z zigbee-herdsman:controller:log Starting with options '{"network":{"networkKeyDistribute":false,"networkKey":[14,7,113,175,150,135,250,142,253,164,2,8,31,244,112,78],"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11]},"serialPort":{"baudRate":115200,"rtscts":false,"path":"tcp://192.168.1.119:1775"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json"}'
Apr 9 15:49:52 automation-srv npm[15395]: 2020-04-09T13:49:52.005Z zigbee-herdsman:adapter:zStack:znp:log Opening TCP socket with 192.168.1.119:1775
Apr 9 15:49:59 automation-srv npm[15395]: 2020-04-09T13:49:59.392Z zigbee-herdsman:adapter:zStack:znp:log Socket connected
Apr 9 15:49:59 automation-srv npm[15395]: 2020-04-09T13:49:59.392Z zigbee-herdsman:adapter:zStack:znp:log Socket ready
But it doesn't seem to be until the process times out and tries to start up (and later succeeds) when the database is "cleared", as per the logs:
Apr 9 15:56:07 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:56:07: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
Apr 9 15:56:07 automation-srv npm[31536]: #033[34mzigbee2mqtt:debug#033[39m 2020-04-09 15:56:07: Loaded state from file /opt/zigbee2mqtt/data/state.json
Apr 9 15:56:07 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:56:07: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
Apr 9 15:56:07 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:56:07: Starting zigbee-herdsman...
Apr 9 15:56:07 automation-srv npm[31536]: #033[34mzigbee2mqtt:debug#033[39m 2020-04-09 15:56:07: Using zigbee-herdsman with settings: '{"network":{"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11],"network
Key":"HIDDEN"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json","serialPort":{"baudRate":115200,"rtscts":false,
"path":"tcp://192.168.1.119:1775"}}'
Apr 9 15:56:07 automation-srv npm[31536]: 2020-04-09T13:56:07.706Z zigbee-herdsman:controller:log Starting with options '{"network":{"networkKeyDistribute":false,"networkKey":[14,7,113,175,150,135,250,142,253,164,2,8,31,244,112,78],"panI
D":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11]},"serialPort":{"baudRate":115200,"rtscts":false,"path":"tcp://192.168.1.119:1775"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/
zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json"}'
Apr 9 15:56:07 automation-srv npm[31536]: 2020-04-09T13:56:07.707Z zigbee-herdsman:adapter:zStack:znp:log Opening TCP socket with 192.168.1.119:1775
Apr 9 15:56:08 automation-srv npm[31536]: 2020-04-09T13:56:08.189Z zigbee-herdsman:adapter:zStack:znp:log Socket connected
...
Apr 9 15:56:54 automation-srv npm[31536]: 2020-04-09T13:56:54.403Z zigbee-herdsman:controller:log Started with result 'reset'
Apr 9 15:56:54 automation-srv npm[31536]: 2020-04-09T13:56:54.403Z zigbee-herdsman:controller:log Injected database: true, adapter: true
Apr 9 15:56:54 automation-srv npm[31536]: 2020-04-09T13:56:54.406Z zigbee-herdsman:controller:log Clearing database...
...
Apr 9 15:57:03 automation-srv npm[31536]: 2020-04-09T13:57:03.331Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
Apr 9 15:57:03 automation-srv npm[31536]: 2020-04-09T13:57:03.332Z zigbee-herdsman:controller:log No coordinator in database, querying...
Apr 9 15:57:03 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:03: zigbee-herdsman started
Apr 9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: Currently 0 devices are joined:
Apr 9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: Zigbee: disabling joining new devices.
Apr 9 15:57:04 automation-srv npm[31536]: 2020-04-09T13:57:04.304Z zigbee-herdsman:controller:log Disable joining
Apr 9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: Connecting to MQTT server at mqtt://192.168.1.10
Apr 9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: Connected to MQTT server
Apr 9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload 'online'
Of course, I have summarized the very verbose logs but the "zigbee-herdsman:adapter:zStack" logs are there, in case they are necessary, as it is as if during initalization there is some result after zigbee2mqtt chatting with devices which results in zigbee2mqtt voluntarily (and intentionally) resetting / clearing up the device database.
As there is one device as the following in the network :
info 2020-04-04 11:46:41: 0x001788010600ae79 (0x001788010600ae79): 8718699673147 - Philips Hue white A60 bulb E27 bluetooth (Router)
Which has the following definition within the database.db.backup :
{"id":6,"type":"Router","ieeeAddr":"0x001788010600ae79","nwkAddr":57427,"manufId":4107,"manufName":"Philips","powerSource":"Mains (single phase)","modelId":"LWA001","epList":[11,242],"endpoints":{"11":{"profId":260,"epId":11,"devId":257,"inClusterList":[0,3,4,5,6,8,4096,64514],"outClusterList":[25],"clusters":{"genBasic":{"attributes":{"modelId":"LWA001","manufacturerName":"Philips","powerSource":1,"zclVersion":2,"appVersion":2,"stackVersion":1,"hwVersion":1,"dateCode":"20190122","swBuildId":"1.53.3_r27175"}},"genOnOff":{"attributes":{"onOff":0}},"genLevelCtrl":{"attributes":{"currentLevel":254}}},"binds":[{"cluster":6,"type":"endpoint","deviceIeeeAddress":"0x00124b001f3205e7","endpointID":1},{"cluster":8,"type":"endpoint","deviceIeeeAddress":"0x00124b001f3205e7","endpointID":1}]},"242":{"profId":41440,"epId":242,"devId":97,"inClusterList":[],"outClusterList":[33],"clusters":{},"binds":[]}},"appVersion":2,"stackVersion":1,"hwVersion":1,"dateCode":"20190122","swBuildId":"1.53.3_r27175","zclVersion":2,"interviewCompleted":true,"meta":{"reporting":1},"lastSeen":1586438934663}
Could this be a result of the Zigbee network going through some election process and the bulb "winning"?
Finally, note the "lastSeen" above, is as below in the server's timezone:
# date -d @1586438934.663
Thu Apr 9 15:28:54 CEST 2020
If you need the full logs, please ask and indicate the most appropriate way to share them with you.
Thank you.
At the moment I'm mostly interested by the log having the tag zigbee-herdsman:adapter:zStack:startZnp could you post these?
Here you have them, since the moment "zigbee2mqtt" goes down due to losing connection with the remote hardware, to after it has successfully started up (with the cleared database) later on.
See attachment.
zigbee2mqtt_debug_logs.log
Found the issue and fixed it. Available in latest dev branch (this was quite a tricky bug, thanks for finding!).
Found the issue and fixed it. Available in latest dev branch (this was quite a tricky bug, thanks for finding!).
To you for the very prompt response and the fix, and in general , for this very useful piece of software / firmware.
I confirm the problem exists - database cleared after zigbee2mqtt loses connection to coordinator (simple router reboot isn't the cause, only several minutes of disconnection produce the problem and not every time). After restoring database.db.backup everything is fine.
Coordinator - D1 mini + ESPEasy "Serial Server" + E18-MS1PA1-IPX CC2530 CC2592 + SOURCE_ROUTING_CC2530_CC2592_20190619.
zigbee2mqtt 1.12.2
Home Assistant 0.108.9
Without this problem everything works stable with 15 devices + routers. Thank you for great software.
Should be fixed in zigbee2mqtt 1.13.0, if not feel free to reopen.
Most helpful comment
To you for the very prompt response and the fix, and in general , for this very useful piece of software / firmware.