Zigbee2mqtt: After "zigbee2mqtt version 1.10.0" stopped due to "adapter disconnecting", comes up with "empty" database.db

Created on 29 Mar 2020  路  9Comments  路  Source: Koenkk/zigbee2mqtt

What happened

"zigbee2mqtt" version 1.10.0 running on bare metal Linux. Communicating with an ESP12F running ESPEasy "network to serial device" to a physically connected "CC2530 + CC2591" running "CC2530_CC2591_DEFAULT_20190608".

Service had already 2 x Xiaomi Mijia Temp/Humidity sensors,one Mijia button, one Mijia door sensor, and one PHILIPS Hue Bulb paired and working for weeks.

Service seemed to lose connection with the adapter, hence stopped, and tried to start again:

error 2020-03-29 16:06:24: Adapter disconnected, stopping
info  2020-03-29 16:06:24: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
debug 2020-03-29 16:06:24: Loaded state from file /opt/zigbee2mqtt/data/state.json
info  2020-03-29 16:06:24: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
info  2020-03-29 16:06:24: Starting zigbee-herdsman...

After several (automatic) attempts at starting up, zigbee2mqtt came up again, but with no devices paired:

error 2020-03-29 16:09:44: Error while starting zigbee-herdsman
info  2020-03-29 16:09:45: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
debug 2020-03-29 16:09:45: Loaded state from file /opt/zigbee2mqtt/data/state.json
info  2020-03-29 16:09:45: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
info  2020-03-29 16:09:45: Starting zigbee-herdsman...
debug 2020-03-29 16:09:45: Using zigbee-herdsman with settings: '{"network":{"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11],"networkKey":"HIDDEN"},"databasePath":"/opt/zigbee2mqtt/data/database.db","data
info  2020-03-29 16:10:22: zigbee-herdsman started
info  2020-03-29 16:10:22: Coordinator firmware version: '{"type":"zStack12","meta":{"transportrev":2,"product":0,"majorrel":2,"minorrel":6,"maintrel":3,"revision":20190608}}'
debug 2020-03-29 16:10:22: Zigbee network parameters: {"panID":6754,"extendedPanID":"0xdddddddddddddddd","channel":11}
info  2020-03-29 16:10:22: Currently 0 devices are joined:
info  2020-03-29 16:10:22: Zigbee: disabling joining new devices.
info  2020-03-29 16:10:23: Connecting to MQTT server at mqtt://192.168.1.10
info  2020-03-29 16:10:24: Connected to MQTT server

Upon inspection of "stat.json", it contains the correct set of devices and last stat for each sensor and device. File is being recreated every 5 minutes, last time as I was writing this bug report:

2020-03-29 21:09:46.071087810

However, "database.db" is empty, in the sense that it only contains one line, for the coordinator, and even this one doesn't match the same line in the "database.db.backup" copy, which still has all devices. The differences in the "coordinator" line in the database file are restricted to "epList":

AFTER RESTART  : "epList":[12,110,11,8,6,5,4,3,2,1]"
BEFORE RESTART : "epList":[1,2,3,4,5,6,8,11,12,110]"

Also, both the database.db and the backup copy have the same last modification time (output from "stat"):

  File: database.db
  Size: 1224            Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 1048846     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-29 20:30:14.834817885 +0200
Modify: 2020-03-29 16:10:22.879043419 +0200
Change: 2020-03-29 16:10:22.879043419 +0200
 Birth: -

  File: database.db.backup
  Size: 5550            Blocks: 16         IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 1048582     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-29 20:30:14.834817885 +0200
Modify: 2020-03-29 16:10:13.050761854 +0200
Change: 2020-03-29 16:10:13.050761854 +0200

As a result, even if telemetry data from sensors continues to be accepted, can't action on the PHILIPS Hue, getting errors like the ones below:

debug 2020-03-29 20:26:50: Received MQTT message on 'zigbee2mqtt/0x001788010600ae79/set' with data '{"state": "ON"}'
info  2020-03-29 20:26:50: MQTT publish: topic 'zigbee2mqtt/bridge/log', payload '{"type":"entity_not_found","message":{"friendly_name":"0x001788010600ae79"}}'
error 2020-03-29 20:26:50: Entity '0x001788010600ae79' is unknown

Finally, configuration.yaml was automatically updated over time as devices were joined, and still has the list of three devices there. File has not changed since zigbee2mqtt went down as in the logs presented aboce:

devices:
  '0x00158d0003747078':
    friendly_name: '0x00158d0003747078'
  '0x00158d0003cd76a7':
    friendly_name: '0x00158d0003cd76a7'
  '0x00158d00042522a8':
    friendly_name: '0x00158d00042522a8'
  '0x00158d000445180a':
    friendly_name: '0x00158d000445180a'
  '0x001788010600ae79':
    friendly_name: '0x001788010600ae79'

# stat configuration.yaml
  File: configuration.yaml
  Size: 964         Blocks: 8          IO Block: 4096   regular file
Device: fd00h/64768d    Inode: 1049225     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-29 16:06:24.764221818 +0200
Modify: 2020-03-23 13:07:07.191662939 +0100
Change: 2020-03-23 13:07:07.195663056 +0100
 Birth: -

What did you expect to happen

zigbee2mqtt should have restarted with the persistent set of joined Zigbeee devices, instead of overwriting the database file, and failing to have any of them "loaded" on startup.

Gracefully stopping the service, replacing the "empty" database.db with the backed up one, and starting the service again, makes for everything to work normally again.

How to reproduce it (minimal and precise)

Note due to a number of reasons I get quite frequent "Adapter disconnected, stopping", and this is the first time joined devices were lost due to database.db being overwritten:

# grep "Adapter disconnected, stopping" /var/log/zigbee2mqtt.log | grep "^error 2020-03" | wc -l
8

So the best possible way to try and reproduce the problem described would be as per the description above.

Debug Info

zigbee2mqtt version : zigbee2mqtt version 1.10.0 (commit #8df6be8)
CC253X firmware version : CC2530_CC2591_DEFAULT_20190608

Most helpful comment

Found the issue and fixed it. Available in latest dev branch (this was quite a tricky bug, thanks for finding!).

To you for the very prompt response and the fix, and in general , for this very useful piece of software / firmware.

All 9 comments

Correction to original post.

Despite I said "_even if telemetry data from sensors continues to be accepted_", this is incorrect, as I now realized, and during the time zigbee2mqtt was running with an "empty" database.db, no sensor data was accepted by the software and relayed over to MQTT for Home Assistant to "see" and later store in the backend (InfluDB if someone is curious).

Sorry for the confusion.

Please provide the herdsman debug logging when Zigbee2mqtt starts and the database is emptied. This will allow us to determine the cause of the database reset.

To enable herdsman debug logging, see https://www.zigbee2mqtt.io/information/debug.html#zigbee-herdsman-debug-logging

And so it happened again. There was this in the general logs which explains why zigbee2mqtt restarted:

info  2020-04-09 15:28:51: MQTT publish: topic 'zigbee2mqtt/0x001788010600ae79', payload '{"state":"OFF","linkquality":124,"brightness":254}'
debug 2020-04-09 15:28:54: Received Zigbee message from '0x001788010600ae79', type 'attributeReport', cluster 'genLevelCtrl', data '{"currentLevel":254}' from endpoint 11 with groupID 0
info  2020-04-09 15:28:54: MQTT publish: topic 'zigbee2mqtt/0x001788010600ae79', payload '{"state":"OFF","linkquality":124,"brightness":254}'
debug 2020-04-09 15:33:10: Saving state to file /opt/zigbee2mqtt/data/state.json
debug 2020-04-09 15:38:10: Saving state to file /opt/zigbee2mqtt/data/state.json
debug 2020-04-09 15:43:10: Saving state to file /opt/zigbee2mqtt/data/state.json
debug 2020-04-09 15:48:10: Saving state to file /opt/zigbee2mqtt/data/state.json
error 2020-04-09 15:49:51: Adapter disconnected, stopping
info  2020-04-09 15:49:51: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
debug 2020-04-09 15:49:51: Loaded state from file /opt/zigbee2mqtt/data/state.json
info  2020-04-09 15:49:51: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
info  2020-04-09 15:49:51: Starting zigbee-herdsman...
debug 2020-04-09 15:49:51: Using zigbee-herdsman with settings: '{"network":{"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11],"networkKey":"HIDDEN"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json","serialPort":{"baudRate":115200,"rtscts":false,"path":"tcp://192.168.1.119:1775"}}'

Took quite a few minutes for the service to get started, but it did in the end, with no devices whatsover:

info  2020-04-09 15:57:03: zigbee-herdsman started
info  2020-04-09 15:57:03: Coordinator firmware version: '{"type":"zStack12","meta":{"transportrev":2,"product":0,"majorrel":2,"minorrel":6,"maintrel":3,"revision":20190608}}'
debug 2020-04-09 15:57:04: Zigbee network parameters: {"panID":6754,"extendedPanID":"0xdddddddddddddddd","channel":11}
info  2020-04-09 15:57:04: Currently 0 devices are joined:
info  2020-04-09 15:57:04: Zigbee: disabling joining new devices.
info  2020-04-09 15:57:04: Connecting to MQTT server at mqtt://192.168.1.10
info  2020-04-09 15:57:04: Connected to MQTT server
info  2020-04-09 15:57:04: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload 'online'
info  2020-04-09 15:57:04: MQTT publish: topic 'zigbee2mqtt/bridge/config', payload '{"version":"1.10.0","commit":"8df6be8","coordinator":{"type":"zStack12","meta":{"transportrev":2,"product":0,"majorrel":2,"minorrel":6,"maintrel":3,"revision":20190608}},"log_level":"debug","permit_join":false}'

Again, the "database.db" had gone "empty" as when reporting this issue:

# pwd
/opt/zigbee2mqtt/data
# ls -l database*
-rw-r--r-- 1 root root 1224 Apr  9 15:57 database.db
-rw-r--r-- 1 root root 5550 Apr  9 15:56 database.db.backup

And this is what I saw in the logs for the extended herdsman debug logging:

Apr  9 15:49:51 automation-srv systemd[1]: zigbee2mqtt.service: Main process exited, code=exited, status=1/FAILURE
Apr  9 15:49:51 automation-srv systemd[1]: zigbee2mqtt.service: Failed with result 'exit-code'.
Apr  9 15:49:51 automation-srv systemd[1]: zigbee2mqtt.service: Service RestartSec=100ms expired, scheduling restart.
Apr  9 15:49:51 automation-srv systemd[1]: zigbee2mqtt.service: Scheduled restart job, restart counter is at 158.
Apr  9 15:49:51 automation-srv systemd[1]: Stopped zigbee2mqtt.
Apr  9 15:49:51 automation-srv systemd[1]: Started zigbee2mqtt.
Apr  9 15:49:51 automation-srv npm[15395]: > [email protected] start /opt/zigbee2mqtt
Apr  9 15:49:51 automation-srv npm[15395]: > node index.js
Apr  9 15:49:51 automation-srv npm[15395]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:49:51: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
Apr  9 15:49:51 automation-srv npm[15395]: #033[34mzigbee2mqtt:debug#033[39m 2020-04-09 15:49:51: Loaded state from file /opt/zigbee2mqtt/data/state.json
Apr  9 15:49:51 automation-srv npm[15395]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:49:51: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
Apr  9 15:49:51 automation-srv npm[15395]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:49:51: Starting zigbee-herdsman...
Apr  9 15:49:51 automation-srv npm[15395]: #033[34mzigbee2mqtt:debug#033[39m 2020-04-09 15:49:51: Using zigbee-herdsman with settings: '{"network":{"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11],"networkKey":"HIDDEN"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json","serialPort":{"baudRate":115200,"rtscts":false,"path":"tcp://192.168.1.119:1775"}}'
Apr  9 15:49:52 automation-srv npm[15395]: 2020-04-09T13:49:52.004Z zigbee-herdsman:controller:log Starting with options '{"network":{"networkKeyDistribute":false,"networkKey":[14,7,113,175,150,135,250,142,253,164,2,8,31,244,112,78],"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11]},"serialPort":{"baudRate":115200,"rtscts":false,"path":"tcp://192.168.1.119:1775"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json"}'
Apr  9 15:49:52 automation-srv npm[15395]: 2020-04-09T13:49:52.005Z zigbee-herdsman:adapter:zStack:znp:log Opening TCP socket with 192.168.1.119:1775
Apr  9 15:49:59 automation-srv npm[15395]: 2020-04-09T13:49:59.392Z zigbee-herdsman:adapter:zStack:znp:log Socket connected
Apr  9 15:49:59 automation-srv npm[15395]: 2020-04-09T13:49:59.392Z zigbee-herdsman:adapter:zStack:znp:log Socket ready

But it doesn't seem to be until the process times out and tries to start up (and later succeeds) when the database is "cleared", as per the logs:

Apr  9 15:56:07 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:56:07: Logging to console and directory: '/var/log/' filename: zigbee2mqtt.log
Apr  9 15:56:07 automation-srv npm[31536]: #033[34mzigbee2mqtt:debug#033[39m 2020-04-09 15:56:07: Loaded state from file /opt/zigbee2mqtt/data/state.json
Apr  9 15:56:07 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:56:07: Starting zigbee2mqtt version 1.10.0 (commit #8df6be8)
Apr  9 15:56:07 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:56:07: Starting zigbee-herdsman...
Apr  9 15:56:07 automation-srv npm[31536]: #033[34mzigbee2mqtt:debug#033[39m 2020-04-09 15:56:07: Using zigbee-herdsman with settings: '{"network":{"panID":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11],"network
Key":"HIDDEN"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json","serialPort":{"baudRate":115200,"rtscts":false,
"path":"tcp://192.168.1.119:1775"}}'
Apr  9 15:56:07 automation-srv npm[31536]: 2020-04-09T13:56:07.706Z zigbee-herdsman:controller:log Starting with options '{"network":{"networkKeyDistribute":false,"networkKey":[14,7,113,175,150,135,250,142,253,164,2,8,31,244,112,78],"panI
D":6754,"extendedPanID":[221,221,221,221,221,221,221,221],"channelList":[11]},"serialPort":{"baudRate":115200,"rtscts":false,"path":"tcp://192.168.1.119:1775"},"databasePath":"/opt/zigbee2mqtt/data/database.db","databaseBackupPath":"/opt/
zigbee2mqtt/data/database.db.backup","backupPath":"/opt/zigbee2mqtt/data/coordinator_backup.json"}'
Apr  9 15:56:07 automation-srv npm[31536]: 2020-04-09T13:56:07.707Z zigbee-herdsman:adapter:zStack:znp:log Opening TCP socket with 192.168.1.119:1775
Apr  9 15:56:08 automation-srv npm[31536]: 2020-04-09T13:56:08.189Z zigbee-herdsman:adapter:zStack:znp:log Socket connected
...
Apr  9 15:56:54 automation-srv npm[31536]: 2020-04-09T13:56:54.403Z zigbee-herdsman:controller:log Started with result 'reset'
Apr  9 15:56:54 automation-srv npm[31536]: 2020-04-09T13:56:54.403Z zigbee-herdsman:controller:log Injected database: true, adapter: true
Apr  9 15:56:54 automation-srv npm[31536]: 2020-04-09T13:56:54.406Z zigbee-herdsman:controller:log Clearing database...
...
Apr  9 15:57:03 automation-srv npm[31536]: 2020-04-09T13:57:03.331Z zigbee-herdsman:adapter:zStack:unpi:parser --- parseNext []
Apr  9 15:57:03 automation-srv npm[31536]: 2020-04-09T13:57:03.332Z zigbee-herdsman:controller:log No coordinator in database, querying...
Apr  9 15:57:03 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:03: zigbee-herdsman started
Apr  9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: Currently 0 devices are joined:
Apr  9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: Zigbee: disabling joining new devices.
Apr  9 15:57:04 automation-srv npm[31536]: 2020-04-09T13:57:04.304Z zigbee-herdsman:controller:log Disable joining
Apr  9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: Connecting to MQTT server at mqtt://192.168.1.10
Apr  9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: Connected to MQTT server
Apr  9 15:57:04 automation-srv npm[31536]: #033[32mzigbee2mqtt:info #033[39m 2020-04-09 15:57:04: MQTT publish: topic 'zigbee2mqtt/bridge/state', payload 'online'

Of course, I have summarized the very verbose logs but the "zigbee-herdsman:adapter:zStack" logs are there, in case they are necessary, as it is as if during initalization there is some result after zigbee2mqtt chatting with devices which results in zigbee2mqtt voluntarily (and intentionally) resetting / clearing up the device database.

As there is one device as the following in the network :

info  2020-04-04 11:46:41: 0x001788010600ae79 (0x001788010600ae79): 8718699673147 - Philips Hue white A60 bulb E27 bluetooth (Router)

Which has the following definition within the database.db.backup :

{"id":6,"type":"Router","ieeeAddr":"0x001788010600ae79","nwkAddr":57427,"manufId":4107,"manufName":"Philips","powerSource":"Mains (single phase)","modelId":"LWA001","epList":[11,242],"endpoints":{"11":{"profId":260,"epId":11,"devId":257,"inClusterList":[0,3,4,5,6,8,4096,64514],"outClusterList":[25],"clusters":{"genBasic":{"attributes":{"modelId":"LWA001","manufacturerName":"Philips","powerSource":1,"zclVersion":2,"appVersion":2,"stackVersion":1,"hwVersion":1,"dateCode":"20190122","swBuildId":"1.53.3_r27175"}},"genOnOff":{"attributes":{"onOff":0}},"genLevelCtrl":{"attributes":{"currentLevel":254}}},"binds":[{"cluster":6,"type":"endpoint","deviceIeeeAddress":"0x00124b001f3205e7","endpointID":1},{"cluster":8,"type":"endpoint","deviceIeeeAddress":"0x00124b001f3205e7","endpointID":1}]},"242":{"profId":41440,"epId":242,"devId":97,"inClusterList":[],"outClusterList":[33],"clusters":{},"binds":[]}},"appVersion":2,"stackVersion":1,"hwVersion":1,"dateCode":"20190122","swBuildId":"1.53.3_r27175","zclVersion":2,"interviewCompleted":true,"meta":{"reporting":1},"lastSeen":1586438934663}

Could this be a result of the Zigbee network going through some election process and the bulb "winning"?

Finally, note the "lastSeen" above, is as below in the server's timezone:

# date -d @1586438934.663
Thu Apr  9 15:28:54 CEST 2020

If you need the full logs, please ask and indicate the most appropriate way to share them with you.

Thank you.

At the moment I'm mostly interested by the log having the tag zigbee-herdsman:adapter:zStack:startZnp could you post these?

Here you have them, since the moment "zigbee2mqtt" goes down due to losing connection with the remote hardware, to after it has successfully started up (with the cleared database) later on.

See attachment.
zigbee2mqtt_debug_logs.log

Found the issue and fixed it. Available in latest dev branch (this was quite a tricky bug, thanks for finding!).

Found the issue and fixed it. Available in latest dev branch (this was quite a tricky bug, thanks for finding!).

To you for the very prompt response and the fix, and in general , for this very useful piece of software / firmware.

I confirm the problem exists - database cleared after zigbee2mqtt loses connection to coordinator (simple router reboot isn't the cause, only several minutes of disconnection produce the problem and not every time). After restoring database.db.backup everything is fine.
Coordinator - D1 mini + ESPEasy "Serial Server" + E18-MS1PA1-IPX CC2530 CC2592 + SOURCE_ROUTING_CC2530_CC2592_20190619.
zigbee2mqtt 1.12.2
Home Assistant 0.108.9

Without this problem everything works stable with 15 devices + routers. Thank you for great software.

Should be fixed in zigbee2mqtt 1.13.0, if not feel free to reopen.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

ophilips picture ophilips  路  4Comments

andreasbrett picture andreasbrett  路  3Comments

RefineryX picture RefineryX  路  4Comments

Koenkk picture Koenkk  路  3Comments

CodeFinder2 picture CodeFinder2  路  4Comments