Zwavejs2mqtt: [bug] Unable to mark dead node as failed node to remove it

Created on 22 Jan 2021  路  25Comments  路  Source: zwave-js/zwavejs2mqtt

Before submitting a bug please read: https://zwave-js.github.io/zwavejs2mqtt/#/troubleshooting/bug_report

Version

Build/Run method

  • [X] Docker
  • [ ] PKG
  • [ ] Manually built (git clone - npm install - npm run build )

docker image: zwavejs/zwavejs2mqtt:dev (0c5cfa16d4a2)
zwavejs2mqtt version: 1.0.0-rc.1
zwavejs version: 6.0.0

Describe the bug

When trying to remove a dead not, marking it as failed and then removing the failed node, I get the following error:

Node 45 is not in the list of failed nodes removeFailedNode

To Reproduce

Steps to reproduce the behavior:

  1. Click on the dead node
  2. Select "Is Failed Node" from the drop-down menu and click Send
  3. Observe successful API call with info notice
  4. Click on the node again and select "Remove failed node" and click Send
  5. Observe failed API call with error notice

Expected behavior

The failed node should be removed.

Additional context

2021-01-22 10:44:42.737 INFO APP: Zwave api call: isFailedNode [ 45, [length]: 1 ]
2021-01-22 10:44:42.748 INFO ZWAVE: Success zwave api call isFailedNode
2021-01-22 10:44:54.021 INFO APP: Zwave api call: removeFailedNode [ 45, [length]: 1 ]
2021-01-22 10:44:55.243 INFO ZWAVE: The node removal process could not be started due to the following reasons:
路 Node 45 is not in the list of failed nodes removeFailedNode
bug

All 25 comments

Screenshot 2021-01-22 at 12 28 52

@larstobi I think that if you try to use zwavejs 6.0.1 it should work. It has been released yesterday and I forgot to bump it! It has added a ping to automatically detect a node as failed when calling replace/remove failed. Could you try?

Yes, trying now, will report back in an hour or two.

@robertsLando I have now tried again with 6.0.1, but it still doesn't get removed, although it behaves differently.

In the log:

2021-01-22 15:26:00.268 INFO APP: Zwave api call: isFailedNode [ 45, [length]: 1 ]
2021-01-22 15:26:16.154 INFO APP: Zwave api call: removeFailedNode [ 45, [length]: 1 ]

And I get no notice in the WebUI if the command failed or succeeded.
The node is still there, marked as Dead instead of Removed.

@larstobi I suggest to also attach the zwavejs log so we can see what's going on.

cc @AlCalzone

As always, if something that communicates with the device or controller is not working I need zwave-js logs.

@AlCalzone @robertsLando Is this sufficient?

Can check when I'm back on the computer

@larstobi not enough, sorry :(
Two problems:

  1. The loglevel isn't high enough (should be silly)
  2. Looks like your stick does something weird:
12:21:24.560 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:22:50.688 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:23:22.439 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:25:48.691 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:25:51.107 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:26:51.346 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:27:04.498 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:27:08.963 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:27:13.533 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:27:24.633 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:27:40.820 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:28:20.977 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:29:51.256 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:31:02.196 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.
12:32:30.988 CNTRLR   Failed to execute controller command after 1/3 attempts. Scheduling next try i
                      n 100 ms.

I have since enabled debug logs. Will upload those soon, and then enable silly.

Debug _might_ be enough I think.

@AlCalzone Here you go! Let me know if debug wasn't enough. :-)

zwavejs2mqtt.log.gz
zwavejs_1.log.gz

I tried to mark failed and remove failed nodes 45 and 92, just prior to the timestamps "20:11:40".

I'm not seeing that tbh.
Remove failed first pings the node (I'm not seeing that message go out), and if that comes back negative (because otherwise the node cannot be removed), sends a RemoveFailedNodeRequest to the controller. The only instance of RemoveFailed is in the controller's list of supported commands.

Hmm, that is weird. So, basically the node ping doesn't go out.

Would a new silly log help?

Wait.... Now, when I refreshed the page of zwavejs2mqtt both Node 45 and Node 92 are in status Removed.

@robertsLando Can you make any sense of this?
zwave-js does not have a status "removed"

Removed is the state I use for nodes that doesn't exists, them are hidden by default, you need to enable the flag "show hidden nodes" in order to see them

I didn't do anything further. So, something must have worked.

This is the last log line from Node 045 before it was never heard from again:

20:12:41.120 CNTRLR   [Node 045] [Configuration] 8: metadata updated                    [Endpoint 0]

In the zj2m log it says:

grep "Node 45" zwavejs2mqtt.log
2021-01-22 18:06:54.472 DEBUG ZWAVE: Node 45 has been added to nodes array
2021-01-22 18:09:21.781 INFO ZWAVE: Node 45 is dead
2021-01-22 20:12:41.049 INFO ZWAVE: Node 45: value removed: 49-0-Air temperature
2021-01-22 20:12:41.051 INFO ZWAVE: Node 45: value removed: 113-0-Power Management-Battery maintenance status
[...]
2021-01-22 20:12:41.080 INFO ZWAVE: Node 45: value removed: 113-0-System-Hardware status
2021-01-22 20:12:41.082 INFO ZWAVE: Node 45: value removed: 112-0-1
[...]
2021-01-22 20:12:41.092 INFO ZWAVE: Node 45: value removed: 112-0-8
2021-01-22 20:12:41.094 INFO ZWAVE: Node 45: metadata updated: 49-0-Air temperature
[...]
2021-01-22 20:12:41.104 INFO ZWAVE: Node 45: metadata updated: 113-0-Power Management-Battery maintenance status
2021-01-22 20:12:41.106 INFO ZWAVE: Node 45: metadata updated: 113-0-System-Hardware status
2021-01-22 20:12:41.108 INFO ZWAVE: Node 45: metadata updated: 112-0-1
[...]
2021-01-22 20:12:41.121 INFO ZWAVE: Node 45: metadata updated: 112-0-8
2021-01-22 20:12:41.241 INFO ZWAVE: Node 45: removed

So, maybe it wasn't so dead afterall, but still managed to get removed.

Node 92 is 100% dead, I have physically verified it. It got removed a good while after I tried to remove it, like 49 minutes after:

2021-01-22 18:06:54.559 DEBUG ZWAVE: Node 92 has been added to nodes array
2021-01-22 18:19:42.248 INFO ZWAVE: Node 92 is asleep
2021-01-22 21:04:46.705 INFO ZWAVE: Node 92: removed

Hm there's definitely something funky going on. The whole removal process should be a matter of seconds.

@AlCalzone suggestions about how to debug this or about the root cause? Is it something on my side?

No idea. Doesn't seem like the zwave-js api gets called

I'm afraid I have no further way to debug this since both nodes have been removed in the end. At least not until next time I get a dead node or two.

I'd say we close this for now. Feel free to reopen when you have the problem again.

Was this page helpful?
0 / 5 - 0 ratings