Deconz-rest-plugin: Tradfri Remotes suddenly stop working

Created on 13 Oct 2020 · 34 Comments · Source: dresden-elektronik/deconz-rest-plugin

Describe the bug

Some Tradfri 5-button remotes suddenly stop working. I had several Tradfri remotes that simply had no function anymore. Changing the battery did not help. The remotes had been added to a light group and worked for several weeks or months before they simply stopped controlling the light group.

I tried to just reset the remotes by pressing the button 5 times, but that did not change anything.

I then did the same while Phoscon was searching for a new remote, and the remote was paired right away.

The old remote was still shown with black text (not grayed out) in the list of remotes, even after the battery had been out of the remote for several hours.

Interestingly, the old entry in the list of switches then disappeared and a new remote with the default name was shown. After changing the name back to the original name of the remote, it worked again and was already paired to the light groups with full functionality restored.

This has now happened about six times with four different remotes.

Steps to reproduce the behavior

Have about 20 Tradfri-Remotes and 80 Tradfri Lights and use the network for a few months. Some remotes might just stop working.

Expected behavior

The remotes should not lose their connection or their binding.

Screenshots

image

image

image

Environment

  • Host system: Raspberry Pi 4
  • Running method: Raspbian beta image, upgraded to the newest version
  • Firmware version: 26580700
  • deCONZ version: 2.05.83
  • Device: ConBee II
  • Do you use a USB extension cable: yes

User Question

All 34 comments

Hi,

One thing I can advise: VirtualBox is a big no-no. Users with stuff "disappearing" were often also VirtualBox users. No clue why.

Be advised: version 84 is out on Windows. That might fix it too ;)

Oh wait, the VM is for ioBroker. deCONZ and Phoscon actually run on a Raspberry Pi 4. I updated the post.

Be advised: version 84 is out on Windows. That might fix it too ;)

I updated so many times but nothing ever got fixed for me :( And nothing in the release notes here refers to the problems I experience: https://github.com/dresden-elektronik/deconz-rest-plugin/releases

Why is this now a user question and not a bug report? Remotes should not just stop working. That's not a question.

@ChrisPrefect I've experienced the same issue but with the bulbs as well...

They just become unavailable all of a sudden and it's driving me insane. I thought this was a faulty device so I returned mine but I've come down to two problems:

  1. Later deconz versions (from 2.05.80 and onwards) are buggy and should not have been released as stable in my view.
  2. The range limits of the deCONZ device / routing capabilities of my network seem off. I can send commands to and from my IKEA on/off remote, but my IKEA Tradfri GU10 will not receive commands from the mesh network, even though it is only 4 meters away from another bulb (an IKEA E27 in another room).

If these issues continue to the same extent, I will not continue with Zigbee and will move to Wi-Fi bulbs. I'm glad I can investigate during the daytime; otherwise the next 3 weekends would have been spent solely troubleshooting this...

Why is this now a user question and not a bug report? Remotes should not just stop working. That's not a question.

@ChrisPrefect Because if it were a bug, I would have a lot more users complaining about it. There is an issue on IKEA devices, https://github.com/dresden-elektronik/deconz-rest-plugin/issues/1261 , and that is in progress. So it might be that you are experiencing that.

@svippe : Can you open your own issue? Because there might be other stuff going on.

Wait, what?? Mimiix, you are saying that this is not a bug because I am the first to report it? SERIOUSLY??

This is not the first time you have done this kind of thing. You also try to fend off user problems in the Discord channel, instead of helping. Why are you so hostile? Why are you defending deCONZ instead of recognizing the many, many problems it has and trying to get them fixed by the devs as soon as possible?

Marking comprehensive bug reports as "user question" really takes the cake...

And your only excuse is that no one else has reported it so far? Wow...

How many users have 100+ Tradfri nodes in a big-area network with 5 Tradfri repeaters and dozens of remotes, and use this over a period of several months? I bet very few users. And how many will report it here if a handful of remotes suddenly fail and have to be reset and bound again? No one will ever report this, except me. You should be thankful that I took the time to investigate and write a complete bug report with screenshots.

1261 has nothing to do with this bug. And did you check the date of this bug? 14 Feb 2019 (!!!!!) This issue is soon TWO YEARS OLD! And it is still nowhere near fixed! Svippe confirmed this with his report.

This is really stressful. So many hours and days are wasted in troubleshooting Deconz/Conbee/Phoscon. This system just does not work properly in a big Tradfri-Network. And no one is doing anything to fix this. Except trying to remove bug reports and hide them as "user comment". This is not helpful.

I will call dresden elektronik for the fourth time tomorrow and discuss this directly with them.

@ChrisPrefect Hey, that is not my intention here! I really am trying to help users where I can. I am not hostile, I simply try to keep things clean. Please accept my apologies, as this is not what I intend at all.

As for the problems: I forward issues to @manup whenever I feel there is a need to. In my time here, I have gone through all the issues, and hardly any two are the same. I do agree there is a lot to be done, and that's why I initially picked up this position: I was frustrated. Just frustrated that deCONZ was a bit of a mess. I messaged Robban and asked if I could make the Discord more active. I reached out to Manup and, fast forward, here I am. I cleaned up the old issues and try to prioritize stuff that needs fixing. Ikea is one of the last. But hey, I can't change a mess of 2 years in just 3 months.

Frankly, I know a few users, in person, with 100+ nodes in their network. They have no issues at all. So it is not as generic as you might think. There are over 65k installations of deCONZ. DE has their own lab. So yes, I am thankful for your report, and you are trying everything you can. But there are so many factors. See the new bug on the Opple switches: I had a few reports on Discord and here. After that, I could see they were related and confirm the bug. It's just a label, don't worry.

If you really feel that nobody is doing anything to fix this: please see the discussion in #1261. There are a few people there, and I am shaking the tree whenever I can. I'm happy to invite you over to start moderating with me and help users.

Hello Mimiix

OK, thank you for your explanation.

It is really frustrating and hopeless as a user. For many, many months there have been massive problems and nothing has been fixed so far.

How can we go forward? I REALLY want to pay for fixing these issues. I would spend 1'000€ if my Zigbee network would run stable (reliably react to button presses and Alexa commands) and scenes would work properly. Can it be this hard?

You tell me that others run big Tradfri-Networks without issues. This gives me hope, but at the same time shows that there really needs to be some kind of paid support to find issues in my particular Zigbee network. If it works for others, it should be easily fixable without any code changes in deconz or phoscon, right?

Sometimes the lights work when I press the button on a Tradfri remote. Sometimes not. This should be debuggable somehow, right? But how? When I turn on a group of lights with Alexa, almost always only a few lights react. So turning on a group with Alexa or with a remote somehow works completely differently. Why? Who can support this? Is there a third-party electrician/IT company that can support such an installation? Why doesn't Dresden Elektronik offer paid support like most other open source companies?

What is the path for me to get usable lights in my home in the foreseeable future?

Thanks for your help!

Hi Chris,

Going forward, let's start with these steps. This will give me some details :)

Are you able to give me the following information?

  • Full system specs of the host machine it is running on:
    Host OS
    RPI: What else is on there?
    What SD card are you using?
    Running method (VM? Native? Docker? HA add-on? Hoops?)
    Any other Wi-Fi stuff going on around it?
    Any other USB devices on the RPI?
    What PSU are you using for the RPI?

In addition, can you add:

  • deCONZ logs (full) of 1 day of running, using the full debug flags. Let's see if I can find some error codes.

I also just noticed: you are on an older version of the ConBee firmware. Please use this guide: https://github.com/dresden-elektronik/deconz-rest-plugin/wiki/Update-deCONZ-manually and update to the latest versions. As far as I know, there have been fixes for IKEA devices.

Alexa seems to be buggy, but I am not sure about that. Alexa uses the REST API, and so does Phoscon. I've heard this from more users, but I have no clue how it works with deCONZ. I've asked Manup about this. I'll come back to you on that.

Looking forward to your reply!

Hosts: Currently a Raspberry Pi 4, previously a Raspberry Pi 3B, with the same problems

Host: Currently the downloaded beta deCONZ desktop image, burned with Etcher and updated manually. Before that the command-line version, with the same problems. I also tried an Intel i5 Windows machine, with the same problems.

Original Raspi 4 USB-C power supply. I even have a big fan on top of the Raspi 4 to cool it. It now only reaches 42°C

There is nothing else on the Raspberry. It's the original deconz image.

Tried different SDs: SanDisk, Kingston, currently OV 32GB

No other USB devices on the Raspi. Conbee II is currently plugged in with a 2 m extension on the lower USB2.0 Port.

Raspi is connected with Ethernet to a Switch 30 cm away and that switch to a 24 port Unifi PoE Switch, connected to a Dream Machine Pro. I tried different USB extensions and USB ports and also bought a second Conbee II, but with the same issues.

There are 6 UniFi access points around the house. I moved as many devices as possible to 5G8. I set the APs all to 2G4 Wi-Fi channel 1. Not ideal for the Wi-Fi, but farthest away from Zigbee channel 25. I also tried unplugging the AP that is in the same closet as the Conbee II, but that did not change anything. There are no other Wi-Fi networks from neighbors around.
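The channel choice above can be checked with a little arithmetic. This is a minimal sketch, assuming the usual 2.4 GHz channel plans (Zigbee channel k centered at 2405 + 5·(k − 11) MHz, Wi-Fi channel n at 2407 + 5·n MHz) and nominal widths of 2 MHz and 22 MHz. The function names are made up for illustration, and real interference also depends on signal strength and sidebands.

```python
# Assumed nominal channel widths (not taken from this thread).
WIFI_WIDTH_MHZ = 22.0
ZIGBEE_WIDTH_MHZ = 2.0


def zigbee_center_mhz(channel):
    """Center frequency of a 2.4 GHz IEEE 802.15.4 channel (11..26)."""
    if not 11 <= channel <= 26:
        raise ValueError("2.4 GHz Zigbee channels are 11..26")
    return 2405.0 + 5.0 * (channel - 11)


def wifi_center_mhz(channel):
    """Center frequency of a 2.4 GHz Wi-Fi channel (1..13)."""
    if not 1 <= channel <= 13:
        raise ValueError("this sketch covers Wi-Fi channels 1..13 only")
    return 2407.0 + 5.0 * channel


def overlaps(wifi_channel, zigbee_channel):
    """True if the nominal bands of the two channels intersect."""
    gap = abs(wifi_center_mhz(wifi_channel) - zigbee_center_mhz(zigbee_channel))
    return gap < (WIFI_WIDTH_MHZ + ZIGBEE_WIDTH_MHZ) / 2


def free_zigbee_channels(wifi_channel):
    """Zigbee channels whose nominal band does not touch the Wi-Fi channel."""
    return [ch for ch in range(11, 27) if not overlaps(wifi_channel, ch)]
```

Under these assumptions `overlaps(1, 25)` is false: Wi-Fi channel 1 sits at 2412 MHz and Zigbee channel 25 at 2475 MHz, so the two are about 63 MHz apart.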

There is however a separate Tradfri Zigbee network in the basement with its own Tradfri gateway. There are about 37 Tradfri lights (FLOALTS, GU10) and 8 remotes and 4 motion detectors on this network. I wanted to move all the devices from this network to the deconz-Network too, but I worry that this would make the deconz network even slower.

I updated deconz to 2.05.86 now. And Conbee II to 26660700. It's a problem that they are not numbered sequentially. I saw the "0700" at the end and thought that this was the same version I already have...

What command do I need to run to get the logs for a day?

I read in the other thread that deconz can't handle commands that are sent simultaneously and needs several seconds between commands. If I use Alexa to turn on a group with 10 GU10 spots, Alexa will send 10 commands simultaneously to switch on each light individually. This always fails and only 40-60% of the lights actually get turned on.

Here is a video of the Alexa problem (watch with sound): https://www.youtube.com/watch?v=GPDQK2BT19Y

This is a BIG bug in the deCONZ network. All commands need to be executed reliably, no matter how many commands are sent simultaneously.

Sometimes it takes literally 15-30 seconds for a simple button press on a Tradfri remote to turn on a group of two lights. What happens during this time? Is deconz sending millions of commands every second till the lights turn on? Or is the signal from the remote too weak (needs to travel several hops) to reach the coordinator, but the remote only retries every few seconds or so?

This is a Phoscon bug, I also added a video: #3265
This issue is also annoying and you forwarded it 6 weeks ago. Was there any reaction or progress? #3135

Thanks for your help!!

Some (older) screenshots:

image

image

image

image

image

Thanks for the detailed report! I'll start with the Alexa part before going into the rest of your message.

The Alexa part:
I asked manup about the Alexa integration. It is very basic, which explains what you are describing:

_The Alexa "support" simply uses the fact that the Echo's built-in integration for Philips Hue (but not the Hue Skill) uses the same REST API. Alexa sees deCONZ just as a Philips Hue gateway._

This is something I have heard from users more often. I doubt we can fix this, as Alexa is sending the commands and deCONZ is acting on them. I think we would need a specific Alexa integration to fix this. deCONZ can send a lot of commands at the same time: I used to have some groups in HA to control 5+ lights in deCONZ. That went well (maybe 1 second between the first and last light?). I think the bug is within Alexa, not deCONZ here. But that could be logged too! Todo Dennis: I'll take this up with Manuel. I can't give an ETA on this.

Hardware
The reasons I asked for detailed hardware are: 1. I wanted to see what is happening on your systems, and 2. the SD card. In the past, I had a bad experience with a slow SD card that caused slow light transitions. I had only 5 - 8 devices in deCONZ, but the SD card slowed things down badly, like you experienced. Could you get me a serial/product ID of your SD card? Just to make sure it is fast. Todo Chris: Get serial of SD/SD details.

The Wi-Fi is a bit busy, but I am mostly interested in Zigbee channel 25. I have asked the devs if that could be a factor. I do know for a fact that Konke devices only like channel 15. Maybe that could be the case with IKEA as well. Worth a try in any case! The network in the basement _shouldn't_ be an issue. But that would be visible in the logs too. Todo Dennis: Find out if ch25 is an issue.

As you are running native Linux, you can enable logs by adding the following flags to your start script:
--dbg-info=2 --dbg-aps=1 --dbg-error=2 > debug.txt

That would create a file called 'debug.txt' in the directory where the deCONZ script is started. For more info on this, please check this page. Todo Chris: Enable logs and provide logs.

On the 15 - 30 seconds: I have no clue at all. That should not happen. However, I think it could be related to any of the above, so let's wait for the logs and go from there. The logs should provide error codes which we can use to debug.

On this one: https://github.com/dresden-elektronik/deconz-rest-plugin/issues/3135 I dropped the ball. Sorry, I've been busy lately :( I'll check with manup for a reply from him. Todo Dennis: Get Manup to reply to 3135.

Let's see what happens from here :)

Zigbee channel 25 is one of the primary Light Link channels (11, 15, 20 and 25), so all LL-certified devices, such as older IKEA ones, should work on it. It is also a requirement for Zigbee 3 certification, which covers all new IKEA devices.
I have 3 Zigbee networks running on channels 15 (deCONZ), 20 (IKEA) and 25 (ZHA) without problems with IKEA devices (Wi-Fi blocks channel 11 for me), though they are not as extensive as Chris's.
Konke has only one Zigbee-certified device (a new gateway / hub), so the older devices are not real Zigbee devices.

Thanks for the input.

There is a difference if deCONZ sends a single command to a group or Alexa sends one command per light in a group. Alexa has no concept of Zigbee groups. It sees each light individually. You can group them together in the Alexa app, but that just means that Alexa will send the same command to all of these individual lights in this "virtual" group. But that fails with deCONZ because Alexa then sends 10-15 commands at the exact same time to the API and deCONZ can't handle this and the commands only reach a few of the lights and others are just lost.
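To make the difference concrete, here is a minimal sketch of the two request patterns against the REST API, assuming the Hue-style endpoints `/groups/<id>/action` and `/lights/<id>/state`. The gateway address, API key, and helper names are hypothetical placeholders, and the requests are only built here, not sent.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with your own gateway address and API key.
GATEWAY = "http://192.168.1.10:80"
API_KEY = "YOUR_API_KEY"


def _put(path, body):
    """Build (but do not send) a PUT request with a JSON body."""
    return urllib.request.Request(
        url=f"{GATEWAY}/api/{API_KEY}{path}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def group_request(group_id, on):
    """One REST request addressed to a whole group."""
    return _put(f"/groups/{group_id}/action", {"on": on})


def per_light_requests(light_ids, on):
    """One REST request per light, the pattern described for Alexa above."""
    return [_put(f"/lights/{light_id}/state", {"on": on}) for light_id in light_ids]


def send(request, timeout=5.0):
    """Send a prepared request and return the decoded JSON reply."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For a group of ten spots, `group_request("1", True)` is a single call, while `per_light_requests([...], True)` yields ten calls that arrive at the gateway at nearly the same time.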

I will clone the SD card to another card and see if that helps. But I already used several different cards with the RPI3 and RPI4 and when switching from deCONZ headless to the desktop/VNC image. How can I get the serial info from the card? I don't normally use Linux, sorry :-)
I found this in the device file /sys/block/mmcblk0/device/cid : 834e434e4361726402b9f19a8900f700
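The CID value can be split into its fields. This is a minimal sketch, assuming the standard SD card CID layout (manufacturer ID, OEM ID, product name, revision, serial number, manufacturing date). `decode_sd_cid` is a hypothetical helper name. As far as I know the CID identifies the card but does not state its speed class, so it only partly answers the question about how fast the card is.

```python
def decode_sd_cid(cid_hex):
    """Split a 32-character hex CID string (as read from sysfs) into fields."""
    raw = bytes.fromhex(cid_hex.strip())
    if len(raw) != 16:
        raise ValueError("an SD CID is 16 bytes (32 hex characters)")
    # Manufacturing date: 12 bits, year offset from 2000 (8 bits) then month (4 bits).
    mdt = ((raw[13] & 0x0F) << 8) | raw[14]
    return {
        "manufacturer_id": raw[0],
        "oem_id": raw[1:3].decode("ascii", errors="replace"),
        "product_name": raw[3:8].decode("ascii", errors="replace"),
        "revision": f"{raw[8] >> 4}.{raw[8] & 0x0F}",
        "serial": raw[9:13].hex(),
        "year": 2000 + (mdt >> 4),
        "month": mdt & 0x0F,
    }


# The value posted above.
info = decode_sd_cid("834e434e4361726402b9f19a8900f700")
for key, value in info.items():
    print(f"{key}: {value}")
```

For the value above this gives OEM ID "NC", product name "NCard", and a manufacturing date of 2015-07.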

I also did a scan of the Wi-Fi / Zigbee band with an RF Explorer and I could not see any persistent signals. The noise floor was at the same level as before and after the Wi-Fi band. When I pressed the buttons of a remote, I could see spikes around Zigbee channel 25. So the signal-to-noise ratio is at least not zero and there are no other signals on this frequency.

I will try to enable logging, but I had no luck so far. Where exactly do I need to set the logging options?

image

Here is some small bug I just experienced when trying to pair a new Aqara temperature sensor:

Phoscon App - Google Chrome 2020-10-22 00-26-22.zip

I will clone the SD card to another card and see if that helps. But I already used several different cards with the RPI3 and RPI4 and when switching from deCONZ headless to the desktop/VNC image. How can I get the serial info from the card? I don't normally use Linux, sorry :-)

It's on the SD itself. I think on the back. But it should also say something about the class.

I will try to enable logging, but I had no luck so far. Where exactly do I need to set the logging options?

You need to edit the service file, afaik: the ExecStart line. I am not too familiar with that myself; refer to https://www.linode.com/docs/guides/introduction-to-systemctl/ for that. _I'm learning things here too_ 😄
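A note on the ExecStart route: systemd does not pass ExecStart through a shell, so a trailing `> debug.txt` there is not treated as a redirect, which may be why no log file shows up. Here is a minimal sketch of one alternative, assuming the binary lives at /usr/bin/deCONZ and accepts the debug flags quoted earlier in this thread: a small Python wrapper that starts deCONZ and writes its output to a file. `build_command` and `run_with_log` are hypothetical helper names, and capturing both stdout and stderr is a precaution, since which stream deCONZ logs to is an assumption here.

```python
import subprocess
from pathlib import Path

DECONZ_BIN = "/usr/bin/deCONZ"  # path as quoted in this thread


def build_command(headless=False):
    """Return the deCONZ command line with the debug flags from this thread."""
    cmd = [DECONZ_BIN]
    if headless:
        # Option quoted in this thread for running without the GUI.
        cmd += ["-platform", "minimal"]
    cmd += ["--dbg-info=2", "--dbg-aps=1", "--dbg-error=2"]
    return cmd


def run_with_log(cmd, log_path):
    """Run cmd until it exits, writing stdout and stderr to log_path."""
    with Path(log_path).open("w") as log:
        result = subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode
```

With the deconz and deconz-gui services stopped, something like `run_with_log(build_command(), "/home/pi/debug.txt")` (path chosen only for illustration) would block until deCONZ exits and leave the log at that path.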

I found out how to edit the service script and add the logging. But I don't see any log files being written. Is this normal?

Shouldn't there be a simple and clear how-to article available from Dresden Elektronik on how to provide these logs? It seems quite essential to debug bugs that customers report.

image

@ChrisPrefect I think manup's latest finding could be one of the reasons for your problems https://github.com/dresden-elektronik/deconz-rest-plugin/issues/1261#issuecomment-716002473 ;-(((

Interesting bug, and I think I also ran into this one maybe 6 times during the last year. But power cycling the light does fix it. Yes, it is very annoying if this happens, but at least there is a clear fix for it and it happens rarely.

But not being able to switch a group of lights reliably with Alexa every time is more annoying. Or the broken scenes and color temperature control with Tradfri FLOALTs. There are so many other bugs.

Also, Phoscon seems not to be actively developed anymore? Where are the on/off buttons for groups and individual lights? Moving the slider to turn them on surely is just a first draft, right?

I wonder whether the new firmware with source routing works well with the IKEA bulbs, or whether the coordinator is losing the routers / end devices, and how it handles routing when an end device changes its parent.
ZHA has done great work on routing and on tracking where end devices are coming from, so it does not need to broadcast to find them if they do not answer while sleeping (with an EZSP NCP).

I find the way Alexa switches "groups of lights" very strange. Not using Zigbee light groups (one broadcast per command) and instead sending individual commands to each device (one command × number of devices) is not how you work with lights in the Zigbee world, and it abuses system resources.

When working with lights I use the "old web app", since it only uses real light groups and does real Zigbee binding, which keeps working when deCONZ is offline (updating, or after killing the SD card).
Phoscon is a big mess for implementing HA sensors and switches, which do not work with bound lights (offline = not working), so I have moved all my sensors to ZHA with an EZSP coordinator to eliminate the "Xiaomi problems", and I only use "nice" routers to avoid problems.

One question: when a remote is offline in deCONZ, is it still sending group commands to the bulbs, and do the bulbs react to the command?
That would be interesting, because group commands are broadcast and are not routed through the coordinator. Unicast (direct commands to one device) is routed through the mesh, and if the mesh has problems it fails when it cannot find the right route to the device.

Alexa can't use Zigbee groups because you can have one light in many different groups.

And it should really not be a problem to send two commands in total to a group with two lights. But even this fails regularly with Alexa. deconz somehow can't handle sending out commands in parallel.

Then Alexa is a bit of Stone Age equipment in the rocket age.
And Philips Hue steers the lights according to what is displayed on the TV (using a non-standard Zigbee method to get the speed), and the mesh does not collapse or block, while still having pretty good compatibility with other certified devices.

My experience is that much of the problem is blocking / unsynchronized communication, which makes pairing and other things fail more often than not. (I don't believe the mesh is what blocks reading all readable attributes in the basic cluster of a bulb that is 2 meters away; normally the SW version is not read, and 20 seconds later I can read the missing attribute manually and get the response. I have tested it several times and it is always the same.)

Zigbee has one very bad limitation: the underlying network layer (IEEE 802.15.4) limits broadcasts to around 1.2 per second. Unicast is not limited (only by the "wire speed") and should not be a problem (OTA file transfer, for example). Looking with Wireshark, IKEA's gateway mostly uses broadcast between all devices in the mesh (Lighting standard), and that works even while doing OTA updates of devices.

Powering 2 lights on or off is zero load for the mesh.
One zigpy dev has spammed the mesh by toggling 30 lights per second without killing the network (with a new TI coordinator).
In ZHA I am using an IKEA ICC-1-A module (used in nearly all IKEA devices) as coordinator. It has the same or better HW and RF performance than the second-generation RaspBee/ConBee and runs Silabs' standard EZSP firmware (the coordinator firmware for the SoC), and a 7€ IKEA bulb has the same performance in the mesh, as long as the application does not crash in the firmware.

I found out how to edit the service script and add the logging. But I don't see any log files being written. Is this normal?

Shouldn't there be a simple and clear how-to article available from Dresden Elektronik on how to provide these logs? It seems quite essential to debug bugs that customers report.

image

I agree on that simple how-to article. I think you need to add the file to the string. Here is a community-created explanation: link. Getting logs will become easier: there is going to be an event window implemented in the next beta.

It does not say how and where to add these parameters :-( How do I set up logging? Thanks!

@SwoopX Are you able to explain a bit here?

@Mimiix Erm, explain what? Copy&Paste?

See here: https://github.com/dresden-elektronik/deconz-rest-plugin/issues/3406#issuecomment-713916340 The issue in the screenshot was not using the full path to the command + using the wrong command name. /usr/bin/deCONZ

Uhm, what? I don’t understand.

Are you referring to https://github.com/dresden-elektronik/deconz-rest-plugin/issues/3406#issuecomment-716073261 ?

The command and path were not changed. This is the original file as found in the deCONZ GUI image.

Only the logging parameters were added.

But I don’t get any logfiles.

How do we enable logging, so that we can help you guys debugging?

sudo systemctl stop deconz
sudo systemctl stop deconz-gui
/usr/bin/deCONZ --dbg-info=2 --dbg-error=2 > debug.txt

Doesn't work :-(
image

So, no gui...
/usr/bin/deCONZ -platform minimal --dbg-info=2 --dbg-aps=1 --dbg-error=2 > debug.txt

No, I use the GUI image.

Only "sudo systemctl stop deconz-gui" really stops the service.

The new command doesn't work either:
image

Why do you think that? It's doing alright as far as I can tell...

Is there any update on this?

I just lost four Xiaomi Aqara sensors again. They were paired, but now one is not responding, two have lost their names and defaulted back to just "Temperatursensor" or "Vibrationssensor", and one is completely missing from the sensors list.

Can it be that these sensors don't update the path in the mesh? When I pair them near the Conbee2 stick they work, but when I bring them to their dedicated room, which has a Tradfri repeater nearby, they are no longer connected.

I found another bug in the Phoscon interface. It won't let me set the sensor to the name it once had, even though it now has its default name again. I can't see why the old name is still blocked. No device shows the old name.

Video: Phoscon App - Google Chrome 2020-11-09 22-47-04.zip

@ChrisPrefect Manup has made some commits that will hopefully land in the next beta. He is trying to bring down the reporting rate for groups and other unnecessary status that is broadcast to groups and may cause broadcast storms and block the mesh. https://github.com/dresden-elektronik/deconz-rest-plugin/pull/3614#issue-517376635 https://github.com/dresden-elektronik/deconz-rest-plugin/pull/3615#issue-517377972
I hope he manages to get it right, because during broadcast storms the underlying network goes into a blocking state and becomes unresponsive.

@ChrisPrefect I am waiting on your logs 😅

But as we are close: On the next Beta there should be a dedicated Log window within deconz. Easier to get logs by then.

Facing the same issue. My remotes used to show up like the "Basement Floor Lamp" in the screenshot, so I could bind the remote to lights directly. Now they don't show up anymore. Using deCONZ as an add-on in Home Assistant.
image

@andordavoti You don't have the same issue, I'm sure 😅. To fix yours, use the old WebUI. On Discord it is listed in the #faq.
