Home Assistant release (hass --version):
0.55.0
Python release (python3 --version):
Python 3.5.3
Component/platform:
pytradfri
Description of problem:
Upgrading to 0.55 with new async code results in ValueError on tradfri light init. Further, after HASS has been running for a while status is not shown in HASS (all marked as off).
Expected:
No ValueError and possibility to see Trådfri lights status without restarting.
Problem-relevant configuration.yaml entries and steps to reproduce:
tradfri:
  host: 10.0.2.252
  api_key: !secret tradfri_api_key
  allow_tradfri_groups: false
Traceback (if applicable):
2017-10-11 17:41:50 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
result = coro.send(None)
File "/home/homeassistant/hass/lib/python3.5/site-packages/homeassistant/helpers/entity_component.py", line 388, in async_add_entities
yield from asyncio.wait(tasks, loop=self.component.hass.loop)
File "/usr/lib/python3.5/asyncio/tasks.py", line 346, in wait
raise ValueError('Set of coroutines/Futures is empty.')
ValueError: Set of coroutines/Futures is empty.
After HASS has been running I get the following error message:
2017-10-11 20:40:26 ERROR (MainThread) [homeassistant.core] Error doing job: Fatal read error on socket transport
Traceback (most recent call last):
File "/usr/lib/python3.5/asyncio/selector_events.py", line 723, in _read_ready
data = self._sock.recv(self.max_size)
OSError: [Errno 113] No route to host
In those cases it is not possible to see status of Trådfri lights (all are marked as off).
Additional info:
I have tried to find any clues in the debug output (logger: default: debug) but with no success. However, I have a limited understanding of Python and HASS. I might therefore have missed something.
Since 0.55 I'm also experiencing that after a few hours of running Home Assistant the lights status is not reported correctly. I can still turn them on and set their brightness, but their state will remain off. As a side effect the Flux Switch (and various automations) don't work.
I think the init problem can be solved by replacing line 388 in homeassistant/helpers/entity_component.py (at least I get no error messages and it passes testing):
yield from asyncio.wait(tasks, loop=self.component.hass.loop)
with
if tasks:
    yield from asyncio.wait(tasks, loop=self.component.hass.loop)
However, I am still trying to find the calling function for the second error.
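The guard proposed above can be tried in isolation. The following is only a minimal sketch: it uses async/await instead of the `yield from` and `loop=` style shown in the traceback so that it runs on current Python versions, and `wait_if_any` and `demo` are hypothetical names for illustration, not Home Assistant functions.

```python
import asyncio


async def wait_if_any(tasks):
    """Wait for all tasks, skipping asyncio.wait when there is nothing to wait for."""
    if not tasks:
        # asyncio.wait raises ValueError on an empty collection, so return early.
        return set()
    done, _pending = await asyncio.wait(tasks)
    return done


async def demo():
    # The unguarded call reproduces the ValueError from the traceback.
    try:
        await asyncio.wait([])
        raised = False
    except ValueError:
        raised = True

    # The guarded call returns an empty set instead of raising.
    empty_done = await wait_if_any([])

    # With a real task the guard is transparent.
    task = asyncio.ensure_future(asyncio.sleep(0, result="ok"))
    done = await wait_if_any([task])
    return raised, empty_done, [t.result() for t in done]
```

Running `asyncio.run(demo())` exercises both the failing and the guarded path.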
I should mention that I do not get those errors in my log, but I am seeing the same behaviour where the light state does not update. Maybe this is a separate issue. In my case, I can also get lights with their state stuck to 'on'.
The 'stuck state' issue seems to persist with the pytradfri update on the dev branch https://github.com/home-assistant/home-assistant/commit/d16c5f904668a035995d985a3ec372320d008459
Same problem here, after a few hours I'm unable to control my tradfri with homeassistant, but it works with tradfri apps or the tradfri remote. To solve the problem I need to restart homeassistant.
Persists in HASS 0.57.0 with pytradfri 4.0.1.
The problem is still here with hass 0.57.1
So, I get a slightly different error message after HASS 0.57 and pytradfri 4.0.x. I get the initial error message as stated before but not the "Error doing job: Fatal read error on socket transport" that I got after a few hours, instead I get the following directly after the first initial error message:
ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
File "/usr/lib/python3.5/asyncio/tasks.py", line 239, in _step
result = coro.send(None)
File "/srv/homeassistant/lib/python3.5/site-packages/pytradfri/api/aiocoap_api.py", line 149, in request
result = yield from self._execute(api_commands)
File "/srv/homeassistant/lib/python3.5/site-packages/pytradfri/api/aiocoap_api.py", line 107, in _execute
yield from self._observe(api_command)
File "/srv/homeassistant/lib/python3.5/site-packages/pytradfri/api/aiocoap_api.py", line 169, in _observe
api_command.result = _process_output(r)
File "/srv/homeassistant/lib/python3.5/site-packages/pytradfri/command.py", line 71, in result
self._result = self._process_result(value)
File "/srv/homeassistant/lib/python3.5/site-packages/pytradfri/resource.py", line 46, in observe_callback
callback(self)
File "/srv/homeassistant/lib/python3.5/site-packages/homeassistant/components/light/tradfri.py", line 318, in _observe_update
self._light_data.hex_color_inferred
File "/srv/homeassistant/lib/python3.5/site-packages/pytradfri/device.py", line 281, in hex_color_inferred
*xy_brightness_to_rgb(scale(x), scale(y), self.dimmer)
File "/srv/homeassistant/lib/python3.5/site-packages/pytradfri/color.py", line 142, in xy_brightness_to_rgb
brightness = ibrightness / 255.
TypeError: unsupported operand type(s) for /: 'NoneType' and 'float'
If it matters, I have three TRÅDFRI lamps and one Hue. Overall, HASS and tradfri work even more poorly now. When I (re)start HASS the status is wrong (bulbs that are on are shown as off and vice versa) and sometimes I cannot change state.
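The TypeError in the traceback above is raised because the brightness value is None when it is divided by 255. A minimal sketch of a defensive guard follows; `brightness_fraction` is a hypothetical helper for illustration, not part of pytradfri, and returning None for a missing value is an assumption about the desired behaviour.

```python
def brightness_fraction(ibrightness):
    """Scale a 0-255 brightness to 0.0-1.0, or return None if the value is missing."""
    if ibrightness is None:
        # No dimmer value was reported; signal "unknown" instead of raising TypeError.
        return None
    return ibrightness / 255.
```

A caller would then have to skip the colour conversion whenever None comes back.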
I tried some debug (but I'm not an expert). When tradfri is working I have something like these in the logs:
2017-11-07 12:35:02 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Status: 2.04 Changed, Received:
2017-11-07 12:35:02 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Status: 2.04 Changed, Received:
2017-11-07 12:35:15 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Executing 192.168.20.15 put ['15001', 65540]: {'3311': [{'5850': 0}]}
2017-11-07 12:35:15 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Status: 2.04 Changed, Received:
2017-11-07 12:35:15 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Status: 2.05 Content, Received: {"3":{"0":"IKEA of Sweden","1":"TRADFRI bulb E27 WS opal 980lm","2":"","3":"1.2.217","6":1},"9001":"sala","9002":1493665166,"9020":1509983471,"9003":65540,"3311":[{"5850":0,"5851":254,"5711":370,"5709":30138,"5710":26909,"5706":"f1e0b5","9003":0}],"9054":0,"5750":2,"9019":1}
2017-11-07 12:35:16 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Executing 192.168.20.15 put ['15001', 65540]: {'3311': [{'5850': 0}]}
2017-11-07 12:35:16 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Status: 2.04 Changed, Received:
2017-11-07 12:35:16 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Executing 192.168.20.15 put ['15001', 65540]: {'3311': [{'5850': 0}]}
when tradfri is NOT working I don't have the "Content, Received" line any more:
2017-11-07 12:25:39 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Executing 192.168.20.15 put ['15001', 65540]: {'3311': [{'5850': 1}]}
2017-11-07 12:25:39 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Status: 2.04 Changed, Received:
2017-11-07 12:25:39 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Executing 192.168.20.15 put ['15001', 65540]: {'3311': [{'5850': 1}]}
2017-11-07 12:25:39 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Status: 2.04 Changed, Received:
2017-11-07 12:25:39 DEBUG (MainThread) [pytradfri.api.aiocoap_api] Executing 192.168.20.15 put ['15001', 65540]: {'3311': [{'5850': 1}]}
I hope this is helpful to resolve the problem
If I restart Home Assistant, the problem goes away and everything works.
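To compare the state the gateway reports with the state that was requested, the "2.05 Content" payload from the working log can be decoded. This is a rough sketch only: the key meanings (3311 as the light control list, 5850 as on/off, 5851 as dimmer) are assumptions based on pytradfri's constants, and `light_state` is a hypothetical helper.

```python
import json

# Assumed key meanings, taken from pytradfri's constants.
ATTR_LIGHT_CONTROL = "3311"
ATTR_LIGHT_STATE = "5850"
ATTR_LIGHT_DIMMER = "5851"


def light_state(payload):
    """Extract the on/off state and dimmer level from a gateway JSON payload."""
    data = json.loads(payload)
    light = data[ATTR_LIGHT_CONTROL][0]
    return {"on": bool(light[ATTR_LIGHT_STATE]), "dimmer": light[ATTR_LIGHT_DIMMER]}


# Abbreviated from the "2.05 Content" log line above.
sample = '{"9001":"sala","9003":65540,"3311":[{"5850":0,"5851":254,"5706":"f1e0b5"}]}'
print(light_state(sample))
```

When these "Content" lines stop appearing, there is nothing to decode, which matches the state never updating.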
I get this as well, although I am not using Trådfri. I am using rflink for lights, but I get the same error in the logs and am not able to control lamps after the error appears.
I solved the problem with an upgrade of python from 3.5 to 3.6 (I needed to install with pip cython and DTLSSocket)
EDIT: the problem is still here; it just takes much longer to recur (more than 24 hours instead of 2 or 3)
@mvivaldi, I also upgraded as I did not see the edit, the issue persists. I get the problem after four hours now.
I get the following error on init:
ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 180, in _step
result = coro.send(None)
File "/srv/homeassistant/lib/python3.6/site-packages/homeassistant/helpers/entity_component.py", line 405, in async_add_entities
yield from asyncio.wait(tasks, loop=self.component.hass.loop)
File "/usr/local/lib/python3.6/asyncio/tasks.py", line 304, in wait
raise ValueError('Set of coroutines/Futures is empty.')
ValueError: Set of coroutines/Futures is empty.
and then the error message I had in the beginning is back (after 4 hours or so)
ERROR (MainThread) [homeassistant.core] Error doing job: Fatal read error on socket transport
Traceback (most recent call last):
File "/usr/local/lib/python3.6/asyncio/selector_events.py", line 724, in _read_ready
data = self._sock.recv(self.max_size)
OSError: [Errno 113] No route to host
I don't have any errors in the log file and I don't know what to do anymore. I restart Home Assistant every few hours via cron to "solve" the problem.
Solved (I hope!). The problem was my home assistant and my tradfri weren't in the same network (no firewall involved only a router). Now they are in the same network and everything is working.
I've looked into this and the following workaround has been running stably for me the last few days.
This is for the issue as I have described it above: After some time the states of tradfri lights do not update anymore, there's no error in the log.
_async_start_observe in components/light/tradfri.py dispatches an observe with duration=0.
cmd = self._group.observe(callback=self._observe_update,
                          err_callback=self._async_start_observe,
                          duration=0)
self.hass.async_add_job(self._api(cmd))
I replaced this with a 2 minute duration and queued up a repeated call to _async_start_observe after 119 seconds, like this:
cmd = self._light.observe(callback=self._observe_update,
                          err_callback=self._async_start_observe,
                          duration=120)
self.hass.async_add_job(self._api(cmd))
self.hass.loop.call_later(119, self._async_start_observe)
I would probably get away with a longer interval than 2 minutes.
My suspicion is that the gateway does not handle infinite observation durations well; after all, this feature is probably only meant to update the view of the IKEA app while it is displayed on the phone.
I am not very familiar with this code base, and I don't know how observation is handled on pytradfri's side.
So this code might well be creating more and more connections, which are never closed.
I would appreciate if someone could look over this.
Update: It seems I'm getting slowly but constantly growing CPU and memory usage, I'll try to verify if it is related to this change.
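One thing to watch in the workaround above: if _async_start_observe is also invoked through err_callback, each such call schedules an additional call_later chain, so timers could accumulate. Whether that explains the growing resource usage is unverified. The sketch below keeps at most one pending timer by cancelling the previous handle; `PeriodicReobserver` and its `start_observe` argument are hypothetical stand-ins, not Home Assistant or pytradfri code.

```python
import asyncio


class PeriodicReobserver:
    """Re-issue an observe request periodically, keeping at most one pending timer."""

    def __init__(self, loop, start_observe, interval):
        self._loop = loop
        self._start_observe = start_observe  # hypothetical: dispatches the observe command
        self._interval = interval
        self._handle = None

    def start(self):
        # Cancel a still-pending timer so repeated calls do not stack up chains.
        if self._handle is not None:
            self._handle.cancel()
        self._start_observe()
        self._handle = self._loop.call_later(self._interval, self.start)

    def stop(self):
        # Drop the pending timer so no further observe requests are issued.
        if self._handle is not None:
            self._handle.cancel()
            self._handle = None
```

With this shape, `start` can safely double as the error callback, and `stop` gives a place to clean up when the entity is removed.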
Same problem here on hass.io but not on hassbian. On hass.io, Trådfri becomes unresponsive after approximately 2 days of uptime.
Any fix for this?
@ggravlingen @lwis Does one of you have any insight into what's going on here?
It's no longer clear from this thread what the problem is, if it's in relation to the unlimited observation; I have a connection to my Gateway open 24/7 without issues.
The problem that I'm referring to is that the light's state stops updating after a few hours of operation. Also my workaround alleviates this issue, but I think it's a pretty crude hackfix.
The problem is that if the gateway just stops emitting events without closing the socket, there's not much we can do to correct it. As you mentioned, I'm hesitant about putting in any workarounds for sporadic memory leaks on the hardware. Do you have a busy setup with many lights frequently changing?
I wouldn't call it busy. Somewhere between 10 and 20 changes total a day.
I've the same issue on hass 0.69.1.
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment :+1:
I am still experiencing the issue on 0.74, and my workaround still works.
However this issue is a mess of different problems with slightly different symptoms.
My problem, the states stop updating after some time without any log output, seems to be the same as #14386, so I think it should be tracked there.
If nobody experiences the error as described by @comra, I suggest closing this issue.
Well, my problem persists but a few versions ago the error stopped showing up in the log. So, we can close it as it is not the same anymore.
@max-te I'm having the same issue here. Your workaround should work well, but I think a reset every 2 minutes is quite a lot. I would suggest extending that duration to something like an hour, which would be easier on system load but would still be enough to reset the event system when something goes wrong.
I mostly experience this issue and/or #14386 when I switch multiple lights or switches at once. And it is quite annoying having to restart HASS every time I'm going to sleep.
Should be fixed by https://github.com/home-assistant/home-assistant/pull/18708
@cgarwood #18708 was not merged and unfortunately does not fix this issue completely. Can you re-open this issue?
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment :+1:
@max-te, did you find out whether the fix you wrote is related to the increased resource use you reported?
There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue now has been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.