Core: Monitoring Ubiquiti network devices with UniFi integration unstable since 0.110.x

Created on 23 May 2020 · 15 Comments · Source: home-assistant/core

The problem

Since upgrading to Home Assistant 0.110, the UniFi device tracker integration reports Ubiquiti network devices going away regularly.

Environment

  • Home Assistant Core release with the issue: 0.110.1
  • Last working Home Assistant Core release (if known): 0.109.6
  • Operating environment (Home Assistant/Supervised/Docker/venv): Docker
  • Integration causing this issue: UniFi
  • Link to integration documentation on our website: https://www.home-assistant.io/integrations/unifi/

Problem-relevant configuration.yaml

none

Traceback/Error logs

no errors in the log

Additional information

Here is an illustration of what's been happening the last few days. The green vertical line indicates (by approximation) when I upgraded HA:

[Image: screenshot, 2020-05-23 08:07:38]

The 6 Ubiquiti network devices go away and return home frequently, irregularly and independently, while 2 Raspberry Pis, 2 smartphones and a tablet show no such issues.

I believe the issue is with the integration itself, and not the controller or the devices:

  • The UniFi controller itself does not report any disconnects on any of the 6 network devices (no alerts or events).
  • The wired and wireless devices connected to the 6 network devices do not experience any actual connection issues (nor does HA report them as going away when they are home).

Good to know:

  • I'm running my UniFi controller as a Docker container using this image on the same host as my HA container.
  • The UniFi controller was updated to the latest version (5.12.72) when I updated HA from 0.109.6 to 0.110.1.
  • Updating the firmware of all network devices (USG, 4 switches and AP) did not resolve the issue.

All 15 comments

Hey there @Kane610, mind taking a look at this issue, as it's been labeled with an integration (unifi) you are listed as a codeowner for? Thanks!
(message by CodeOwnersMention)

Enable debugging and share logs please

Working on that now with the following config:

logger:
  default: critical
  logs:
    aiounifi: debug
    homeassistant.components.unifi: debug
    homeassistant.components.device_tracker.unifi: debug
    homeassistant.components.switch.unifi: debug

Any suggestions on how to share the logs? This is generating A LOT of lines. Or am I looking for something specific?

UPDATE: I collected logs (14k+ lines) for about an hour and 15 minutes, during which the following events occurred:

  • 01:26: device_tracker.router went away, and came home 4 seconds later
  • 01:40: device_tracker.router went away, and came home 9 seconds later
  • 01:45: device_tracker.router went away, and came home 10 seconds later
  • 01:58: device_tracker.router went away, and came home 5 seconds later
  • 02:16: device_tracker.router went away, and came home 7 seconds later
  • 02:22: device_tracker.switch_livingroom went away, and came home 6 seconds later
  • 02:24: device_tracker.router went away, and came home 4 seconds later

At 02:24, I had a ping going from my NUC (my Docker host for HA, UniFi and a bunch of other services) to my router. Not a single packet was lost:

158 packets transmitted, 158 received, 0% packet loss, time 749ms
rtt min/avg/max/mdev = 0.268/0.406/1.227/0.134 ms

My UniFi controller also has no alerts or events for any of these "disconnections". In short: HA sees my Ubiquiti network devices disconnect frequently while no actual network disruptions take place, and the UniFi controller is not reporting any problems. Any device connected to my Ubiquiti network and tracked by HA is not affected.

I have the exact same issues! Some things of note for me which are maybe slightly different.

It is _only_ the APs that are flapping home/not_home. I have 4. It started as soon as I upgraded to 110.1 (from 109) and still does it on 110.2.

I was able to 'resolve' it the first time, by restarting the controller on my server. I have a FreeBSD server that runs the controller software (same network, etc. All very close) which is independent of HASS. The server is perfectly fine, no issue with it. Everything runs as expected. When I check the UI, they are permanently available.

After another restart of HASS, only 2 APs were doing it.... I then upgraded to 110.2, so another restart and now only one AP is doing it, so it is very much all over the place.

edit: just a note that it seems FreeBSD12 only has the Unifi Controller version 5.12.66.0 currently available.

Same issue here, running 0.110.2 Supervised on Debian. The controller is 5.12.72 (Build: atag_5.12.72_13103), running on a Cloud Key Gen2 Plus (which shows no signs of problems).
5 switches and 2 access points are flapping.

Only happens to Ubiquiti equipment, all clients reporting normal
Last known working was 0.109.6

The home/away logic for UniFi devices works like this: each message from UniFi states when to expect the next message. The integration then adds 10 seconds on top of that to allow for any system load, but that appears not to be enough.

Could you enable debug logging and verify from the time strings that this is what's happening? Maybe I should change it to 30 seconds or so.

https://github.com/home-assistant/core/blob/59fe5458d0466c4d6de8ea7f94e6a668690c8f8f/homeassistant/components/unifi/device_tracker.py#L295-L305
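To make the described logic concrete, here is a minimal sketch of a heartbeat deadline with a safety margin. It is not the integration's actual code: the class `HeartbeatTracker`, its methods, and the `margin_seconds` parameter are hypothetical names chosen for illustration, and the only behavior taken from the thread is "expected time of next message plus 10 seconds".

```python
from datetime import datetime, timedelta
from typing import Optional


class HeartbeatTracker:
    """Toy model (hypothetical): a device counts as home until its next message is overdue."""

    def __init__(self, margin_seconds: float = 10.0) -> None:
        # Extra slack added on top of the interval announced by the controller.
        self.margin = timedelta(seconds=margin_seconds)
        self.deadline: Optional[datetime] = None

    def on_message(self, now: datetime, next_interval_seconds: float) -> None:
        # Each message says when to expect the next one; push the deadline out.
        self.deadline = now + timedelta(seconds=next_interval_seconds) + self.margin

    def is_home(self, now: datetime) -> bool:
        # No message seen yet, or deadline passed: treat the device as away.
        return self.deadline is not None and now <= self.deadline


t0 = datetime(2020, 5, 25, 11, 25, 0)
tracker = HeartbeatTracker(margin_seconds=10)
tracker.on_message(t0, next_interval_seconds=30)
print(tracker.is_home(t0 + timedelta(seconds=41)))  # message 11 s late: away
```

In this model a larger margin means fewer false "away" states when messages arrive late, at the cost of reacting more slowly when a device really does go offline.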

The debug log (with logger settings from fanaticDavid above) when this happened:

May 25, 2020
11:25:16 AM switch_closet is at home
11:25:11 AM switch_closet is away

debug2.log
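For anyone who wants to measure how long each flap lasts, the following is a rough sketch that pairs "away" and "at home" lines in the logbook format shown above. The function name `away_durations` and the regular expression are assumptions made for this example; it only handles the exact "<time> <entity> is away / is at home" wording and does not handle flaps that cross midnight.

```python
import re
from datetime import datetime

# Matches lines such as "11:25:11 AM switch_closet is away".
LINE_RE = re.compile(r"^(\d{1,2}:\d{2}:\d{2} [AP]M) (\S+) is (at home|away)$")


def away_durations(lines):
    """Return (entity, seconds_away) for each away -> at home pair."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%I:%M:%S %p")
            events.append((stamp, match.group(2), match.group(3)))
    events.sort()  # the logbook lists newest first; process oldest first
    went_away = {}
    result = []
    for stamp, entity, state in events:
        if state == "away":
            went_away[entity] = stamp
        elif entity in went_away:
            delta = stamp - went_away.pop(entity)
            result.append((entity, delta.total_seconds()))
    return result


sample = [
    "11:25:16 AM switch_closet is at home",
    "11:25:11 AM switch_closet is away",
]
print(away_durations(sample))  # [('switch_closet', 5.0)]
```

Durations that cluster just above the margin would be consistent with the late-message explanation, though the sketch itself cannot confirm the cause.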

Can you make that margin configurable? I wish I had seen this 10 minutes earlier; I would have updated the file and tested it for you. :0 Maybe tomorrow. Thanks

I have the same issue with the UniFi Controller add-on.

I changed line 304 to add 30 seconds instead of 10, and then I restarted my docker container. In an hour or so I should be able to tell the difference, if any.

UPDATE: This is what happened during about 2 hours after making the change:

  • 01:53: device_tracker.switch_office went away, and came back 9 seconds later
  • 01:53: device_tracker.switch_utilityroom went away, and came back 3 seconds later
  • 02:55: device_tracker.switch_utilityroom went away and came back within the same second

So it doesn't get rid of the problem entirely, even though I expected it to. However, it is still a significant improvement over the old situation. The entity for my USG, device_tracker.router, hasn't flapped even once in 2+ hours, and that was by far the worst-affected network device before making the change.

It resolved it for me, but I wonder about the implications.

Upping it to 30 (which I did about 25 hours ago) has definitely improved the situation for me, but it has not resolved it. All of my network devices still flap away/home at least a few times a day. Before HA 0.110.x, it was rock solid.

Yeah, I hear you. It feels like a workaround rather than a fix.

It's strange, but in my case, since yesterday at around 9pm CST it has become stable with the 2 items I am monitoring...

In the same boat here on 0.110.3 with 5 switches, 2 APs and a USG all bouncing between home and not_home. I'm seeing at least one device drop and return every minute or so.
All are assigned fixed IPs outside DHCP range and no events or interruptions showing in the controller logs.
Currently trying the code modification and a controller restart.
Edit: it looks like the code change has decreased the frequency of bouncing slightly, but it is still occurring every 3 to 4 minutes.
