I am still having the Cert Expiry sensor come up in an unavailable state on every restart. If I leave it alone it resets itself with the correct value in 24 hours or so. I thought this was fixed and I don't see any open issue for it currently, so not sure what I'm doing differently.
configuration.yaml
N/A, installed via integrations UI
2020-02-18 16:42:34 ERROR (MainThread) [homeassistant.helpers.entity] Update for sensor.ssl_certificate_expiry fails
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 279, in async_update_ha_state
    await self.async_device_update()
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 461, in async_device_update
    await self.hass.async_add_executor_job(self.update)
  File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/src/homeassistant/homeassistant/components/cert_expiry/sensor.py", line 122, in update
    cert = get_cert(self.server_name, self.server_port)
  File "/usr/src/homeassistant/homeassistant/components/cert_expiry/helper.py", line 12, in get_cert
    with socket.create_connection(address, timeout=TIMEOUT) as sock:
  File "/usr/local/lib/python3.7/socket.py", line 728, in create_connection
    raise err
  File "/usr/local/lib/python3.7/socket.py", line 716, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused
My guess is that the certificate expiry sensor is trying to update before HA is ready to access the domain. If I remove and reinstall the integration after HA is up and running, it works fine... until the next restart. It is consistent and happens on EVERY restart of HA.
Hey there @Cereal2nd, @jjlawren, mind taking a look at this issue, as it's been labeled with an integration (cert_expiry) you are listed as a codeowner for? Thanks!
I'm having the same problem while trying to get the expiration date from my home assistant website.
I did some digging, and, in my case, when the cert_expiry component is initialized, the home assistant website is not yet up. That's why I get "Connection refused". There is literally no server listening to the request.
I tweaked the update method in sensor.py to ignore the first error. It's really ugly, but since I know my configuration is valid... it works... Starting at line 133:
except (ssl.CertificateError, ssl.SSLError):
    self._available = True
    self._state = 0
    self._valid = True
    return
Do you know if it's possible to configure cert_expiry to start after the HA website has started?
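A less drastic alternative to swallowing the first error would be to retry when the connection is refused. Below is a minimal sketch of that idea, not the integration's actual code: `fetch_with_retry`, its parameters, and the 15-second default are all hypothetical, and `fetch` stands in for whatever call retrieves the certificate.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def fetch_with_retry(
    fetch: Callable[[], T],
    retries: int = 3,
    delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call fetch(), retrying while the connection is refused.

    Hypothetical helper: returns the fetched value, or None if every
    attempt was refused. Other exceptions propagate unchanged.
    """
    for attempt in range(retries + 1):
        try:
            return fetch()
        except ConnectionRefusedError:
            if attempt == retries:
                # Out of attempts; let the caller mark the sensor unavailable.
                return None
            # The server may simply not be listening yet; wait and try again.
            sleep(delay)
    return None
```

The `sleep` parameter is injectable only so the waiting can be skipped or recorded in tests.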
@phlet That's the use case that https://github.com/home-assistant/home-assistant/pull/27137 intended to address. What version are you running?
I am on the latest released version.
Home Assistant 0.105.5
Edit:
System health
arch | armv7l
-- | --
dev | false
docker | false
hassio | false
os_name | Linux
python_version | 3.7.3
version | 0.105.5
virtualenv | true
@jjlawren That's what I thought; however, it's still happening, and I guess my belief that it was initializing before HA was serving is correct. But it seems to be limited, or else there would be plenty of issues and "me toos" here. So I wonder if it only occurs on certain setups, like VM installs for instance.
Here are my details to compare to @phlet's:
arch | x86_64
-- | --
dev | false
docker | true
hassio | true
os_name | Linux
python_version | 3.7.6
version | 0.105.5
virtualenv | false
@phlet can you clarify what you mean by "my home assistant website"? Is this just its own HTTP UI? Or is there also a proxy or such in play?
@rpitera is your issue also for checking the cert on your HA instance or an external website?
It's just its own http ui. There's no proxy or such.
With the bypass I added, it's working.
If I remove the configuration and I add it with integration UI it will work. But as soon as home assistant core is restarted, it doesn't work anymore.
Got it. The event we wait for to determine when the HTTP service is ready must not be waiting long enough. I'll need to have a think.
It's just its own http ui. There's no proxy or such
Same here.
If I remove the configuration and I add it with integration UI it will work. But as soon as home assistant core is restarted, it doesn't work anymore.
Also the same, although if I leave it alone it will update overnight (I haven't checked but I assume it does so at 24:00). @phlet Did it update if you left it unaltered?
OK, looking at the code for this integration and the HTTP server, there's definitely a race, as they both try to init at the same time.
If you need, I'm available to test any patch or theory.
I patched my system with your 2 commits. There seems to be a small glitch.
On the first call, I get this message (that's all right!):
2020-02-19 19:23:11 ERROR (MainThread) [homeassistant.components.cert_expiry.sensor] Connection refused by server: mydomain.com, will retry in 15s
There is a second call 15 seconds later with no error. Looks good!
Unfortunately, the state value stays "unavailable" instead of showing the number of days before expiration.
I added another domain to my configuration file (google.com). From what I'm seeing, the first exception seems to block the sensor registration: I get the following registration for google.com but not for mydomain.com:
2020-02-19 19:29:55 INFO (MainThread) [homeassistant.components.sensor] Setting up sensor.cert_expiry
2020-02-19 19:29:55 INFO (MainThread) [homeassistant.helpers.entity_registry] Registered new sensor.cert_expiry entity: sensor.expiration_google
@phlet in this example, "mydomain.com" points back to HA?
yes, mydomain.com is the domain where I have HA installed.
I've got a really, really simple fix... but I don't know if we can really do that. I made the modification on my system and it works...
The fix: simply add a new event in core.py, EVENT_HOMEASSISTANT_RUNNING, that we fire right after the line
self.state = CoreState.running
With that done, in sensor.py, we replace every instance of EVENT_HOMEASSISTANT_START with EVENT_HOMEASSISTANT_RUNNING.
The problem I see: we need to change the core.py and const.py files of homeassistant. Can we do that?
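The proposal above can be pictured with a toy event bus. This is only an illustrative sketch and not Home Assistant's core: `ToyCore`, its methods, and the event name strings are invented here to show why a listener attached to a "running" event would fire after everything attached to "start" has already been set up.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Hypothetical event names, used by this sketch only.
EVENT_START = "start"
EVENT_RUNNING = "running"


class ToyCore:
    """Toy stand-in for a core object with a start-up sequence."""

    def __init__(self) -> None:
        self.state = "not_running"
        self._listeners: DefaultDict[str, List[Callable[[], None]]] = defaultdict(list)

    def listen_once(self, event: str, callback: Callable[[], None]) -> None:
        """Register a callback that runs the next time the event fires."""
        self._listeners[event].append(callback)

    def fire(self, event: str) -> None:
        """Run and discard every callback registered for the event."""
        for callback in self._listeners.pop(event, []):
            callback()

    def start(self) -> None:
        self.state = "starting"
        self.fire(EVENT_START)    # services such as an HTTP server start here
        self.state = "running"
        self.fire(EVENT_RUNNING)  # fired only once start-up has finished
```

A sensor that registers its first update on `EVENT_RUNNING` instead of `EVENT_START` would therefore run only after the start listeners have completed, which is the ordering the proposal relies on.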
Yes, but it's a pretty invasive change. I'm trying to not change core or other integrations to fix the behavior of this one.
I made a silly mistake earlier, just pushed a small commit to the linked PR. It was updating the entity during the retry, but not actually changing its state as seen by HA.
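For readers following along, the kind of mistake described here can be shown with a toy model. It is a sketch under loud assumptions: `ToySensor`, `update`, and `write_state` are invented names, and the `states` dict merely stands in for whatever the UI reads. It is not the code from the PR.

```python
from typing import Dict, Optional


class ToySensor:
    """Toy entity that keeps an internal value and publishes it on request."""

    def __init__(self, entity_id: str, states: Dict[str, str]) -> None:
        self.entity_id = entity_id
        self._states = states  # stand-in for the state store the UI reads
        self._value: Optional[int] = None

    def update(self, days_left: int) -> None:
        """Refresh the internal value only; nothing is published."""
        self._value = days_left

    def write_state(self) -> None:
        """Publish the current internal value so observers can see it."""
        self._states[self.entity_id] = (
            "unavailable" if self._value is None else str(self._value)
        )
```

Calling `update()` during a retry without a following `write_state()` leaves the published state at "unavailable" even though the entity itself holds the correct value.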
I'll try that right now.
It worked!
I even saw the state transition from "unavailable" to the number of days.
Good job!
I've had the same problem since I installed HA, probably version .87 or somewhere around there. I always assumed it just never worked and never bothered to report it. It usually ends up coming back after a day or so, I believe when it tries to update again overnight.
Running in a VM on ESXi.
I took a second shot at overhauling cert_expiry in https://github.com/home-assistant/home-assistant/pull/32066. @phlet if you're willing to give that one a try instead of the previous PR, please let me know how it works.
I'll try it tonight.
It's working and the logs are better than before. However, the behaviour seems weird.
(forget this point, I saw your answer in the pull request)
_As I wrote in the pull request, we lost the name we gave to the sensor. It's now always sensor.cert_expiry_domain instead of sensor._GIVEN_NAME_._
(forget this point, I saw your answer in the pull request, I understand it's in the device registry, I'll go read more on that...)
_When I remove one of the cert_expiry sensors from my configuration.yaml and restart HA, it is still there after the reboot. Is that normal? I don't know if it was like that before._
With a cert_expiry sensor that fails on startup, if I add a new cert_expiry sensor for another site (let's say microsoft.com), it takes about 30 seconds to 1 minute after the system is fully running before I see it in HA. All the existing cert_expiry sensors are there at the start, except the new one, which takes time. Again, I don't know if it was like that before; I may be testing a bit more than at first...
It's working and the logs are better than before. However, the behaviour seems weird.
- As I wrote in the pull request, we lost the name we gave to the sensor. It's now always sensor.cert_expiry_domain instead of sensor._GIVEN_NAME_.
That's intended.
- When I remove one of the cert_expiry sensors from my configuration.yaml and restart HA, it is still there after the reboot. Is that normal? I don't know if it was like that before.
That behavior is the same. The YAML config is used to import configs into the Integrations page.
- With a cert_expiry sensor that fails on startup, if I add a new cert_expiry sensor for another site (let's say microsoft.com), it takes about 30 seconds to 1 minute after the system is fully running before I see it in HA. All the existing cert_expiry sensors are there at the start, except the new one, which takes time. Again, I don't know if it was like that before; I may be testing a bit more than at first...
Brand new configs in the YAML are delayed by 30s on first import to ensure there aren't race conditions on boot (like checking HA's own cert). This delay can probably be safely reduced to 10s and still be reliable for all users. Creating via the Integrations page is always immediate.
This is the first time I've heard about the entity registry... I just read a bit about it and it's really nice. Sorry for my ignorance...
Knowing that, everything works as intended here. The logs are much better than before.
Great job!
No, it was a good question. It's hard to keep up with all the changes.
It's really nice to have core support for things like renaming, so that each individual integration doesn't have to manage it anymore. Even so, it's a change that tends to make users uncomfortable if they're used to the old way.
I think you'll have guessed it, but I'm one of the users that is a bit uncomfortable with this...
I always try to automate as much as I can. I had some automations that were using the custom name. Right now, with my setup, if I reinstall Home Assistant from scratch using only my YAML, the automations won't work unless I manually change the sensor name after the first reboot.
Having said that, I can understand what it can give to the user to have the core support done like that.
For me, I'll just change my automation to use the default name and everything will work as expected.
The previous default names were poor for automation. Hopefully the new entity names are easier to work with. If you do have a suggestion to improve, let me know.
I agree with you, the old default names were kind of bad... That's why I was using a custom one.
Knowing the new default name, I'm kinda happy with the change and I'm not asking for more.
Having said that, if you want to make more modifications, you could keep both behaviors by changing the title to something like this:
title = user_input.get(CONF_NAME, host + (f":{port}" if port != DEFAULT_PORT else ""))
That way, we can still override the name in configuration.yaml, and we are sure that it won't break other users' configurations if they were like me and not using the Entities editor.
You also get the better default name.
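To make the suggested fallback concrete, here is a small self-contained sketch of the same expression wrapped in a function. The function name `build_title` is hypothetical, and the values `CONF_NAME = "name"` and `DEFAULT_PORT = 443` are assumptions made for this example rather than values quoted in this thread.

```python
from typing import Any, Mapping

CONF_NAME = "name"   # assumed key for the optional user-supplied name
DEFAULT_PORT = 443   # assumed default HTTPS port


def build_title(user_input: Mapping[str, Any], host: str, port: int) -> str:
    """Return the configured name, or a default derived from host and port."""
    # The port is appended only when it differs from the default.
    default_title = host + (f":{port}" if port != DEFAULT_PORT else "")
    return user_input.get(CONF_NAME, default_title)
```

With these assumptions, `build_title({}, "example.com", 8443)` gives `"example.com:8443"`, while passing `{"name": "My cert"}` returns the custom name regardless of host and port.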
That's reasonable, I'll add it back in later today. But remember that you can't _update_ the name from the config file and that it's only used during creation.
After starting to reimplement this I'm second-guessing adding it back. I see it as a confusing option that will probably lead to more issue reports.
That way, we can still override the name in configuration.yaml, and we are sure that it won't break other users' configurations if they were like me and not using the Entities editor.
Every entity with a unique ID (like these) is already in the registry, so you're already using it today. Removing this option will not change existing entity IDs, but will give new standardized display names if you haven't set one yourself.
Once you override the display name (or entity ID) in the UI, that becomes its new permanent value. Configuring the entity name via YAML options on the integration is a legacy method that predates the registry. I think it's far better to remove the name option to reduce complexity both in the integration code and how it is documented/used.
I'm comfortable with that, and as you said earlier, the default name makes sense now.
Do you want me to update the documentation?
I would :
Sure, I'd appreciate that!