When using TLS for Consul connectivity, it is preferable to use low-TTL certificates with frequent renewal. Currently, renewing the Consul certificates requires restarting the Nomad client. It would be preferable if the Nomad client could reload its Consul client configuration on SIGHUP, the same way it already reloads the Nomad TLS certificates. This would be a much less intrusive operation and involve less risk.
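To make the request concrete, here is a minimal sketch of what SIGHUP-driven client certificate reloading can look like in Go, using only the standard library. It is not Nomad's actual code: `certReloader`, `reloadOnSignal`, and `newSIGHUPChannel` are hypothetical names made up for illustration, and the certificate paths would come from whatever the agent's Consul configuration points at.

```go
package main

import (
	"crypto/tls"
	"fmt"
	"log"
	"os"
	"os/signal"
	"sync"
	"syscall"
)

// certReloader is a hypothetical helper (not a Nomad type). It holds the
// current client certificate and swaps it under a lock on reload.
type certReloader struct {
	mu       sync.RWMutex
	certFile string
	keyFile  string
	cert     *tls.Certificate
}

// Reload re-reads the key pair from disk. If loading fails, the previously
// loaded certificate (if any) stays in place.
func (r *certReloader) Reload() error {
	cert, err := tls.LoadX509KeyPair(r.certFile, r.keyFile)
	if err != nil {
		return fmt.Errorf("loading key pair: %w", err)
	}
	r.mu.Lock()
	r.cert = &cert
	r.mu.Unlock()
	return nil
}

// GetClientCertificate has the signature expected by the
// tls.Config.GetClientCertificate field, so each new TLS handshake
// picks up whatever certificate was loaded most recently.
func (r *certReloader) GetClientCertificate(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if r.cert == nil {
		return nil, fmt.Errorf("no client certificate loaded")
	}
	return r.cert, nil
}

// clientTLSConfig returns a tls.Config that asks the reloader for the
// certificate on every handshake instead of pinning one at startup.
func clientTLSConfig(r *certReloader) *tls.Config {
	return &tls.Config{GetClientCertificate: r.GetClientCertificate}
}

// newSIGHUPChannel returns a channel that receives a value on each SIGHUP.
func newSIGHUPChannel() chan os.Signal {
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGHUP)
	return ch
}

// reloadOnSignal reloads the certificate each time a signal arrives and
// returns when the channel is closed. A failed reload is only logged.
func reloadOnSignal(r *certReloader, sig <-chan os.Signal) {
	for range sig {
		if err := r.Reload(); err != nil {
			log.Printf("certificate reload failed, keeping old certificate: %v", err)
			continue
		}
		log.Printf("client certificate reloaded")
	}
}
```

An agent would call `Reload()` once at startup, pass `clientTLSConfig(r)` to the HTTP transport used for Consul, and run `go reloadOnSignal(r, newSIGHUPChannel())`. Because the certificate is looked up per handshake, existing connections keep working and only new connections see the renewed certificate; a CA bundle change would need separate handling.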
I assume this only applies if you configured the Nomad agent to talk with the Consul agent over HTTPS (https://www.nomadproject.io/docs/agent/configuration/consul.html#ssl)? If you use plain HTTP (over localhost) and you've configured Consul with short-lived certificates then you'll only need to restart the Consul agent when certificate renewal occurs.
@rkettelerij you're correct.
Allowing the Consul config stanza to be reloaded would also allow refreshing the Consul ACL token, which I suspect would be a more common use case.
I'd like to see both the ACL token and the client certificates reloaded when the Nomad agent is reloaded. We are obligated to use verify_incoming on the Consul agents with short-lived certificates, and restarting the Nomad agent at every certificate and/or token renewal is painful.
I'd really like this feature as well. Nomad is really lacking in support for reloading TLS configurations. Right now you can only update the TLS configuration for the Nomad agents themselves. That doesn't help you when your entire cluster (i.e., Nomad + Vault + Consul) uses the same root CA. You end up in a situation where you might not lose your quorum, but you can't actually schedule any work. I've run into many issues that are all related to this core problem (#3247, #3746, #4413, #4593, #6052).
I've been doing a little bit of digging and it seems like the reloading logic is scattered across the agent, client, and server code. So the reloading logic is very inconsistent across the board. From what I can gather, we seem to be in this state:
The other downside to all of this is that, because Nomad has partial support for SIGHUP reloading, you'd think that you could use some combination of reloads and a full restart to refresh all of your TLS configuration. But if you don't orchestrate it right, you run into #3885. This is a major problem which really hurts the operational side of things. It's a shame, because other HashiCorp tools like Vault and consul-template already support reloading TLS configurations via SIGHUP. I know that Nomad has far more configuration to update, but honestly this has been a big problem for at least the last 3 years I've used Nomad. Rolling your own PKI with Vault and using that in your hashistack cluster should be a best practice!
I would really like to help address this problem but I think this may require some significant refactoring to enable this. Any help would be greatly appreciated.