Nomad: [feature] allow reload of Consul config stanza and Consul client

Created on 17 Aug 2018 · 5 Comments · Source: hashicorp/nomad

When using TLS for Consul connectivity, it is preferable to use low-TTL certificates with frequent renewal. Currently, renewing the Consul certificates requires restarting the Nomad client. It would be preferable if the Nomad client could reload the Consul client/config on SIGHUP, the same way it reloads the Nomad TLS certificates. This would be a much less intrusive operation and involve less risk.
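
To make the request concrete, here is a minimal sketch of the general technique in Go: a SIGHUP handler re-reads the client certificate from the same paths, and `tls.Config.GetClientCertificate` serves whatever was loaded last. The `certReloader` type and the file names are hypothetical and are not taken from Nomad's code; this only shows the standard-library pattern, not how Nomad is (or would be) implemented.

```go
package main

import (
	"crypto/tls"
	"errors"
	"log"
	"os"
	"os/signal"
	"sync"
	"syscall"
)

// certReloader is a hypothetical helper (not part of Nomad) that keeps the
// most recently loaded client certificate in memory.
type certReloader struct {
	mu       sync.RWMutex
	certFile string
	keyFile  string
	cert     *tls.Certificate
}

// reload re-reads the key pair from the same paths on disk. On failure the
// previously loaded certificate, if any, stays in place.
func (r *certReloader) reload() error {
	cert, err := tls.LoadX509KeyPair(r.certFile, r.keyFile)
	if err != nil {
		return err
	}
	r.mu.Lock()
	r.cert = &cert
	r.mu.Unlock()
	return nil
}

// getClientCertificate has the signature of the tls.Config.GetClientCertificate
// hook, so every new TLS handshake picks up the latest certificate.
func (r *certReloader) getClientCertificate(*tls.CertificateRequestInfo) (*tls.Certificate, error) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	if r.cert == nil {
		return nil, errors.New("no client certificate loaded")
	}
	return r.cert, nil
}

func main() {
	// File names are placeholders for illustration only.
	r := &certReloader{certFile: "consul-client.pem", keyFile: "consul-client-key.pem"}
	if err := r.reload(); err != nil {
		log.Fatalf("initial certificate load failed: %v", err)
	}

	// This config would be handed to the HTTP transport used to reach Consul.
	tlsConfig := &tls.Config{GetClientCertificate: r.getClientCertificate}
	_ = tlsConfig

	hup := make(chan os.Signal, 1)
	signal.Notify(hup, syscall.SIGHUP)
	for range hup {
		if err := r.reload(); err != nil {
			log.Printf("reload failed, keeping previous certificate: %v", err)
		}
	}
}
```

Because the hook is consulted per handshake, existing connections keep their old certificate until they are re-established; only new connections see the renewed one.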

theme/consul theme/tls type/enhancement

All 5 comments

I assume this only applies if you configured the Nomad agent to talk with the Consul agent over HTTPS (https://www.nomadproject.io/docs/agent/configuration/consul.html#ssl)? If you use plain HTTP (over localhost) and you've configured Consul with short-lived certificates then you'll only need to restart the Consul agent when certificate renewal occurs.

@rkettelerij you're correct.

Allowing the Consul config stanza to be reloaded would also allow refreshing the Consul ACL token, which I suspect would be a more common use case.

I'd like to see both the ACL token and the client certificates reloaded when the Nomad agent is reloaded. We are obligated to use verify_incoming on the Consul agents with short-lived certificates, and restarting the Nomad agent at every cert and/or token renewal is painful.

I'd really like this feature as well. Nomad is really lacking in support for reloading TLS configurations. Right now you can only update the TLS configuration for the Nomad agents themselves. That doesn't help you when your entire cluster (i.e., Nomad + Vault + Consul) uses the same root CA. You end up in a situation where you might not lose your quorum, but you can't actually schedule any work. I've run into many issues that are all related to this core problem (#3247, #3746, #4413, #4593, #6052).

I've been doing a little bit of digging and it seems like the reloading logic is scattered across the agent, client, and server code. So the reloading logic is very inconsistent across the board. From what I can gather, we seem to be in this state:

Agent

  • Reload TLS configuration? YES
  • Reload Vault configuration? N/A - Vault client is tied to servers + clients, not agent.
  • Reload Consul configuration? NO. Not supported.

Server

  • Reload TLS configuration? YES
  • Reload Vault configuration? NO* - Servers will reload ONLY if you change the path to your CAFile, CertFile, or KeyFile. If you write a new cert to the same path, it does not reload. I opened #6677 to take a stab at fixing it. I tried testing with it, but I'm still running into Vault integration issues, which makes me feel like there's more that is missing.
  • Reload Consul configuration? N/A - Consul client is tied to agent, not server.

Clients

  • Reload TLS configuration? YES
  • Reload Vault configuration? NO. Not supported. The client's Vault integration seems to be quite different from the servers', so it would probably take some refactoring to support this.
  • Reload Consul configuration? N/A - Consul client is tied to agent, not client.

The other downside to all of this is that because Nomad has partial support for SIGHUP reloading, you'd think that you could use some combination of reloads + a full restart to refresh all of your TLS configuration - but if you don't orchestrate it right, you run into #3885. This is a major problem which really hurts the operational side of things. It's a shame because other HashiCorp tools like Vault and consul-template already support reloading TLS configurations via SIGHUP. I know that Nomad has far more configuration to update, but honestly this has been a big problem for at least the last 3 years I've used Nomad. Rolling your own PKI with Vault and using that in your hashistack cluster should be a best practice!

I would really like to help address this problem, but I think it may require some significant refactoring. Any help would be greatly appreciated.
