Nomad: task health check to kill & replace

Created on 30 Sep 2015 · 9 comments · Source: hashicorp/nomad

It would be great to be able to register a task health check in a job definition (a la Consul) that, optionally, could be leveraged by Nomad to kill and replace a failed task.

theme/discovery type/enhancement

All 9 comments

This will be included in our Consul integration (probably in 0.2).

This would be really nice to have. Are there any updates @cbednarski?

@cbednarski is this already implemented?

@abhidrona No, it is not! Nomad registers services and checks, but it does not use the state of those checks to restart tasks.
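To make the requested behavior concrete, here is a minimal sketch of the decision logic such a feature would need: count consecutive failing check results per task and signal a replacement once a threshold is reached. It is not Nomad's implementation. The `RestartPolicy` class, its `limit` parameter, and the task ID `web-1` are hypothetical names chosen for illustration; the only thing taken from Consul is its set of check statuses (`passing`, `warning`, `critical`).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class RestartPolicy:
    """Hypothetical policy: replace a task after `limit` consecutive critical checks."""

    limit: int = 3
    # Consecutive critical results seen so far, keyed by task ID.
    failures: Dict[str, int] = field(default_factory=dict, init=False)

    def observe(self, task_id: str, status: str) -> bool:
        """Record one check result and return True if the task should be replaced."""
        if status == "critical":
            self.failures[task_id] = self.failures.get(task_id, 0) + 1
        else:
            # Any non-critical result ("passing" or "warning") breaks the streak.
            self.failures[task_id] = 0

        if self.failures[task_id] >= self.limit:
            # Reset so the replacement task starts with a clean count.
            self.failures[task_id] = 0
            return True
        return False


policy = RestartPolicy(limit=2)
decisions = [
    policy.observe("web-1", status)
    for status in ["passing", "critical", "critical", "critical"]
]
print(decisions)  # [False, False, True, False]
```

Requiring consecutive failures rather than a single one is a design choice in this sketch, so that one flapping check does not trigger a restart. A real scheduler would also need a grace period after task start before counting failures.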

I'm not sure what the priority is for this, but IMHO it should be of highest priority.
Using service health checks but no task health checks, Nomad creates a zombie fleet of tasks.
What exactly is the point of having unhealthy jobs non-discoverable but still running?

This is also one of the very few issues that keep us from switching over from Mesos to Nomad. (We run Nomad in our staging environment, but can't afford to manually restart tasks in production.)

Closing in favor of https://github.com/hashicorp/nomad/issues/876 given more people have seen that one.

@dadgar this is #164

Fixed reference! My bad!
