From the discussion I have seen, Nomad plans to lean heavily on Consul for service discovery and health checks.
While Consul can perform a check by running a script, the script must exist within the context of Consul. For example, if I have an HDFS cluster running on Nomad, it would be a good idea to perform a check using the hadoop command. As it stands, a copy of the hadoop binary would need to be available to Consul, which complicates deployment: what if we update the HDFS containers to a new version and need to deploy new hadoop binaries to all Consul agents?
It would be really awesome if, through some sort of bridge, Consul could run the check within the Docker container or task. For example, if we could ask Consul to run a command using the hadoop binary inside the container, it would make things much simpler.
I know there's also the possibility of creating a TTL health check by having a process inside the container run the checks and send the results to Consul over HTTP, but this creates more moving parts and can become complex quickly.
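To make the TTL alternative concrete, here is a minimal Python sketch of the "process inside the container" approach. It assumes a TTL check has already been registered with the local Consul agent, that the agent listens on the default `127.0.0.1:8500`, and that the agent accepts `PUT` on its `/v1/agent/check/pass/<id>` and `/v1/agent/check/fail/<id>` endpoints (older agents may expect `GET`). The check ID `hdfs-ttl` and the helper names are hypothetical.

```python
import subprocess
import urllib.parse
import urllib.request

# Assumption: the Consul agent is reachable on its default local address.
CONSUL_AGENT = "http://127.0.0.1:8500"


def build_ttl_update(check_id, passed, note=""):
    """Build the HTTP request that marks a TTL check as passing or failing."""
    state = "pass" if passed else "fail"
    url = "{}/v1/agent/check/{}/{}".format(
        CONSUL_AGENT, state, urllib.parse.quote(check_id, safe="")
    )
    if note:
        # The optional note is attached to the check status in Consul.
        url += "?" + urllib.parse.urlencode({"note": note})
    return urllib.request.Request(url, method="PUT")


def run_and_report(command, check_id):
    """Run a check command locally and report its result to Consul."""
    # Exit code 0 is treated as healthy, anything else as failing.
    result = subprocess.run(command)
    request = build_ttl_update(check_id, result.returncode == 0)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


# Hypothetical usage inside the container, called on a timer shorter than the TTL:
# run_and_report(["hadoop", "fs", "-ls", "/"], "hdfs-ttl")
```

This illustrates the extra moving parts mentioned above: something has to schedule the script, and the container needs network access to the agent.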
@F21 We are currently adding support in Consul for running a script inside a Docker container for health checks. When Nomad registers a Task with Consul, if the user has defined a health check in the job, we will pass that information to Consul too, and Consul will use it to run the check scripts inside the Docker container.
If a user is running tasks with the exec or raw exec drivers, then Consul could do health checks by running the scripts directly on the host.
If the user is using Qemu to run KVM or Xen based workloads, then Consul would support only HTTP, TCP, or TTL health checks, since Consul won't be able to execute a script within a VM.
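The plan described above can be summarized as a small lookup table. This Python sketch is only an illustration of the reasoning, not Nomad or Consul code; the dictionary, the function name, and the assumption that HTTP, TCP, and TTL checks work for every driver are mine.

```python
# Where a script check would run for each driver, per the plan above.
# None means a script check is not possible for that driver.
SCRIPT_CHECK_LOCATION = {
    "docker": "container",  # script executed inside the Docker container
    "exec": "host",         # script executed directly on the host
    "raw_exec": "host",
    "qemu": None,           # Consul cannot execute a script inside a VM
}

# Assumption: these check types need no script execution, so any driver can use them.
NETWORK_CHECKS = {"http", "tcp", "ttl"}


def check_allowed(driver, check_type):
    """Return True if the given check type could work for the given driver."""
    if driver not in SCRIPT_CHECK_LOCATION:
        raise ValueError("unknown driver: {}".format(driver))
    if check_type == "script":
        return SCRIPT_CHECK_LOCATION[driver] is not None
    return check_type in NETWORK_CHECKS
```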
Awesome! In the case of Consul running a script directly on the host for exec or raw exec jobs, I assume it's doing so using the Consul remote execution feature, right?
@F21 I think we can avoid remote execution for health checks of tasks running via exec and raw exec on Nomad by running the Consul agent as a system task on the node in raw exec mode. That way Consul would be able to run any script on the host or in any chroot.
That sounds like the way to do it! I hope Consul support is added to Nomad soon; I would love to give that a spin!
Hi @diptanu,
I see that the Docker check has been added in the Consul 0.6 release.
According to the docs (https://consul.io/docs/agent/checks.html):
{
  "check": {
    "id": "mem-util",
    "name": "Memory utilization",
    "docker_container_id": "f972c95ebf0e",
    "shell": "/bin/bash",
    "script": "/usr/local/bin/check_mem.py",
    "interval": "10s"
  }
}
How can I pass the container ID to the service check from a Nomad job file? Is there a Nomad variable that already exposes the container_id or container_name value?
thanks!
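For reference, the check definition quoted above could be generated programmatically once the container ID is known. This Python sketch only fills in the fields shown in the Consul docs snippet; the function name is hypothetical, and how the container ID would be obtained from Nomad is exactly the open question here.

```python
import json


def docker_script_check(check_id, name, container_id, script,
                        shell="/bin/bash", interval="10s"):
    """Build a Consul Docker check definition like the one in the docs snippet."""
    return {
        "check": {
            "id": check_id,
            "name": name,
            "docker_container_id": container_id,
            "shell": shell,
            "script": script,
            "interval": interval,
        }
    }


# Reproduces the example from the docs with the container ID passed in as a value.
definition = docker_script_check(
    "mem-util", "Memory utilization", "f972c95ebf0e",
    "/usr/local/bin/check_mem.py",
)
print(json.dumps(definition, indent=2))
```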
@poll0rz The only check types supported in 0.2.1 were http and tcp. I will try to integrate the script type check for 0.3.
@diptanu great, thanks!
> I will try to integrate the script type check for 0.3
Wow, bring it on!
Hi @diptanu, this wasn't included in 0.3 in the end, right?
I have microservices that don't listen on a port; they just consume Kafka messages and write to Cassandra. For now, I register them with a 'dummy port', so they're always marked as critical in Consul.
thanks!! great work
@adrianlop This is slotted for 0.3.1! We are finally getting around to working on this. It needs to be done across all our drivers, so it is taking some time.
Fixed via #986
@adrianlop @F21 @pires This is done now and will go out with Nomad 0.3.2.