Output from nomad version:
(latest master)
Nomad v0.9.2-dev (50fc86ad26a3bf4caac829b1f9a1c8fbd8cb2a6c)
/tmp :: go version
go version go1.12.5 linux/amd64
/tmp :: uname -a
Linux mediocre-desktop 5.1.6-arch1-1-ARCH #1 SMP PREEMPT Fri May 31 15:17:53 UTC 2019 x86_64 GNU/Linux
I'm looking into Nomad and learning more about it, and was trying to follow the steps outlined here. When I do, the job comes up unhealthy, and I'm not sure how to diagnose this further. Any help would be appreciated!
I tested this with 0.9.0, 0.9.1, and current master and got the same results each time.
/tmp :: sudo nomad agent -dev 2>&1 >/tmp/nomad.log &
[1] 14862
/tmp :: nomad job status
No running jobs
/tmp :: nomad job init
Example job file written to example.nomad
/tmp :: nomad job run example.nomad
==> Monitoring evaluation "3eab3657"
Evaluation triggered by job "example"
Allocation "0dac812f" created: node "bd3e4788", group "cache"
Evaluation within deployment: "9d2b4751"
Allocation "0dac812f" status changed: "pending" -> "running" (Tasks are running)
Evaluation status changed: "pending" -> "complete"
==> Evaluation "3eab3657" finished with status "complete"
#####################################################################
/tmp :: nomad job status example
ID = example
Name = example
Submit Date = 2019-06-04T21:37:40-04:00
Type = service
Priority = 50
Datacenters = dc1
Status = running
Periodic = false
Parameterized = false
Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
cache       0       0         1        0       0         0
Latest Deployment
ID = 9d2b4751
Status = running
Description = Deployment is running
Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
cache       1        1       0        0          2019-06-04T21:47:40-04:00
Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created    Modified
0dac812f  bd3e4788  cache       0        run      running  1m29s ago  1m29s ago
/tmp :: nomad alloc status 0dac812f
ID = 0dac812f
Eval ID = 3eab3657
Name = example.cache[0]
Node ID = bd3e4788
Node Name = mediocre-desktop
Job ID = example
Job Version = 0
Client Status = running
Client Description = Tasks are running
Desired Status = run
Desired Description = <none>
Created = 1m43s ago
Modified = 1m43s ago
Deployment ID = 9d2b4751
Deployment Health = unset
Task "redis" is "running"
Task Resources
CPU        Memory           Disk     Addresses
1/500 MHz  1.1 MiB/256 MiB  300 MiB  db: 127.0.0.1:21802
Task Events:
Started At = 2019-06-05T01:37:41Z
Finished At = N/A
Total Restarts = 0
Last Restart = N/A
Recent Events:
Time                       Type        Description
2019-06-04T21:37:41-04:00  Started     Task started by client
2019-06-04T21:37:40-04:00  Task Setup  Building Task Directory
2019-06-04T21:37:40-04:00  Received    Task received by client
#####################################################################
# Wait 1 minute
#####################################################################
/tmp :: nomad alloc status 0dac812f
ID = 0dac812f
Eval ID = 3eab3657
Name = example.cache[0]
Node ID = bd3e4788
Node Name = mediocre-desktop
Job ID = example
Job Version = 0
Client Status = running
Client Description = Tasks are running
Desired Status = run
Desired Description = <none>
Created = 3m36s ago
Modified = 36s ago
Deployment ID = 9d2b4751
Deployment Health = unhealthy
Task "redis" is "running"
Task Resources
CPU        Memory           Disk     Addresses
1/500 MHz  1.1 MiB/256 MiB  300 MiB  db: 127.0.0.1:21802
Task Events:
Started At = 2019-06-05T01:37:41Z
Finished At = N/A
Total Restarts = 0
Last Restart = N/A
Recent Events:
Time                       Type             Description
2019-06-04T21:40:40-04:00  Alloc Unhealthy  Task not running for min_healthy_time of 10s by deadline
2019-06-04T21:37:41-04:00  Started          Task started by client
2019-06-04T21:37:40-04:00  Task Setup       Building Task Directory
2019-06-04T21:37:40-04:00  Received         Task received by client
Job file: the one returned from nomad job init.
Client/server logs here
@mediocregopher the example job has a service stanza that registers a health check with Consul (https://www.consul.io/). You may not have Consul set up. Try commenting that stanza out and rerunning the job.
We'll consider simplifying the example job in the future to avoid the dependency on Consul, thanks for reporting.
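For reference, a rough sketch of what that change might look like. It assumes the generated example.nomad contains a service stanza inside the "redis" task similar to the one below; the service name, tags, and check values are assumptions based on the default example job, so check them against your own file.

```hcl
# Inside the "redis" task of example.nomad.
# The stanza is commented out so the deployment no longer waits on a
# Consul health check. Names and values here are assumed, not confirmed.
#
# service {
#   name = "redis-cache"
#   tags = ["global", "cache"]
#   port = "db"
#
#   check {
#     name     = "alive"
#     type     = "tcp"
#     interval = "10s"
#     timeout  = "2s"
#   }
# }
```

After saving the file, rerun nomad job run example.nomad. With no checks to wait on, the allocation should be marked healthy once the task has stayed running for min_healthy_time. If you would rather keep the service stanza, setting health_check = "task_states" in the job's update stanza may also work, since it bases deployment health on task state rather than on Consul checks.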
Ah ok, that makes sense. The documentation I linked doesn't mention Consul at all, as far as I could tell. Thanks for the help!