Nomad v0.12.3
(I could replicate it in all versions after v0.12.1, but not on v0.12.0)
macOS v10.15.5
I'm simply running the agent in -dev mode to reproduce it.
This seems to happen only on periodic jobs.
After a job is normally run or forced, the job children summary ends up with 1 pending and 1 dead.
It seems to initially increase the pending count to 2, then change the count to 1 pending and 1 running, and eventually to 1 pending and 1 dead.
Keep in mind: I'm only running the job once and, in fact, only one job is launched.
If I let it run again, the same thing happens, except I end up with 2 pending and 2 dead (forever).
ID = test
Name = test
Submit Date = 2020-08-19T15:27:10+01:00
Type = batch
Priority = 50
Datacenters = dc1
Namespace = default
Status = running
Periodic = true
Parameterized = false
Next Periodic Launch = 2020-08-19T15:07:00Z (34m45s from now)
Children Job Summary
Pending Running Dead
2 0 2
Previously Launched Jobs
ID Status
test/periodic-1597847482 dead
test/periodic-1597847530 dead
Of course, I would expect it to go from 1 pending to 1 running, end up as 1 dead, and so on.
For each new job launch, there's always one extra pending in the summary.
Steps to reproduce:
1. nomad agent -dev
2. nomad job run test.nomad (which schedules the job to eventually run)
3. nomad job periodic force test
4. nomad job status test to check the above output (or similar) with X pending and X dead

ID = test
Name = test
Submit Date = 2020-08-19T15:41:15+01:00
Type = batch
Priority = 50
Datacenters = dc1
Namespace = default
Status = running
Periodic = true
Parameterized = false
Next Periodic Launch = 2020-08-19T14:42:00Z (31s from now)
Children Job Summary
Pending Running Dead
1 0 1
Previously Launched Jobs
ID Status
test/periodic-1597848080 dead
Job file from the example above (test.nomad):
job "test" {
  datacenters = ["dc1"]
  type        = "batch"

  periodic {
    cron             = "* * * * *"
    prohibit_overlap = true
  }

  group "group" {
    task "task" {
      driver = "docker"

      config {
        image   = "alpine:latest"
        command = "ls"
      }
    }
  }
}
==> No configuration files loaded
==> Starting Nomad agent...
==> Nomad agent configuration:
Advertise Addrs: HTTP: 127.0.0.1:4646; RPC: 127.0.0.1:4647; Serf: 127.0.0.1:4648
Bind Addrs: HTTP: 127.0.0.1:4646; RPC: 127.0.0.1:4647; Serf: 127.0.0.1:4648
Client: true
Log Level: DEBUG
Region: global (DC: dc1)
Server: true
Version: 0.12.3
==> Nomad agent started! Log data will stream in below:
2020-08-19T15:40:55.729+0100 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=
2020-08-19T15:40:55.729+0100 [DEBUG] agent.plugin_loader.docker: using client connection initialized from environment: plugin_dir=
2020-08-19T15:40:55.729+0100 [INFO] agent: detected plugin: name=docker type=driver plugin_version=0.1.0
2020-08-19T15:40:55.729+0100 [INFO] agent: detected plugin: name=mock_driver type=driver plugin_version=0.1.0
2020-08-19T15:40:55.729+0100 [INFO] agent: detected plugin: name=raw_exec type=driver plugin_version=0.1.0
2020-08-19T15:40:55.729+0100 [INFO] agent: detected plugin: name=exec type=driver plugin_version=0.1.0
2020-08-19T15:40:55.729+0100 [INFO] agent: detected plugin: name=qemu type=driver plugin_version=0.1.0
2020-08-19T15:40:55.729+0100 [INFO] agent: detected plugin: name=java type=driver plugin_version=0.1.0
2020-08-19T15:40:55.730+0100 [INFO] nomad.raft: initial configuration: index=1 servers="[{Suffrage:Voter ID:127.0.0.1:4647 Address:127.0.0.1:4647}]"
2020-08-19T15:40:55.730+0100 [INFO] nomad.raft: entering follower state: follower="Node at 127.0.0.1:4647 [Follower]" leader=
2020-08-19T15:40:55.730+0100 [INFO] nomad: serf: EventMemberJoin: Brunos-MacBook-Pro.local.global 127.0.0.1
2020-08-19T15:40:55.730+0100 [INFO] nomad: starting scheduling worker(s): num_workers=16 schedulers=[system, service, batch, _core]
2020-08-19T15:40:55.730+0100 [INFO] nomad: adding server: server="Brunos-MacBook-Pro.local.global (Addr: 127.0.0.1:4647) (DC: dc1)"
2020-08-19T15:40:55.730+0100 [INFO] client: using state directory: state_dir=/private/var/folders/6y/sbdf048s6tj5bb4l6swdqcm80000gn/T/NomadClient168750967
2020-08-19T15:40:55.730+0100 [INFO] client: using alloc directory: alloc_dir=/private/var/folders/6y/sbdf048s6tj5bb4l6swdqcm80000gn/T/NomadClient025092714
2020-08-19T15:40:55.758+0100 [DEBUG] client.fingerprint_mgr: built-in fingerprints: fingerprinters=[arch, cni, consul, cpu, host, memory, network, nomad, signal, storage, vault, env_aws, env_gce]
2020-08-19T15:40:55.758+0100 [DEBUG] client.fingerprint_mgr: CNI config dir is not set or does not exist, skipping: cni_config_dir=
2020-08-19T15:40:55.758+0100 [DEBUG] client.fingerprint_mgr: fingerprinting periodically: fingerprinter=consul period=15s
2020-08-19T15:40:55.758+0100 [DEBUG] client.fingerprint_mgr.cpu: detected cpu frequency: MHz=2300
2020-08-19T15:40:55.758+0100 [DEBUG] client.fingerprint_mgr.cpu: detected core count: cores=16
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected and no speed specified by user, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: detected interface IP: interface=lo0 IP=127.0.0.1
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: detected interface IP: interface=lo0 IP=::1
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:41:01.850+0100 [DEBUG] client: updated allocations: index=1 total=0 pulled=0 filtered=0
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.785+0100 [DEBUG] client.fingerprint_mgr.network: link speed could not be detected, falling back to default speed: mbits=1000
2020-08-19T15:40:55.788+0100 [DEBUG] client.fingerprint_mgr: fingerprinting periodically: fingerprinter=vault period=15s
2020-08-19T15:40:57.134+0100 [WARN] nomad.raft: heartbeat timeout reached, starting election: last-leader=
2020-08-19T15:40:57.134+0100 [INFO] nomad.raft: entering candidate state: node="Node at 127.0.0.1:4647 [Candidate]" term=2
2020-08-19T15:40:57.134+0100 [DEBUG] nomad.raft: votes: needed=1
2020-08-19T15:40:57.134+0100 [DEBUG] nomad.raft: vote granted: from=127.0.0.1:4647 term=2 tally=1
2020-08-19T15:40:57.134+0100 [INFO] nomad.raft: election won: tally=1
2020-08-19T15:40:57.134+0100 [INFO] nomad.raft: entering leader state: leader="Node at 127.0.0.1:4647 [Leader]"
2020-08-19T15:40:57.134+0100 [INFO] nomad: cluster leadership acquired
2020-08-19T15:40:57.136+0100 [INFO] nomad.core: established cluster id: cluster_id=47606d41-60db-9b7a-f198-fc05a7b07d70 create_time=1597848057135864000
2020-08-19T15:41:01.850+0100 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=0
2020-08-19T15:41:01.789+0100 [DEBUG] client.fingerprint_mgr.env_gce: could not read value for attribute: attribute=machine-type error="Get "http://169.254.169.254/computeMetadata/v1/instance/machine-type": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
2020-08-19T15:41:01.789+0100 [DEBUG] client.fingerprint_mgr.env_gce: error querying GCE Metadata URL, skipping
2020-08-19T15:41:01.789+0100 [DEBUG] client.fingerprint_mgr: detected fingerprints: node_attrs=[arch, cpu, host, network, nomad, signal, storage]
2020-08-19T15:41:01.789+0100 [INFO] client.plugin: starting plugin manager: plugin-type=csi
2020-08-19T15:41:01.789+0100 [INFO] client.plugin: starting plugin manager: plugin-type=driver
2020-08-19T15:41:01.790+0100 [INFO] client.plugin: starting plugin manager: plugin-type=device
2020-08-19T15:41:01.790+0100 [DEBUG] client.device_mgr: exiting since there are no device plugins
2020-08-19T15:41:01.790+0100 [DEBUG] client.plugin: waiting on plugin manager initial fingerprint: plugin-type=device
2020-08-19T15:41:01.790+0100 [DEBUG] client.plugin: finished plugin manager initial fingerprint: plugin-type=device
2020-08-19T15:41:01.790+0100 [DEBUG] client.plugin: waiting on plugin manager initial fingerprint: plugin-type=driver
2020-08-19T15:41:01.790+0100 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=exec health=undetected description="exec driver unsupported on client OS"
2020-08-19T15:41:01.790+0100 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=raw_exec health=healthy description=Healthy
2020-08-19T15:41:01.790+0100 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=mock_driver health=healthy description=Healthy
2020-08-19T15:41:01.850+0100 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=0 errors=0
2020-08-19T15:41:01.790+0100 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=qemu health=undetected description=
2020-08-19T15:41:01.793+0100 [DEBUG] client.server_mgr: new server list: new_servers=[127.0.0.1:4647] old_servers=[]
2020-08-19T15:41:01.816+0100 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=docker health=healthy description=Healthy
2020-08-19T15:41:01.849+0100 [DEBUG] client.driver_mgr: initial driver fingerprint: driver=java health=healthy description=Healthy
2020-08-19T15:41:01.849+0100 [DEBUG] client.driver_mgr: detected drivers: drivers="map[healthy:[raw_exec mock_driver docker java] undetected:[exec qemu]]"
2020-08-19T15:41:01.849+0100 [DEBUG] client.plugin: finished plugin manager initial fingerprint: plugin-type=driver
2020-08-19T15:41:01.849+0100 [INFO] client: started client: node_id=b464aded-c53f-ad31-26fd-ae2019dae6b7
2020-08-19T15:41:01.851+0100 [INFO] client: node registration complete
2020-08-19T15:41:01.852+0100 [DEBUG] client: state updated: node_status=ready
2020-08-19T15:41:02.851+0100 [DEBUG] client: state changed, updating node and re-registering
2020-08-19T15:41:02.853+0100 [INFO] client: node registration complete
2020-08-19T15:41:15.440+0100 [DEBUG] nomad.periodic: registered periodic job: job="<ns: "default", id: "test">"
2020-08-19T15:41:15.440+0100 [DEBUG] nomad.periodic: scheduled periodic job launch: launch_delay=44.55994s job="<ns: "default", id: "test">"
2020-08-19T15:41:15.440+0100 [DEBUG] http: request complete: method=PUT path=/v1/jobs duration=1.62014ms
2020-08-19T15:41:20.559+0100 [DEBUG] http: request complete: method=GET path=/v1/jobs?prefix=test duration=222.283µs
2020-08-19T15:41:20.560+0100 [DEBUG] worker: dequeued evaluation: eval_id=0c9cb28c-0b82-14fb-0d7f-cd7750792e86
2020-08-19T15:41:20.560+0100 [DEBUG] http: request complete: method=PUT path=/v1/job/test/periodic/force?namespace=default duration=570.576µs
2020-08-19T15:41:20.560+0100 [DEBUG] worker.batch_sched: reconciled current state with desired state: eval_id=0c9cb28c-0b82-14fb-0d7f-cd7750792e86 job_id=test/periodic-1597848080 namespace=default results="Total changes: (place 1) (destructive 0) (inplace 0) (stop 0)
Desired Changes for "group": (place 1) (inplace 0) (destructive 0) (stop 0) (migrate 0) (ignore 0) (canary 0)"
2020-08-19T15:41:20.561+0100 [DEBUG] http: request complete: method=GET path=/v1/evaluation/0c9cb28c-0b82-14fb-0d7f-cd7750792e86 duration=452.598µs
2020-08-19T15:41:20.561+0100 [DEBUG] worker: submitted plan for evaluation: eval_id=0c9cb28c-0b82-14fb-0d7f-cd7750792e86
2020-08-19T15:41:20.562+0100 [DEBUG] worker.batch_sched: setting eval status: eval_id=0c9cb28c-0b82-14fb-0d7f-cd7750792e86 job_id=test/periodic-1597848080 namespace=default status=complete
2020-08-19T15:41:20.562+0100 [DEBUG] worker: updated evaluation: eval="<Eval "0c9cb28c-0b82-14fb-0d7f-cd7750792e86" JobID: "test/periodic-1597848080" Namespace: "default">"
2020-08-19T15:41:20.562+0100 [DEBUG] client: updated allocations: index=12 total=1 pulled=1 filtered=0
2020-08-19T15:41:20.562+0100 [DEBUG] client: allocation updates: added=1 removed=0 updated=0 ignored=0
2020-08-19T15:41:20.563+0100 [DEBUG] worker: ack evaluation: eval_id=0c9cb28c-0b82-14fb-0d7f-cd7750792e86
2020-08-19T15:41:20.563+0100 [DEBUG] http: request complete: method=GET path=/v1/evaluation/0c9cb28c-0b82-14fb-0d7f-cd7750792e86/allocations duration=1.100824ms
2020-08-19T15:41:20.564+0100 [DEBUG] client: allocation updates applied: added=1 removed=0 updated=0 ignored=0 errors=0
2020-08-19T15:41:20.565+0100 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: starting plugin: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task path=/usr/local/bin/nomad args=[/usr/local/bin/nomad, logmon]
2020-08-19T15:41:20.567+0100 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin started: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task path=/usr/local/bin/nomad pid=8392
2020-08-19T15:41:20.567+0100 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: waiting for RPC address: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task path=/usr/local/bin/nomad
2020-08-19T15:41:20.575+0100 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: using plugin: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task version=2
2020-08-19T15:41:20.575+0100 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon.nomad: plugin address: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task @module=logmon address=/var/folders/6y/sbdf048s6tj5bb4l6swdqcm80000gn/T/plugin734508847 network=unix timestamp=2020-08-19T15:41:20.575+0100
2020-08-19T15:41:20.577+0100 [INFO] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task @module=logmon path=/private/var/folders/6y/sbdf048s6tj5bb4l6swdqcm80000gn/T/NomadClient025092714/5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6/alloc/logs/.task.stdout.fifo timestamp=2020-08-19T15:41:20.577+0100
2020-08-19T15:41:20.577+0100 [INFO] client.alloc_runner.task_runner.task_hook.logmon.nomad: opening fifo: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task @module=logmon path=/private/var/folders/6y/sbdf048s6tj5bb4l6swdqcm80000gn/T/NomadClient025092714/5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6/alloc/logs/.task.stderr.fifo timestamp=2020-08-19T15:41:20.577+0100
2020-08-19T15:41:20.706+0100 [DEBUG] client: updated allocations: index=14 total=1 pulled=0 filtered=1
2020-08-19T15:41:20.706+0100 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2020-08-19T15:41:20.706+0100 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2020-08-19T15:41:21.567+0100 [DEBUG] http: request complete: method=GET path=/v1/evaluation/0c9cb28c-0b82-14fb-0d7f-cd7750792e86 duration=758.527µs
2020-08-19T15:41:21.568+0100 [DEBUG] http: request complete: method=GET path=/v1/evaluation/0c9cb28c-0b82-14fb-0d7f-cd7750792e86/allocations duration=244.94µs
2020-08-19T15:41:22.748+0100 [DEBUG] client.driver_mgr.docker: docker pull succeeded: driver=docker image_ref=alpine:latest
2020-08-19T15:41:22.751+0100 [DEBUG] client.driver_mgr.docker: image reference count incremented: driver=docker image_name=alpine:latest image_id=sha256:a24bb4013296f61e89ba57005a7b3e52274d8edd3ae2077d04395f806b63d83e references=1
2020-08-19T15:41:22.751+0100 [DEBUG] client.driver_mgr.docker: configured resources: driver=docker task_name=task memory=314572800 memory_reservation=0 cpu_shares=100 cpu_quota=0 cpu_period=0
2020-08-19T15:41:22.751+0100 [DEBUG] client.driver_mgr.docker: binding directories: driver=docker task_name=task binds="[]string{"/private/var/folders/6y/sbdf048s6tj5bb4l6swdqcm80000gn/T/NomadClient025092714/5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6/alloc:/alloc", "/private/var/folders/6y/sbdf048s6tj5bb4l6swdqcm80000gn/T/NomadClient025092714/5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6/task/local:/local", "/private/var/folders/6y/sbdf048s6tj5bb4l6swdqcm80000gn/T/NomadClient025092714/5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6/task/secrets:/secrets"}"
2020-08-19T15:41:22.751+0100 [DEBUG] client.driver_mgr.docker: networking mode not specified; using default: driver=docker task_name=task
2020-08-19T15:41:22.751+0100 [DEBUG] client.driver_mgr.docker: setting container startup command: driver=docker task_name=task command=ls
2020-08-19T15:41:22.751+0100 [DEBUG] client.driver_mgr.docker: applied labels on the container: driver=docker task_name=task labels=map[com.hashicorp.nomad.alloc_id:5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6]
2020-08-19T15:41:22.751+0100 [DEBUG] client.driver_mgr.docker: setting container name: driver=docker task_name=task container_name=task-5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6
2020-08-19T15:41:22.800+0100 [INFO] client.driver_mgr.docker: created container: driver=docker container_id=2da5b76896672d90e22321fa8ca295f9efa9a519205706ce506856f8e331f697
2020-08-19T15:41:23.024+0100 [INFO] client.driver_mgr.docker: started container: driver=docker container_id=2da5b76896672d90e22321fa8ca295f9efa9a519205706ce506856f8e331f697
2020-08-19T15:41:23.024+0100 [DEBUG] client.driver_mgr.docker.docker_logger: starting plugin: driver=docker path=/usr/local/bin/nomad args=[/usr/local/bin/nomad, docker_logger]
2020-08-19T15:41:23.027+0100 [DEBUG] client.driver_mgr.docker.docker_logger: plugin started: driver=docker path=/usr/local/bin/nomad pid=8417
2020-08-19T15:41:23.027+0100 [DEBUG] client.driver_mgr.docker.docker_logger: waiting for RPC address: driver=docker path=/usr/local/bin/nomad
2020-08-19T15:41:23.035+0100 [DEBUG] client.driver_mgr.docker.docker_logger.nomad: plugin address: driver=docker network=unix @module=docker_logger address=/var/folders/6y/sbdf048s6tj5bb4l6swdqcm80000gn/T/plugin615273124 timestamp=2020-08-19T15:41:23.035+0100
2020-08-19T15:41:23.035+0100 [DEBUG] client.driver_mgr.docker.docker_logger: using plugin: driver=docker version=2
2020-08-19T15:41:23.036+0100 [DEBUG] client.driver_mgr.docker.docker_logger.nomad: using client connection initialized from environment: driver=docker @module=docker_logger timestamp=2020-08-19T15:41:23.036+0100
2020-08-19T15:41:23.101+0100 [DEBUG] client: updated allocations: index=15 total=1 pulled=0 filtered=1
2020-08-19T15:41:23.101+0100 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2020-08-19T15:41:23.101+0100 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2020-08-19T15:41:23.178+0100 [DEBUG] client.driver_mgr.docker.docker_logger: plugin process exited: driver=docker path=/usr/local/bin/nomad pid=8417
2020-08-19T15:41:23.178+0100 [DEBUG] client.driver_mgr.docker.docker_logger: plugin exited: driver=docker
2020-08-19T15:41:23.179+0100 [DEBUG] client.alloc_runner.task_runner.task_hook.stats_hook: failed to start stats collection for task with unrecoverable error: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task error="container stopped"
2020-08-19T15:41:23.183+0100 [DEBUG] client.driver_mgr.docker: image id reference count decremented: driver=docker image_id=sha256:a24bb4013296f61e89ba57005a7b3e52274d8edd3ae2077d04395f806b63d83e references=0
2020-08-19T15:41:23.183+0100 [INFO] client.alloc_runner.task_runner: not restarting task: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task reason="Restart unnecessary as task terminated successfully"
2020-08-19T15:41:23.184+0100 [INFO] client.gc: marking allocation for GC: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6
2020-08-19T15:41:23.185+0100 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin process exited: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task path=/usr/local/bin/nomad pid=8392
2020-08-19T15:41:23.185+0100 [DEBUG] client.alloc_runner.task_runner.task_hook.logmon: plugin exited: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task
2020-08-19T15:41:23.185+0100 [DEBUG] client.alloc_runner.task_runner: task run loop exiting: alloc_id=5c9ba6fc-3031-7ee1-e233-2f0aee40c9c6 task=task
2020-08-19T15:41:23.302+0100 [DEBUG] client: updated allocations: index=16 total=1 pulled=0 filtered=1
2020-08-19T15:41:23.302+0100 [DEBUG] client: allocation updates: added=0 removed=0 updated=0 ignored=1
2020-08-19T15:41:23.302+0100 [DEBUG] client: allocation updates applied: added=0 removed=0 updated=0 ignored=1 errors=0
2020-08-19T15:41:29.807+0100 [DEBUG] http: request complete: method=GET path=/v1/jobs?prefix=test duration=265.894µs
2020-08-19T15:41:29.809+0100 [DEBUG] http: request complete: method=GET path=/v1/job/test?namespace=default duration=1.267357ms
2020-08-19T15:41:29.811+0100 [DEBUG] http: request complete: method=GET path=/v1/job/test/summary?namespace=default duration=217.174µs
2020-08-19T15:41:29.812+0100 [DEBUG] http: request complete: method=GET path=/v1/jobs?prefix=test%2Fperiodic- duration=111.132µs
Hi @Brinuz. Thanks for providing an easy reproduction on this. We'll take a look
Hi @tgross, I just encountered this too. Could this have something to do with the changes in #8435?
Hey @jorgemarey, this is related to the changes in #8435. The team is discussing a few approaches to fix the issue.
The index == job.CreateIndex check was valid prior to #8435, but now it causes setJobStatus to run twice, which inaccurately increases the count:
https://github.com/hashicorp/nomad/blob/9c3ce6b6dc88a0635fecc33f7ce3b8e03061d62c/nomad/state/state_store.go#L4425-L4432
I am working on further debugging and testing and will hopefully have another update soon.
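To make the failure mode concrete, here is a toy sketch of the double-update problem (plain Python, not Nomad's actual Go code, and the function and variable names are invented for illustration): if the children-summary update runs twice for the same launch, the pending count is inflated; de-duplicating on the raft index keeps it correct.

```python
# Toy model of a children-job summary counter. Not Nomad's real
# implementation; it only illustrates why running the status update
# twice for the same launch yields the "extra pending" symptom.

def set_job_status(summary, seen_indexes, index):
    """Apply the 'child launched' update at most once per raft index.

    Without the `index in seen_indexes` guard, calling this twice for
    the same launch would bump Pending to 2 for a single child.
    """
    if index in seen_indexes:
        return  # duplicate update for the same write; skip it
    seen_indexes.add(index)
    summary["Pending"] += 1

summary = {"Pending": 0, "Running": 0, "Dead": 0}
seen = set()

# One child launch, but the update path fires twice (the bug).
set_job_status(summary, seen, 12)
set_job_status(summary, seen, 12)
print(summary["Pending"])  # 1 with the guard; would be 2 without it
```

The guard stands in for whatever index comparison ends up replacing the old index == job.CreateIndex check; the point is only that the summary mutation must be idempotent per launch.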