Nomad version: 0.5.1-rc1
Operating system and environment: golang Docker image
Landscape: 1 server and 3 clients in separate Docker containers
We use constraints to run jobs selectively on nodes. When we tried to stop an allocation on a certain node by updating the job, an allocation on an unrelated node was killed as well. Below is the job status before and after the update, followed by the server logs, the client logs, and the job file.
Job status before the update:

ID = test
Name = test
Type = service
Priority = 50
Datacenters = dc1
Status = running
Periodic = false
Summary
Task Group Queued Starting Running Failed Complete Lost
test 0 0 3 0 0 0
Evaluations
ID Priority Triggered By Status Placement Failures
a4e9ce2d-54f1-c033-f337-cf989a26c3b8 50 job-register complete false
Allocations
ID Eval ID Node ID Task Group Desired Status Created At
5c8fdff9-6059-c8b4-90e2-0b0967e2b9c2 a4e9ce2d-54f1-c033-f337-cf989a26c3b8 77bff70c-6e19-14b0-2aba-2a111e74df64 test run running 12/09/16 16:46:41 +03
97623f70-0553-c0f4-2718-211f65a608c3 a4e9ce2d-54f1-c033-f337-cf989a26c3b8 300d9df9-c34a-2ca5-8f07-8d52e437d24c test run running 12/09/16 16:46:41 +03
9e57df5a-2795-3b64-fe3c-31c2ae24acae a4e9ce2d-54f1-c033-f337-cf989a26c3b8 b2da178c-e411-aa76-fa8b-b4884fd82ed7 test run running 12/09/16 16:46:41 +03
Job status after re-registering the updated job:

ID = test
Name = test
Type = service
Priority = 50
Datacenters = dc1
Status = running
Periodic = false
Summary
Task Group Queued Starting Running Failed Complete Lost
test 0 0 2 0 2 0
Evaluations
ID Priority Triggered By Status Placement Failures
85079a62-dbad-3aea-4d54-cb1384499a28 50 job-register complete false
a4e9ce2d-54f1-c033-f337-cf989a26c3b8 50 job-register complete false
Allocations
ID Eval ID Node ID Task Group Desired Status Created At
61531a3d-5536-d8fa-c465-2f2769293aae 85079a62-dbad-3aea-4d54-cb1384499a28 77bff70c-6e19-14b0-2aba-2a111e74df64 test run running 12/09/16 16:46:53 +03
5c8fdff9-6059-c8b4-90e2-0b0967e2b9c2 a4e9ce2d-54f1-c033-f337-cf989a26c3b8 77bff70c-6e19-14b0-2aba-2a111e74df64 test stop complete 12/09/16 16:46:41 +03
97623f70-0553-c0f4-2718-211f65a608c3 85079a62-dbad-3aea-4d54-cb1384499a28 300d9df9-c34a-2ca5-8f07-8d52e437d24c test run running 12/09/16 16:46:41 +03
9e57df5a-2795-3b64-fe3c-31c2ae24acae a4e9ce2d-54f1-c033-f337-cf989a26c3b8 b2da178c-e411-aa76-fa8b-b4884fd82ed7 test stop complete 12/09/16 16:46:41 +03
Nomad server logs:

node-master_1 | 2016/12/09 13:46:41.916441 [DEBUG] worker: dequeued evaluation a4e9ce2d-54f1-c033-f337-cf989a26c3b8
node-master_1 | 2016/12/09 13:46:41.916543 [DEBUG] http: Request /v1/jobs?region=global (74.583865ms)
node-master_1 | 2016/12/09 13:46:41.917500 [DEBUG] sched: <Eval 'a4e9ce2d-54f1-c033-f337-cf989a26c3b8' JobID: 'test'>: allocs: (place 3) (update 0) (migrate 0) (stop 0) (ignore 0) (lost 0)
node-master_1 | 2016/12/09 13:46:41.920733 [DEBUG] http: Request /v1/evaluation/a4e9ce2d-54f1-c033-f337-cf989a26c3b8?region=global (199.263µs)
node-master_1 | 2016/12/09 13:46:41.924095 [DEBUG] http: Request /v1/evaluation/a4e9ce2d-54f1-c033-f337-cf989a26c3b8/allocations?region=global (328.009µs)
node-master_1 | 2016/12/09 13:46:41.954725 [DEBUG] worker: submitted plan for evaluation a4e9ce2d-54f1-c033-f337-cf989a26c3b8
node-master_1 | 2016/12/09 13:46:41.954789 [DEBUG] sched: <Eval 'a4e9ce2d-54f1-c033-f337-cf989a26c3b8' JobID: 'test'>: setting status to complete
node-master_1 | 2016/12/09 13:46:41.988504 [DEBUG] worker: updated evaluation <Eval 'a4e9ce2d-54f1-c033-f337-cf989a26c3b8' JobID: 'test'>
node-master_1 | 2016/12/09 13:46:41.988771 [DEBUG] worker: ack for evaluation a4e9ce2d-54f1-c033-f337-cf989a26c3b8
node-master_1 | 2016/12/09 13:46:42.660747 [DEBUG] http: Request /v1/status/peers (302.84µs)
node-master_1 | 2016/12/09 13:46:42.935293 [DEBUG] http: Request /v1/evaluation/a4e9ce2d-54f1-c033-f337-cf989a26c3b8?region=global (606.678µs)
node-master_1 | 2016/12/09 13:46:42.939550 [DEBUG] http: Request /v1/evaluation/a4e9ce2d-54f1-c033-f337-cf989a26c3b8/allocations?region=global (922.802µs)
node-master_1 | 2016/12/09 13:46:44 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:44 [INFO] agent: Synced check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:46:44 [INFO] agent: Deregistered check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:46:46 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:46 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:46 [INFO] agent: Synced check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:46:46 [DEBUG] memberlist: TCP connection from=172.19.0.2:42278
node-master_1 | 2016/12/09 13:46:46 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:46:46 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:46:46 [INFO] agent: Deregistered check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:46:46 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:46:47.652559 [DEBUG] http: Request /v1/jobs?prefix=test (359.173µs)
node-master_1 | 2016/12/09 13:46:47.655421 [DEBUG] http: Request /v1/job/test (256.395µs)
node-master_1 | 2016/12/09 13:46:47.658849 [DEBUG] http: Request /v1/job/test/allocations (448.766µs)
node-master_1 | 2016/12/09 13:46:47.660902 [DEBUG] http: Request /v1/job/test/evaluations (165.403µs)
node-master_1 | 2016/12/09 13:46:47.663151 [DEBUG] http: Request /v1/job/test/summary (133.541µs)
node-master_1 | 2016/12/09 13:46:49 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:49 [INFO] agent: Synced check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:46:49 [INFO] agent: Deregistered check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:46:51 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:51 [INFO] agent: Synced check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:46:51 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:51 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:46:51 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:46:51 [INFO] agent: Deregistered check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:46:52.664246 [DEBUG] http: Request /v1/status/peers (261.888µs)
node-master_1 | 2016/12/09 13:46:53.488955 [DEBUG] worker: dequeued evaluation 85079a62-dbad-3aea-4d54-cb1384499a28
node-master_1 | 2016/12/09 13:46:53.489154 [DEBUG] sched: <Eval '85079a62-dbad-3aea-4d54-cb1384499a28' JobID: 'test'>: allocs: (place 0) (update 2) (migrate 0) (stop 1) (ignore 0) (lost 0)
node-master_1 | 2016/12/09 13:46:53.488971 [DEBUG] http: Request /v1/jobs?region=global (46.215815ms)
node-master_1 | 2016/12/09 13:46:53.489551 [DEBUG] sched: <Eval '85079a62-dbad-3aea-4d54-cb1384499a28' JobID: 'test'>: 1 in-place updates of 2
node-master_1 | 2016/12/09 13:46:53.493578 [DEBUG] http: Request /v1/evaluation/85079a62-dbad-3aea-4d54-cb1384499a28?region=global (598.687µs)
node-master_1 | 2016/12/09 13:46:53.497262 [DEBUG] http: Request /v1/evaluation/85079a62-dbad-3aea-4d54-cb1384499a28/allocations?region=global (1.208662ms)
node-master_1 | 2016/12/09 13:46:53.524795 [DEBUG] worker: submitted plan for evaluation 85079a62-dbad-3aea-4d54-cb1384499a28
node-master_1 | 2016/12/09 13:46:53.524837 [DEBUG] sched: <Eval '85079a62-dbad-3aea-4d54-cb1384499a28' JobID: 'test'>: setting status to complete
node-master_1 | 2016/12/09 13:46:53.550077 [DEBUG] worker: updated evaluation <Eval '85079a62-dbad-3aea-4d54-cb1384499a28' JobID: 'test'>
node-master_1 | 2016/12/09 13:46:53.550150 [DEBUG] worker: ack for evaluation 85079a62-dbad-3aea-4d54-cb1384499a28
node-master_1 | 2016/12/09 13:46:54.500506 [DEBUG] http: Request /v1/evaluation/85079a62-dbad-3aea-4d54-cb1384499a28?region=global (218.34µs)
node-master_1 | 2016/12/09 13:46:54.502787 [DEBUG] http: Request /v1/evaluation/85079a62-dbad-3aea-4d54-cb1384499a28/allocations?region=global (280.865µs)
node-master_1 | 2016/12/09 13:46:54 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:54 [INFO] agent: Synced check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:46:54 [INFO] agent: Deregistered check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:46:55.676551 [DEBUG] http: Request /v1/jobs?prefix=test (333.004µs)
node-master_1 | 2016/12/09 13:46:55.678658 [DEBUG] http: Request /v1/job/test (179.087µs)
node-master_1 | 2016/12/09 13:46:55.684467 [DEBUG] http: Request /v1/job/test/allocations (223.334µs)
node-master_1 | 2016/12/09 13:46:55.687675 [DEBUG] http: Request /v1/job/test/evaluations (1.343601ms)
node-master_1 | 2016/12/09 13:46:55.689539 [DEBUG] http: Request /v1/job/test/summary (132.042µs)
node-master_1 | 2016/12/09 13:46:56 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:56 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:56 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:46:56 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:46:56 [DEBUG] memberlist: TCP connection from=172.19.0.2:42322
node-master_1 | 2016/12/09 13:46:56 [INFO] agent: Deregistered check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:46:56 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:46:59 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:46:59 [INFO] agent: Synced check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:46:59 [INFO] agent: Deregistered check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:01 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:01 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:01 [INFO] agent: Synced check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:47:01 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:01 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:01 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:01 [INFO] agent: Deregistered check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:47:02.666590 [DEBUG] http: Request /v1/status/peers (463.249µs)
node-master_1 | 2016/12/09 13:47:04 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:04 [INFO] agent: Synced check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:04 [INFO] agent: Deregistered check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:06 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:06 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:06 [INFO] agent: Synced check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:47:06 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:06 [DEBUG] memberlist: TCP connection from=172.19.0.2:42338
node-master_1 | 2016/12/09 13:47:06 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:06 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:06 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:09 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:09 [INFO] agent: Synced check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:09 [INFO] agent: Deregistered check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:47:09 [INFO] agent: Deregistered check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:11 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:11 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:11 [INFO] agent: Synced check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:47:11 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:11 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:11 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:11 [INFO] agent: Deregistered check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:47:12.669439 [DEBUG] http: Request /v1/status/peers (480.728µs)
node-master_1 | 2016/12/09 13:47:13 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:14 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:14 [INFO] agent: Synced check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:14 [INFO] agent: Deregistered check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:16 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:16 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:16 [INFO] agent: Synced check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:47:16 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:16 [INFO] agent: Synced check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:16 [INFO] agent: Deregistered check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:16 [DEBUG] memberlist: TCP connection from=172.19.0.2:42354
node-master_1 | 2016/12/09 13:47:16 [INFO] agent: Deregistered check '2e5e445cb4e40a9aa01af2141487873d2b52ec5f'
node-master_1 | 2016/12/09 13:47:19 [INFO] agent: Synced service '_nomad-client-nomad-client-http'
node-master_1 | 2016/12/09 13:47:19 [INFO] agent: Synced check 'f78cd714d79a0632f32d1680b09cb19650591072'
node-master_1 | 2016/12/09 13:47:19 [INFO] agent: Deregistered check 'd2274318992953a57b186e3b571bff4fe17e6b02'
node-master_1 | 2016/12/09 13:47:20 [INFO] agent: Synced check 'f78cd714d79a0632f32d1680b09cb19650591072'
Nomad client logs:

node-worker_2 | 2016/12/09 13:46:41.966382 [DEBUG] client: starting task context for 'test' (alloc '97623f70-0553-c0f4-2718-211f65a608c3')
node-worker_3 | 2016/12/09 13:46:16.108946 [DEBUG] driver.exec: exec driver is enabled
node-worker_1 | 2016/12/09 13:46:14.551683 [DEBUG] driver.exec: exec driver is enabled
node-worker_2 | 2016/12/09 13:46:41 [DEBUG] plugin: starting plugin: /usr/local/bin/nomad []string{"/usr/local/bin/nomad", "executor", "/tmp/nomad/alloc/97623f70-0553-c0f4-2718-211f65a608c3/test/test-executor.out"}
node-worker_3 | 2016/12/09 13:46:16.109117 [DEBUG] client: available drivers [exec raw_exec]
node-worker_1 | 2016/12/09 13:46:14.551703 [DEBUG] client: available drivers [raw_exec exec]
node-worker_2 | 2016/12/09 13:46:41 [DEBUG] plugin: waiting for RPC address for: /usr/local/bin/nomad
node-worker_3 | 2016/12/09 13:46:16.109479 [DEBUG] client: fingerprinting exec every 15s
node-worker_1 | 2016/12/09 13:46:14.551811 [DEBUG] client: fingerprinting docker every 15s
node-worker_2 | 2016/12/09 13:46:41 [DEBUG] plugin: nomad: 2016/12/09 13:46:41 [DEBUG] plugin: plugin address: unix /tmp/plugin393590857
node-worker_3 | 2016/12/09 13:46:16.109511 [DEBUG] client: fingerprinting docker every 15s
node-worker_1 | 2016/12/09 13:46:14.551863 [DEBUG] client: fingerprinting exec every 15s
node-worker_2 | 2016/12/09 13:46:42.008656 [DEBUG] driver.raw_exec: started process with pid: 53
node-worker_3 | 2016/12/09 13:46:16.109533 [DEBUG] client: fingerprinting rkt every 15s
node-worker_1 | 2016/12/09 13:46:14.555113 [INFO] client: Node ID "b2da178c-e411-aa76-fa8b-b4884fd82ed7"
node-worker_2 | 2016/12/09 13:46:42.222906 [DEBUG] client: updated allocations at index 14 (pulled 0) (filtered 1)
node-worker_3 | 2016/12/09 13:46:16.111698 [INFO] client: Node ID "77bff70c-6e19-14b0-2aba-2a111e74df64"
node-worker_1 | 2016/12/09 13:46:14.558544 [DEBUG] client: updated allocations at index 1 (pulled 0) (filtered 0)
node-worker_2 | 2016/12/09 13:46:42.223185 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 1)
node-worker_3 | 2016/12/09 13:46:16.115786 [DEBUG] client: updated allocations at index 1 (pulled 0) (filtered 0)
node-worker_1 | 2016/12/09 13:46:14.559167 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 0)
node-worker_2 | 2016/12/09 13:46:53.525526 [DEBUG] client: updated allocations at index 17 (pulled 1) (filtered 0)
node-worker_3 | 2016/12/09 13:46:16.115975 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 0)
node-worker_2 | 2016/12/09 13:46:53.529408 [DEBUG] client: allocs: (added 0) (removed 0) (updated 1) (ignore 0)
node-worker_2 | 2016/12/09 13:46:56.496575 [DEBUG] http: Request /v1/agent/servers (385.242µs)
node-worker_1 | 2016/12/09 13:46:14.602584 [INFO] client: node registration complete
node-worker_3 | 2016/12/09 13:46:16.225712 [INFO] client: node registration complete
node-worker_1 | 2016/12/09 13:46:14.602675 [DEBUG] client: periodically checking for node changes at duration 5s
node-worker_3 | 2016/12/09 13:46:16.225891 [DEBUG] client: periodically checking for node changes at duration 5s
node-worker_1 | 2016/12/09 13:46:23.962825 [DEBUG] client: state updated to ready
node-worker_3 | 2016/12/09 13:46:23.557982 [DEBUG] client: state updated to ready
node-worker_1 | 2016/12/09 13:46:26.237898 [DEBUG] http: Request /v1/agent/servers (656.818µs)
node-worker_3 | 2016/12/09 13:46:31.359478 [DEBUG] http: Request /v1/agent/servers (1.998523ms)
node-worker_1 | 2016/12/09 13:46:40.940483 [DEBUG] http: Request /v1/agent/servers (601.585µs)
node-worker_3 | 2016/12/09 13:46:41.955034 [DEBUG] client: updated allocations at index 12 (pulled 1) (filtered 0)
node-worker_1 | 2016/12/09 13:46:41.956297 [DEBUG] client: updated allocations at index 12 (pulled 1) (filtered 0)
node-worker_3 | 2016/12/09 13:46:41.959128 [DEBUG] client: allocs: (added 1) (removed 0) (updated 0) (ignore 0)
node-worker_1 | 2016/12/09 13:46:41.962678 [DEBUG] client: allocs: (added 1) (removed 0) (updated 0) (ignore 0)
node-worker_3 | 2016/12/09 13:46:41.964146 [DEBUG] client: starting task runners for alloc '5c8fdff9-6059-c8b4-90e2-0b0967e2b9c2'
node-worker_1 | 2016/12/09 13:46:41.965021 [DEBUG] client: starting task runners for alloc '9e57df5a-2795-3b64-fe3c-31c2ae24acae'
node-worker_3 | 2016/12/09 13:46:41.964363 [DEBUG] client: starting task context for 'test' (alloc '5c8fdff9-6059-c8b4-90e2-0b0967e2b9c2')
node-worker_1 | 2016/12/09 13:46:41.965158 [DEBUG] client: starting task context for 'test' (alloc '9e57df5a-2795-3b64-fe3c-31c2ae24acae')
node-worker_3 | 2016/12/09 13:46:41 [DEBUG] plugin: starting plugin: /usr/local/bin/nomad []string{"/usr/local/bin/nomad", "executor", "/tmp/nomad/alloc/5c8fdff9-6059-c8b4-90e2-0b0967e2b9c2/test/test-executor.out"}
node-worker_1 | 2016/12/09 13:46:41 [DEBUG] plugin: starting plugin: /usr/local/bin/nomad []string{"/usr/local/bin/nomad", "executor", "/tmp/nomad/alloc/9e57df5a-2795-3b64-fe3c-31c2ae24acae/test/test-executor.out"}
node-worker_3 | 2016/12/09 13:46:41 [DEBUG] plugin: waiting for RPC address for: /usr/local/bin/nomad
node-worker_1 | 2016/12/09 13:46:41 [DEBUG] plugin: waiting for RPC address for: /usr/local/bin/nomad
node-worker_3 | 2016/12/09 13:46:41 [DEBUG] plugin: nomad: 2016/12/09 13:46:41 [DEBUG] plugin: plugin address: unix /tmp/plugin828672124
node-worker_1 | 2016/12/09 13:46:41 [DEBUG] plugin: nomad: 2016/12/09 13:46:41 [DEBUG] plugin: plugin address: unix /tmp/plugin077826786
node-worker_3 | 2016/12/09 13:46:41.995077 [DEBUG] driver.raw_exec: started process with pid: 55
node-worker_1 | 2016/12/09 13:46:42.000928 [DEBUG] driver.raw_exec: started process with pid: 55
node-worker_3 | 2016/12/09 13:46:42.224788 [DEBUG] client: updated allocations at index 14 (pulled 0) (filtered 1)
node-worker_1 | 2016/12/09 13:46:42.223212 [DEBUG] client: updated allocations at index 14 (pulled 0) (filtered 1)
node-worker_3 | 2016/12/09 13:46:42.225486 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 1)
node-worker_1 | 2016/12/09 13:46:42.223610 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 1)
node-worker_3 | 2016/12/09 13:46:53.525947 [DEBUG] client: updated allocations at index 17 (pulled 2) (filtered 0)
node-worker_1 | 2016/12/09 13:46:53.525085 [DEBUG] client: updated allocations at index 17 (pulled 1) (filtered 0)
node-worker_3 | 2016/12/09 13:46:53.527484 [DEBUG] client: allocs: (added 1) (removed 0) (updated 1) (ignore 0)
node-worker_1 | 2016/12/09 13:46:53.530257 [DEBUG] client: allocs: (added 0) (removed 0) (updated 1) (ignore 0)
node-worker_1 | 2016/12/09 13:46:53 [DEBUG] plugin: /usr/local/bin/nomad: plugin process exited
node-worker_1 | 2016/12/09 13:46:53.629011 [DEBUG] client: updated allocations at index 19 (pulled 0) (filtered 1)
node-worker_3 | 2016/12/09 13:46:53.528743 [DEBUG] client: starting task runners for alloc '61531a3d-5536-d8fa-c465-2f2769293aae'
node-worker_1 | 2016/12/09 13:46:53.629197 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 1)
node-worker_3 | 2016/12/09 13:46:53.528979 [DEBUG] client: starting task context for 'test' (alloc '61531a3d-5536-d8fa-c465-2f2769293aae')
node-worker_3 | 2016/12/09 13:46:53 [DEBUG] plugin: starting plugin: /usr/local/bin/nomad []string{"/usr/local/bin/nomad", "executor", "/tmp/nomad/alloc/61531a3d-5536-d8fa-c465-2f2769293aae/test/test-executor.out"}
node-worker_3 | 2016/12/09 13:46:53 [DEBUG] plugin: waiting for RPC address for: /usr/local/bin/nomad
node-worker_3 | 2016/12/09 13:46:53 [DEBUG] plugin: /usr/local/bin/nomad: plugin process exited
node-worker_3 | 2016/12/09 13:46:53 [DEBUG] plugin: nomad: 2016/12/09 13:46:53 [DEBUG] plugin: plugin address: unix /tmp/plugin785637411
node-worker_3 | 2016/12/09 13:46:53.551602 [DEBUG] driver.raw_exec: started process with pid: 76
node-worker_3 | 2016/12/09 13:46:53.802529 [DEBUG] client: updated allocations at index 20 (pulled 0) (filtered 2)
node-worker_3 | 2016/12/09 13:46:53.802634 [DEBUG] client: allocs: (added 0) (removed 0) (updated 0) (ignore 2)
job "test" {
  datacenters = ["dc1"]

  constraint {
    attribute = "${node.unique.id}"
    value     = "300d9df9|77bff70c"
    operator  = "regexp"
  }

  type = "service"

  group "test" {
    count = 2

    restart {
      interval = "5m"
      attempts = 20
      delay    = "10s"
      mode     = "delay"
    }

    task "test" {
      driver = "raw_exec"

      config {
        command = "/bin/sleep"
        args    = ["1000"]
      }

      resources {
        cpu    = 100
        memory = 100
        // disk = 110

        network {
          mbits = 1
        }
      }
    }
  }
}
@kaskavalci I couldn't reproduce this.
I created two clients and a server on the same node, used your job file, set the count to 3, and then decremented it to 2; I saw only one alloc move to the stopped state.
Can you try a couple of times? It is somewhat sporadic but happens fairly often. I will also share my setup with you through Gitter.
@kaskavalci Hey, I understand why this is happening now, and it isn't really a bug but a side effect of the system's design. When you scale down, the allocation with the highest alloc index is destroyed. So even though you are targeting a particular set of nodes to keep, scaling down may still destroy an allocation on one of them.
To be clear, the allocations are job.taskgroup[0..count-1]; this is so that the set of allocations that exists always has indexes between 0 and count - 1.
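For example (illustrative names only, following Nomad's job.group[index] allocation naming, not taken from the logs above):

count = 3  ->  test.test[0]  test.test[1]  test.test[2]
count = 2  ->  test.test[0]  test.test[1]              (test.test[2] is stopped, whichever node it is on)

The index chosen for removal does not consider which nodes the constraint was meant to keep.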
Hi @dadgar, thanks for looking into it. Is it possible to respect the new constraint without disturbing the other allocations? A follow-up question: is it always exactly one alloc that gets restarted in such a case, or could it happen that almost all allocs get restarted?
@kaskavalci It is very much possible for all allocs to be affected. It all depends on which nodes get to stay and what their alloc indexes are.
It is something we have been thinking about, as we want to bring life-cycle hooks to allocations, but it is a pretty core part of the scheduler's design, and there are nice side effects for users: they can use the alloc index for leader election, sharding, coordination, etc. (see the sketch after this comment). So for now there isn't really a way to kill particular allocations like that.
Can you describe your use case?
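As an illustration of the sharding pattern mentioned above, here is a minimal Go sketch, assuming the task can read the NOMAD_ALLOC_INDEX variable that Nomad injects into each task's environment; the leader/shard scheme itself is hypothetical:

package main

import (
	"fmt"
	"os"
	"strconv"
)

func main() {
	// Nomad exposes each allocation's index to the task as NOMAD_ALLOC_INDEX.
	idx, err := strconv.Atoi(os.Getenv("NOMAD_ALLOC_INDEX"))
	if err != nil {
		// Unset or malformed: assume we are not running under Nomad.
		idx = 0
	}
	// Hypothetical coordination scheme: index 0 acts as leader, the rest as shards.
	if idx == 0 {
		fmt.Println("alloc 0: acting as leader")
	} else {
		fmt.Printf("alloc %d: acting as shard\n", idx)
	}
}

Because the scheduler keeps indexes packed into [0, count-1], a task can rely on its index being stable while the job's count is unchanged.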
@dadgar I see. We want users to be able to kill all allocations on a given node, or add/remove nodes from a job, without doing a complete stop -> start cycle. When dealing with stateful jobs holding large data in memory, you don't want to restart them for no reason. A user should be able to scale down for maintenance, or turn off a node to save energy, without disrupting the rest of the cluster.
@kaskavalci Have you seen nomad node-drain? We support that case. Scaling down while killing a particular allocation is still a WIP.
Is it possible to do it via the HTTP API?
@kaskavalci https://www.nomadproject.io/docs/http/node.html#put-post
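For illustration, a minimal Go sketch of toggling drain mode through that endpoint, assuming a local agent on the default HTTP port; the node ID is copied from the allocations table above and should be substituted with your own:

package main

import (
	"fmt"
	"net/http"
)

func main() {
	// Node to drain; take the ID from `nomad node-status` or the allocations table.
	nodeID := "77bff70c-6e19-14b0-2aba-2a111e74df64"
	url := fmt.Sprintf("http://127.0.0.1:4646/v1/node/%s/drain?enable=true", nodeID)

	// The linked docs describe this as a PUT/POST endpoint.
	req, err := http.NewRequest(http.MethodPut, url, nil)
	if err != nil {
		panic(err)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("drain request:", resp.Status)
}

Passing enable=false instead should re-enable scheduling on the node once maintenance is done.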
I am going to close this issue since it is not a "bug" and the reason has been explained. This is not to say we aren't interested in the follow-up discussion that was had; it is part of the greater life-cycle control story and is important to us.
@dadgar what about the effect of triggering node-drain on the other allocations? Could we expect restarts as with scaling down? Let's assume count = 2 and the job is running on Node1 and Node2 as shown below:
Node1 [ x ]
Node2 [ x ]
Node3 [ ]
Now we drain Node2 and the job migrates to Node3:
Node1 [ x ] <-- can we guarantee that this alloc will not be affected?
Node2 [ ]
Node3 [ x ]
@kaskavalci You can't guarantee that the new alloc won't get placed on Node1 unless you use the distinct_hosts constraint, but the original allocation(s) on Node1 will not be affected.