Reference: https://groups.google.com/forum/#!topic/nomad-tool/t3bFTwSVgdQ
Nomad v0.5.4
Quoted from the mailing list:
We have a system job that runs on an auto scaling group (on AWS).
The instances of this group have a Nomad class "foo", so the job definition is like:

job "test" {
  datacenters = ["dc1"]
  type        = "system"

  constraint {
    attribute = "${node.class}"
    value     = "foo"
  }

  [...]
}
So the job will be deployed on all servers in the autoscaling group, and if we scale up the group,
Nomad automatically deploys the job on the newly instantiated server. It's really cool, but at job submission we get strange output.
Here are our (simplified) cluster nodes:
- A: Instance with class="bar"
- B: Instance with class="bar"
- C: Instance with class="baz"

Autoscaling group:
- D1: Instance with class="foo"
When we run the job above we have the following output:
==> Monitoring evaluation "d1e000cd"
Evaluation triggered by job "test"
Allocation "51b3d960" modified: node "a45700d3", group "test"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "d1e000cd" finished with status "complete" but failed to place all allocations:
Task Group "test" (failed to place 3 allocations):
* Class "bar" filtered 1 nodes
* Constraint "${node.class} = foo" filtered 1 nodes
I think it's because a system job has only one evaluation, but these numbers are weird:
- Class "bar" really filtered 2 nodes
- Constraint node.class filtered 3 nodes (or indeed 1 if we subtract the previous line)
The output contains a specific line for class "bar" but not for class "baz", which is pretty weird.
And, our main problem is that the status code of the "nomad run" command is 2.
+1 to this. The non-zero exit status is the real issue for us.
I'm running into this problem on our Nomad clusters at Density with Nomad 0.7. Our CI/CD pipeline attempts to plan and run jobs via the Nomad API and reports failure with system jobs. The Nomad CLI's exit code 2 appears to reflect the failed allocations coming back from the API.
I'd be happy to contribute a fix for this, but it's not totally clear what the correct behavior should be. Should there simply be more exit codes to reflect different kinds of warnings?
@dadgar any updates regarding this issue? Encountering the same issue with Nomad 0.7.1
We are experiencing this as well.
Server: Nomad v0.7.0-rc3
Client: Nomad v0.7.1 (0b295d399d00199cfab4621566babd25987ba06e)
a "quick" work-around is to submit it over the HTTP API rather than CLI and inspect the evaluation your self
i would expect any placement due to lack of resources for a system job to fail like it does today though
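In case it helps anyone, here is a rough sketch of that approach in shell. The /v1/jobs and /v1/evaluation/<id> endpoints are Nomad's standard HTTP API; everything else (the job.json file, producing it with `nomad run -output`, and treating a "complete" evaluation as success) is my own assumption about how you might wire it up, not an official recipe.

# Sketch: register the job over the HTTP API and judge the evaluation yourself.
# Assumes job.json is the JSON registration payload for the job (e.g. rendered
# from the HCL file with `nomad run -output job.nomad > job.json`) and that jq
# is installed.
NOMAD_ADDR=${NOMAD_ADDR:-http://localhost:4646}

# Register (or update) the job; the response includes the evaluation ID.
EVAL_ID=$(curl -s -X POST -d @job.json "$NOMAD_ADDR/v1/jobs" | jq -r .EvalID)

# Fetch the evaluation and apply whatever success criteria you want, e.g.
# only fail when the evaluation itself did not complete.
STATUS=$(curl -s "$NOMAD_ADDR/v1/evaluation/$EVAL_ID" | jq -r .Status)
[ "$STATUS" = "complete" ] || exit 1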
I ran into the same issue today as well; it looks like this is more than an exit-code issue. The scheduler reports failed allocations over the HTTP API as well (so you get the same behaviour submitting over HTTP). The allocations do get scheduled properly, but it reports the filtered nodes as failed allocations.
curl -s localhost:4646/v1/evaluation/5d16340b-1ac6-625f-db46-b59d5f8534d6 | jq -r .
{
"ID": "5d16340b-1ac6-625f-db46-b59d5f8534d6",
"Type": "system",
"TriggeredBy": "job-register",
"JobID": "foo",
...
"FailedTGAllocs": {
"tg-foo": {
"NodesEvaluated": 1,
"NodesFiltered": 1,
"NodesAvailable": {
"zone2": 14,
"zone3": 14,
"zone1": 14
},
"ClassFiltered": {
"class-a": 1
},
"ConstraintFiltered": {
"${node.class} = class-b": 1
},
"NodesExhausted": 0,
"ClassExhausted": null,
"DimensionExhausted": null,
"QuotaExhausted": null,
"Scores": null,
"AllocationTime": 30605,
"CoalescedFailures": 38
}
},
...
}
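Building on the payload above: a possible (purely homegrown) heuristic is to ignore failures that are explained entirely by class/constraint filtering and only fail when nodes were actually exhausted. The jq expression and the choice of fields below are my own assumptions, not a check that Nomad documents anywhere.

EVAL_ID=5d16340b-1ac6-625f-db46-b59d5f8534d6   # the evaluation ID from above

# Count task groups whose failures are not purely filtering, i.e. where nodes
# were evaluated but ran out of resources (NodesExhausted > 0).
REAL_FAILURES=$(curl -s "localhost:4646/v1/evaluation/$EVAL_ID" \
  | jq '[.FailedTGAllocs // {} | .[] | select(.NodesExhausted > 0)] | length')

if [ "$REAL_FAILURES" -gt 0 ]; then
  echo "placement genuinely failed" >&2
  exit 1
fi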
Same thing here... Running Nomad 0.7.1, whenever I use constraints with a system job in the same workflow as described in this issue, I get placement errors even though the allocations are successful. It's like Nomad is treating a constrained node as a placement failure on system jobs when actually it is not!
For the record, this error still appears on 0.8.1.
Example code: https://pastebin.com/raw/f7yH5Q4U
Follow-up: after doing a fresh install of the Nomad server and running the same job above, no errors appear in the UI. Errors still persist when running the job via the CLI.
I've just run into this as well. I launch my jobs from Ansible, and now I have to tell Ansible that exit code 2 is OK, which is suboptimal.
Same with 0.8.4, but exit code 1...
So constraints don't work in system jobs through CI for me.
Come on guys, this is really a bug and should be dealt with. Many, if not most, people that run a service are going to constrain it to a subset of nodes. Having it throw an error for such a common use case isn't good. Here's my workaround/backflip to take care of this in Ansible. At least it'll let some errors get trapped.
failed_when: 'jobresult.rc != 0 and not jobresult.stdout.find("finished with status \"complete\" but failed to place all allocations:") > -1'
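For anyone not using Ansible, the same band-aid can be expressed as a small shell wrapper around the CLI. This is only a sketch; matching on the warning text is exactly as brittle here as it is in the failed_when expression above, and job.nomad is just a placeholder filename.

# Run the job and capture both output and exit code.
OUTPUT=$(nomad run job.nomad 2>&1); RC=$?
echo "$OUTPUT"

# Treat the known "complete but failed to place all allocations" warning as
# success; propagate every other non-zero exit code.
if [ $RC -ne 0 ]; then
  case "$OUTPUT" in
    *'finished with status "complete" but failed to place all allocations'*) exit 0 ;;
    *) exit $RC ;;
  esac
fi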
Any news on this?
@SomKen @jsaintro we will tackle this in the next minor release (0.9.1) after 0.9 is out. Sorry for the delay but our highest priority now is to finish the large 0.9 release which brings in GPU support, runtime plugins and more advanced scheduling improvements.
We are getting hit by this. Hope you fix this fast