Nomad: [Feature] Ability to have nodes which can be used only for jobs with a specific constraint.

Created on 9 Feb 2017 · 21 comments · Source: hashicorp/nomad

Reference: https://groups.google.com/forum/#!topic/nomad-tool/Nmv8LiMUnEg

It would be great to have a way to avoid jobs to be run on a node unless they specify a constraint!

Quoted from the mailing list discussion:

Here is our (simplified) case:

We have 3 servers A, B and C, and we want only specific jobs on C.

Currently we use the node class on Nomad nodes for that:
A: class = "foo"
B: class = "foo"
C: class = "bar"

All our jobs specify a constraint like:
constraint {
  attribute = "${node.class}"
  value     = "foo" # Or "bar" if it must be deployed on C
}
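
For context, node class is something each client advertises in its own configuration; a minimal sketch (class names taken from the example above) would be:

client {
  enabled    = true
  node_class = "bar" # "foo" on servers A and B
}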

It works, but it's a bit constraining: only a few jobs must run on C, yet we have to put constraints on all of them.
Furthermore, we want (in the near future) our developers to be able to write their own jobs, and we don't want these jobs to be deployed on the wrong server if they forget the constraint.

Thanks !

stage/accepted theme/core theme/scheduling type/enhancement

Most helpful comment

Agreed this should remain open. I think another compelling use case is to actually suggest people allow Nomad servers to be clients as well. Then using this feature you could ensure the servers aren't considered for the vast majority of workloads, but you could still use system jobs for log shippers and monitoring tools on the servers.

All 21 comments

Why do you need to put node.class constraints on all your Jobs/TaskGroups? That means a Job/TaskGroup can only run on "bar" or "foo", never both. Is that your intention?

@dvusboy We want no jobs to run on C except those that specify the constraint.

Currently, we can tell some jobs to go specifically to C with the constraint "${node.class}" = "bar", but any job without a constraint could run on A, B or C. We want those to run only on A or B, so we have to put the constraint "${node.class}" = "foo" on them.

Or ${node.class} != "bar". It sounds like you want something like a config option at the Nomad client level, restricting it to only allow tasks with the constraint ${node.class} == "bar".
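
For illustration, that negated constraint, which every job meant to stay off C would still have to carry, would look roughly like:

constraint {
  attribute = "${node.class}"
  operator  = "!="
  value     = "bar"
}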

@dvusboy That's what I want indeed.

This would be a very helpful feature. 👍

I agree, this would be a very nice feature.

We have a redundant nomad setup, but a few nodes are highly special (different network config and similar, basically bridging our environment to other environments)

If we could set up a default constraint on those special nodes (only jobs with class "bridge-node", for example), we could make sure that no one deploys anything on those nodes unless they really mean to.

Ideally this should also be protected through vault integration so we could limit who is allowed to deploy to these network-bridging-nodes.

The only solution right now seems to be to set up a dedicated Consul/Nomad cluster for these nodes to make mistakes harder.

This would be ideal. Here's what I've run into: I've got a handful of specialized build servers in my cluster. These servers are really big machines, 16+ cores and a ton of RAM; additionally, they have specific hardware supporting a specific set of CPU instructions that I'm using to build and test a suite of applications optimized for machines with those instruction sets. Ideally I'd like only jobs tagged with a very specific constraint to deploy onto those machines. I can't trust that the other developers I give access to, deploying jobs through tools that interface with Nomad, will always add a constraint to their jobs so that they don't get scheduled onto my specialized machines. Something has to be there so that nodes can enforce a set of node-specific rules on which jobs will get scheduled to them when constraints aren't defined in a job definition.

Would be very useful for us too.
We have tons of .nomad files and we would rather not modify them all. It would be great to add some constraint/flag on the client side.

Any news on this feature? It seems like there are several good use cases. I'm interested in how other people are solving this on their clusters.

This is the first I've read this request, and it is interesting to me that Nomad doesn't really have a way to disable placements on a node _by default_ except for jobs which explicitly target it.

You could use datacenters today to achieve this. If your normal datacenter is dc1, you could put the disable-by-default nodes in dc1-foo so that only jobs that explicitly state they want to be scheduled in datacenter dc1-foo actually end up on those nodes.

It's a little hacky but may be easier than trying to enforce proper constraints on every job.
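
A sketch of that workaround, assuming dc1 is the normal datacenter and the restricted nodes are moved to dc1-foo (all names here are illustrative):

# Agent config on the restricted nodes:
datacenter = "dc1-foo"

# Only jobs that explicitly opt in land on them:
job "special" {
  datacenters = ["dc1-foo"]
  # ...
}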

@schmichael I like this workaround, thanks for the idea!

Looks like in v0.9 you could manage job placement via affinities:
https://www.nomadproject.io/guides/advanced-scheduling/affinity.html
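
For reference, an affinity is a soft preference rather than a hard rule; a minimal sketch targeting the node class from the earlier example might look like:

affinity {
  attribute = "${node.class}"
  value     = "bar"
  weight    = 100 # negative weights express anti-affinity
}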

Affinities are in the 0.9 beta and will be in the final release coming soon; closing this.

@preetapan, I think we should reopen the issue.
Affinities can't manage placement from the Nomad client's point of view.
So affinities behave here like more flexible constraints.
But if you would like to reserve nodes for only a particular type of job, you still have to edit all the Nomad files in a cluster and add affinities/anti-affinities/constraints to keep jobs from being placed on those nodes.
That is quite hard to do in clusters with hundreds or thousands of jobs.

I would also opt to have this re-opened, since affinities are soft. If I understood correctly, this issue is about a hard, cluster-wide, implied constraint.

@preetapan @schmichael what do you think about reopening the issue?

How is the datacenter workaround insufficient? https://github.com/hashicorp/nomad/issues/2299#issuecomment-459745159

At first it's just unobvious and, as you note, it's a workaround.
But real problems come when you actually use multiple datacenters.
For instance, we group all our alerts by datacenter; the workaround breaks this ability.
Another case is the service catalog and service discovery. When our services communicate with others, they prefer local services.
With this workaround, we can't do service discovery automatically, and we have to add exceptions and special policies. That gets quite hard when you have more than one fake datacenter.

Agreed this should remain open. I think another compelling use case is to actually suggest people allow Nomad servers to be clients as well. Then using this feature you could ensure the servers aren't considered for the vast majority of workloads, but you could still use system jobs for log shippers and monitoring tools on the servers.
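
A rough sketch of that pattern, assuming the server machines also run clients and are given their own node class (all names here are hypothetical):

job "log-shipper" {
  datacenters = ["dc1"]
  type        = "system"

  constraint {
    attribute = "${node.class}"
    value     = "server"
  }

  group "shipper" {
    task "shipper" {
      driver = "docker"
      config {
        image = "example/log-shipper:latest"
      }
    }
  }
}

This system job would run on every node of class "server"; the requested feature would additionally let those nodes refuse any job that lacks such a constraint.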

Another use case: access to restricted data.

I have a redis cluster that can only be accessed by certain services. Access to redis is controlled by iptables.

I need to have a subset of my nomad agents that have iptables rules allowing access to that redis cluster.

Option 1: new datacenter

I'd rather not use the datacenter workaround because I'm already using the datacenter primitive for my 2 co-located DCs. There is a 1:1 mapping between Consul DC and Nomad DC. Deviating from that will require training developers about the exception.

Option 2: Consul Connect + Envoy

Theoretically, Envoy would be a viable option; I could use Consul ACLs to restrict access to the Redis cluster to certain workloads.
Unfortunately, Redis in cluster mode requires one of the following:

  • cluster-aware libraries
  • a cluster-aware proxy (like the Corvus project)

I've tried, and it's not possible to use a proxy with Envoy. There is experimental support for native Redis in Envoy, but it doesn't work with the traditional Redis clustering I use.

Taking a look at this - thanks all for the input.
