Nomad: Support for auto scaling

Created on 25 Jan 2016 · 18 comments · Source: hashicorp/nomad

Do you have any plans for adding auto-scaling to Nomad? Ideally I would like to be able to set up both services and clients to be auto-scaled.

For auto-scaling of services it would be nice to have the ability to scale based on custom metrics; however, just scaling based on CPU and memory would probably be enough at first. Scaling of clients (unless I am missing something) would likely be more difficult, as you would need to scale based on the capacity of each client, not just the utilization.

Label: question

All 18 comments

@dancannon Nomad is definitely going to have autoscaling, but I think this is something that will be built into Atlas.

So if you are using the Atlas integration with Nomad, Atlas will be able to scale up your infrastructure when jobs need more compute, disk, or network resources. You will be able to specify Terraform scripts that would be used to scale up your cluster when Nomad needs more machines, and the autoscaler would also remove nodes when they are not needed.

Beyond infrastructure autoscaling, once that lands, there will probably be support for more advanced application autoscaling.

On the OSS side, we are working hard to develop the foundational features of the cluster manager and make it battle-hardened.

@diptanu Is this something that will always be an Atlas feature or a global feature eventually?

@sthulp Not sure what you mean by a global feature. But yeah autoscaling would be an Atlas feature for Nomad.

@diptanu I believe he means: is this a feature that will ever come to Nomad itself, or will it be exclusive to Atlas? As a related question, would you accept PRs for either of these features?

+1

Curious about this too, though autoscaling seems a bit out of Nomad's scope. This seems like it would be more in Terraform's domain.

@DeepAnchor It's more about scaling running jobs. Job A is capped at 0.5 CPU and is hitting the limit; ideally you'd want a second instance of Job A running somewhere to allow more requests to succeed.
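
As a rough illustration of what that could look like (this is not an existing Nomad feature), an external watcher could bump a job's task group count through the Nomad HTTP API; the agent address, job name, and group name below are hypothetical:

```python
# Minimal sketch: scale a job by re-registering it with a higher group count.
# Assumes a local Nomad agent and a hypothetical job "example" / group "web".
import requests

NOMAD = "http://127.0.0.1:4646"
JOB_ID = "example"
GROUP = "web"

def scale_group(delta):
    # Fetch the current job definition.
    job = requests.get(f"{NOMAD}/v1/job/{JOB_ID}").json()

    # Bump the count of the target task group.
    for group in job["TaskGroups"]:
        if group["Name"] == GROUP:
            group["Count"] = max(1, group["Count"] + delta)

    # Re-register the job; Nomad schedules the extra (or stops the surplus)
    # allocations to converge on the new count.
    resp = requests.post(f"{NOMAD}/v1/job/{JOB_ID}", json={"Job": job})
    resp.raise_for_status()

if __name__ == "__main__":
    scale_group(+1)  # e.g. add one more instance of Job A when it hits its CPU cap
```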

Somewhat duplicate: #172.

Yeah, we have the same use case as @sthulb, where we need to spawn extra instances of a job if it's hitting its resource limits.

Does anybody know of an application that does this, i.e. auto-scaling / automatic infrastructure launch based on key metrics? This is a big missing link in the _DevOps Chain_, somewhere between Nomad and Terraform.

Currently we have to rely on the cloud providers' autoscaling services, but these are too basic (e.g. CPU, memory) and lock us into that provider.

@oryband

Currently we have to rely on the cloud providers' autoscaling services, but these are too basic (e.g. CPU, memory) and lock us into that provider.

In AWS/GCP you can base autoscaling on your own metrics without any vendor lock-in.
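
For example (an illustration of the idea, not something discussed above), on AWS you can publish an application metric to CloudWatch with boto3 and hang an alarm or scaling policy off it; the namespace and metric name here are made up:

```python
# Minimal sketch: push a custom metric that an Auto Scaling alarm/policy can
# act on, instead of relying only on built-in CPU/memory metrics.
import boto3

cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")  # assumed region

def publish_queue_depth(depth):
    cloudwatch.put_metric_data(
        Namespace="MyApp",                 # hypothetical namespace
        MetricData=[{
            "MetricName": "QueueDepth",    # hypothetical application metric
            "Value": float(depth),
            "Unit": "Count",
        }],
    )

publish_queue_depth(42)
```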

Service autoscaling would be sufficient; scaling Nomad itself is more about the platform (AWS/GCP/OSS), the configuration manager, and some extra magic (scaling down, killing services, ...). Getting information from metrics and sending events to scale up/down is something Kubernetes and Mesos already do; Kubernetes can use information from Prometheus/cAdvisor metrics.

But I think that this functionality would take Nomad from a "simple" solution to something we probably don't want.

What's really missing is something like Mesosphere Universe or the Rancher catalog. Right now it is quite hard to find proper examples, use cases, and basic application setups.

How can I get the metrics needed for auto-scaling out of Nomad?

@JensRantil You will likely want to decide based on the client's metrics: https://www.nomadproject.io/docs/agent/telemetry.html
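
To sketch what that could look like (assuming a Nomad version that exposes the agent's /v1/metrics endpoint, in addition to the statsd/statsite sinks described in the telemetry docs), an external scaler could poll a client agent and watch its resource gauges:

```python
# Minimal sketch: pull in-memory metrics from a local Nomad client agent and
# print the client allocation gauges an external scaler might watch.
import requests

resp = requests.get("http://127.0.0.1:4646/v1/metrics")  # assumed local agent
resp.raise_for_status()

for gauge in resp.json().get("Gauges", []):
    # e.g. nomad.client.allocated.cpu / .memory / .disk
    if gauge["Name"].startswith("nomad.client.allocated"):
        print(gauge["Name"], gauge["Value"])
```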

Any further updates on this issue / best practices for auto-scaling? It's been a while since this has been active.

EDIT: I guess I should ask two questions:

  • Are there any best practices for auto-scaling servers and clients?
  • Are there any best practices for auto-scaling tasks?

I'm actually more interested in the latter.

@Xopherus
Until native autoscaling capability lands (if it ever does), would some third-party software for Nomad help in the meantime?

https://github.com/jippi/awesome-nomad

One that comes to mind is "replicator".

I'd also like to know what current best practices exist for scaling (up/down) underlying node capacity in relation to scheduled job load. I've found a third-party auto-scaler (https://api.spotinst.com/container-management/nomad/nomad-autoscaling-concepts/), but it would be awesome if Nomad/HashiCorp had a canonical recommendation (or suggested approach) for this.
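
In the absence of a canonical recommendation, one crude signal that can work (a sketch of the idea, not an official approach) is to watch for blocked evaluations, which usually mean the scheduler could not place allocations, and then grow the client pool out of band (e.g. by bumping a Terraform or ASG count):

```python
# Minimal sketch: treat blocked evaluations as a "need more client capacity"
# signal. Assumes a local Nomad agent; the scaling action itself is left out.
import requests

NOMAD = "http://127.0.0.1:4646"

evals = requests.get(f"{NOMAD}/v1/evaluations").json()
blocked = [e for e in evals if e.get("Status") == "blocked"]

if blocked:
    print(f"{len(blocked)} blocked evaluation(s): cluster likely needs more client nodes")
else:
    print("no blocked evaluations: capacity looks sufficient")
```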

@corford AFAIK SpotInst is a managed service rather than a self-hosted solution. Did you get a chance to try out SpotInst's ElastiGroup to meet your requirements?

Many times organizations are wary of "additional" services, and if you want something "self-maintained" that provides a way to scale nodes/services, what comes to mind is Sherpa:
https://github.com/jrasell/sherpa

I am keeping an eye on Sherpa myself, as I like to set things up myself and rely as little as possible on cloud-provider-specific services. 😁
