Out of the box, kops creates an ASG on AWS for the nodes instance group, with a min and max number of instances. But it doesn't add any Scaling Policies, so the ASG will never grow above the min unless someone changes it manually.
Does kops have the ability to define Scaling Policies? I couldn't find any documentation on defining these, so perhaps we just need documentation. But if it's not currently supported, then this is a feature request.
Scaling is managed by the cluster-autoscaler addon instead of ASG policies: https://github.com/kubernetes/kops/tree/master/addons/cluster-autoscaler
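For anyone landing here, the addon essentially comes down to pointing cluster-autoscaler at the ASG via its `--nodes` flag. A rough sketch of the relevant container spec follows; the image tag, cluster name, and the 2:10 bounds are all illustrative placeholders, not values from this thread:

```yaml
# Illustrative fragment of the cluster-autoscaler Deployment on AWS.
# "nodes.k8s.example.com" and the 2:10 bounds are placeholders.
containers:
  - name: cluster-autoscaler
    image: k8s.gcr.io/cluster-autoscaler:v1.2.2
    command:
      - ./cluster-autoscaler
      - --cloud-provider=aws
      # Format is MIN:MAX:ASG-NAME, where ASG-NAME is the group kops created
      # for the instance group.
      - --nodes=2:10:nodes.k8s.example.com
```

Note that these min/max bounds live in the addon manifest, separately from the kops instance group config, which is the duplication discussed below.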
Thanks for the info. I was unaware of the cluster-autoscaler add-on. After looking at it, it appears that you must manually set several cluster specific parameters, which are already maintained in the kops config on S3. It doesn't seem logical that we should be maintaining a min and max size of the ASG and the cluster-autoscaler add-on separately. As such, I think that kops should install and manage the configuration of this add-on by default.
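For context, the min and max that kops already tracks live on the InstanceGroup object in the state store. Something like the following is what `kops get ig nodes -o yaml` returns (values here are illustrative):

```yaml
# Illustrative InstanceGroup spec as stored by kops; values are placeholders.
apiVersion: kops/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
spec:
  role: Node
  machineType: t2.medium
  minSize: 2
  maxSize: 10
```

The overlap with the addon's `--nodes=MIN:MAX:ASG` flag is exactly the duplicated configuration being objected to here.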
@ese is right - we don't install the AWS autoscaling policies because the autoscaler is the more "correct" solution. The reason is that the kubernetes scheduler avoids overcommitting nodes, so you might never see excess CPU utilization even if your cluster had hundreds of unschedulable pods.
I agree that kops should integrate the autoscaler. I think we should first make a few changes in the autoscaler so that it picks up our ASGs automatically. I'm hoping it can be a simple addon at that point, and then I don't particularly mind whether we install it automatically or not. My inclination up until now has been not to, unless it is critical to bring-up. But I also want to make that distinction go away by making it easy to add addons to a kops installation (most of the machinery is there to do that already, to support things like the weave addon).
(I changed the title to incorporate the cluster-autoscaler option - hope that is OK!)
Sure, changing the title makes total sense. And it sounds like the plan is to build an easy way to include certain important addons, so that would certainly satisfy my request.
In terms of whether kops should deploy the cluster-autoscaler addon by default, I think this comes down to style. The main README.md states "kops lets you deploy production-grade, highly available, Kubernetes clusters from the command line." Personally, I would hope that it does this by default, out of the box, without the need to flip levers to make it happen. But I can also see why others might want just a bare-minimum cluster created. Addons such as Heapster and the Dashboard aren't required, so I can see why they would be off by default.
If it's not included, we should make sure that the documentation is crystal clear that this needs to be added in all production deployments to enable autoscaling.
Is there any update on this issue? I'm on k8s 1.6.0.
What was the final word on this? Is the add-on getting installed by default with kops? It seems not. I would put in my vote to have it installed by default, with a create-cluster parameter available to set max and min nodes.
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
/remove-lifecycle stale
This particular feature would be nice so that we're able to scale down our cluster outside of business hours in our test environments. I don't think cluster-autoscaler supports this.
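For that scheduled scale-down use case, plain AWS scheduled actions on the ASG work independently of cluster-autoscaler. A sketch of the idea follows; the ASG name, action names, sizes, and cron schedules are all placeholders:

```shell
# Sketch: scale the nodes ASG to zero on weekday evenings and back up each
# morning using AWS scheduled actions (all names and times are placeholders).
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name nodes.k8s.example.com \
  --scheduled-action-name scale-down-evenings \
  --recurrence "0 19 * * MON-FRI" \
  --min-size 0 --max-size 0 --desired-capacity 0

aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name nodes.k8s.example.com \
  --scheduled-action-name scale-up-mornings \
  --recurrence "0 7 * * MON-FRI" \
  --min-size 2 --max-size 10 --desired-capacity 2
```

One caveat: if cluster-autoscaler is also running against the same ASG, its configured min/max bounds would need to permit the scaled-down size, or the two mechanisms will fight each other.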
Seems a bit misleading to provide the option of maxSize and minSize without actually doing any scaling.
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.