To save money, it would be nice to stop the cluster when I'm not using it. When I manually terminated all my instances, the autoscaling groups brought everything back, and the cluster seemed to be working (API was working, guestbook app deployed to the cluster was working), so it seems like it wouldn't be a stability issue to stop and start the cluster.
Right now, to actually stop it, I have to edit all the autoscaling groups to have Min and Desired values of 0. Then I could probably restart things by resetting those values to whatever the Max values are for each ASG.
This could be implemented by having kops stop cluster set the values to 0 directly in the IaaS ASGs. kops update cluster will bring things back on its own because it will converge the IaaS ASG settings to the desired state configured in kops edit instancegroup:
$ kops update cluster
Using cluster from kubectl context: REDACTED
I1211 15:33:22.519823 61467 executor.go:68] Tasks: 0 done / 65 total; 30 can run
I1211 15:33:23.904160 61467 executor.go:68] Tasks: 30 done / 65 total; 13 can run
I1211 15:33:24.280864 61467 executor.go:68] Tasks: 43 done / 65 total; 18 can run
I1211 15:33:25.102993 61467 executor.go:68] Tasks: 61 done / 65 total; 4 can run
I1211 15:33:25.325865 61467 executor.go:68] Tasks: 65 done / 65 total; 0 can run
Will modify resources:
ManagedFile managedFile/REDACTED-addons-bootstrap
Contents <resource> -> <resource>
LaunchConfiguration launchConfiguration/master-us-east-1b.masters.REDACTED
UserData <resource> -> <resource>
LaunchConfiguration launchConfiguration/master-us-east-1a.masters.REDACTED
UserData <resource> -> <resource>
LaunchConfiguration launchConfiguration/nodes.REDACTED
UserData <resource> -> <resource>
LaunchConfiguration launchConfiguration/master-us-east-1d.masters.REDACTED
UserData <resource> -> <resource>
AutoscalingGroup autoscalingGroup/nodes.REDACTED
MinSize 0 -> 8
AutoscalingGroup autoscalingGroup/master-us-east-1b.masters.REDACTED
MinSize 0 -> 1
AutoscalingGroup autoscalingGroup/master-us-east-1d.masters.REDACTED
MinSize 0 -> 1
AutoscalingGroup autoscalingGroup/master-us-east-1a.masters.REDACTED
MinSize 0 -> 1
(The above is all AWS-specific, but would presumably work similarly on GCP).
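In the meantime, the manual zero-out could look something like this with the AWS CLI (a rough sketch; it assumes the CLI is configured for the right account, and the ASG names are just the ones from the output above):
# Set Min and Desired to 0 on each ASG. MaxSize is left alone; running
# "kops update cluster --yes" later restores MinSize, and AWS then brings
# Desired back up to the new minimum.
for asg in nodes.REDACTED \
           master-us-east-1a.masters.REDACTED \
           master-us-east-1b.masters.REDACTED \
           master-us-east-1d.masters.REDACTED; do
  aws autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "$asg" \
    --min-size 0 --desired-capacity 0
done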
Sounds great. We have an edit feature, but I am uncertain how it would impact instance groups. How about an edit command where you feed it the YAML that sets the IG to zero?
@chrislovecnm could you clarify what you're suggesting? A new edit command, or to just use the existing edit command, or to enhance the existing edit command so you can feed it a YAML file rather than interactively editing the IG config?
One UX issue with editing the IG is that you then have to explicitly set it back, which also means remembering what the original min/max/desired instance counts were.
Let's start with a use case. Can you write a short description of what you need?
I'm experimenting with kops/kubernetes. On the weekends, I want to spin up my cluster and play around with it, let some of my colleagues use it (eventually), etc. During the week I want to shut it down. I'd like a straightforward way to shut things down so that I'm not paying for anything during the week, and a straightforward way to bring everything back up on the weekend. It's alright to lose data for any pods I have running that don't have persistent volumes, but the data in etcd about which pods/RCs/etc. should be running must not be lost.
The easy way to do this, FYI, is to resize the ASGs in AWS but _not_ in kops (setting them to size 0). Normally that is the wrong thing to do (you should work with kops, not behind its back), but in this case it means that when you want to resume the cluster you just run kops update cluster --yes and it will restore the correct ASG sizes.
Yes, I do this by using kops to generate terraform, and I stop the cluster by modifying the terraform template to set the instance counts on the ASGs to 0. To restart, I re-run "update cluster" and "terraform apply". I use "terraform plan" to confirm the only changes are the min/max sizes in the ASGs.
Since I need to do other post-processing on the terraform templates, I think I actually like doing this myself in terraform rather than as a feature in kops itself.
Are there any plans to take this on as a feature in kops? We have a similar issue and would like to stop the cluster, not delete the instances within it.
Hi Amit Pivotal Labs, can you please let me know how you are even able to stop the instances? When I tried changing the instance counts in the Auto Scaling group, the instances were terminated by Kubernetes itself. I am not able to get them stopped or detached from the ASG.
Hi @beanspatil, I don't believe Kubernetes terminates instances.
I don't use kops to update my cluster directly, since I need to make custom changes to the terraform configs:
$ [ENV_VARS...] /path/to/kops update cluster \
--target=terraform \
--out=. \
[other_flags...]
To do anything to the cluster, I first generate the terraform configs as above, programmatically modify them, then run terraform apply.
To stop the cluster, part of my programmatic modification is to set the min_size and max_size to 0 in all instance groups, then run terraform apply.
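A rough sketch of what that zeroing step can look like, assuming kops wrote an HCL kubernetes.tf into the current directory (my real post-processing does more than this):
# Blindly zero every ASG min_size/max_size in the generated template.
sed -i.bak -E \
  -e 's/min_size[[:space:]]*=[[:space:]]*[0-9]+/min_size = 0/' \
  -e 's/max_size[[:space:]]*=[[:space:]]*[0-9]+/max_size = 0/' \
  kubernetes.tf

terraform plan    # confirm the only changes are the ASG sizes
terraform apply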
Hi Amit,
Thanks for your reply. The modification of min_size and max_size terminates the instances, which is not desirable. We need a way to stop the instances and start them when required.
Thanks,
Beena
@beanspatil can you explain why it isn't desirable to terminate the instances? When you scale back up, everything should recover again. I guess you lose any data in local storage or pods, but that data is ephemeral anyway.
I have done this by following these steps.
Step # 1 Get Nodes & Master Result for Group Names
root@worker:~# kops get ig
Using cluster from kubectl context: cluster.#######.local
NAME ROLE MACHINETYPE MIN MAX SUBNETS
master-us-east-1d Master m3.medium 1 1 us-east-1d
nodes Node t2.medium 0 0 us-east-1d
Step # 2 (Edit the Master IG Stored in the S3 Bucket: Set Min & Max Values to 0)
root@worker:~# kops edit ig master-us-east-1d
maxSize: 0
minSize: 0
Step # 3 (Set Nodes Min & Max to 0)
root@worker:~# kops edit ig nodes
maxSize: 0
minSize: 0
Step # 4 (Update the Cluster)
root@worker:~# kops update cluster --yes
Step # 5 (Roll your updates on cluster)
root@worker:~# kops rolling-update cluster
No need to go to AWS to stop all the resources manually :)
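To start the cluster again you edit the same IGs back to their original sizes and run kops update cluster --yes once more. A rough sketch of how to avoid losing track of those original values, assuming a kops version that has the get -o yaml and replace subcommands (file names are just an example):
# Before stopping: save the original instance group specs.
kops get ig master-us-east-1d -o yaml > master-us-east-1d-ig.yaml
kops get ig nodes -o yaml > nodes-ig.yaml

# Later, to start the cluster again: restore the saved specs and let kops
# converge the ASGs back to their original sizes.
kops replace -f master-us-east-1d-ig.yaml
kops replace -f nodes-ig.yaml
kops update cluster --yes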
I still believe a kops stop cluster command would be helpful, because with the ASG approach we have to set the counts manually, so essentially we are changing the configuration, not the state.
What if somebody had put some thought into the minSize and maxSize numbers, somebody else changes the counts to 0 to stop the cluster, and then a third person tries to start the cluster back up? They have no information about what to put back as minSize and maxSize.
Another thing is that changing the counts to 0 terminates the instances; it would be better if they were stopped, because creating an instance takes much longer than starting one.
Also, we may have added some configuration to the instances, and terminating them would mean redoing everything.