Kops: Does upgrading a cluster break running services?

Created on 6 Dec 2018 · 7 Comments · Source: kubernetes/kops

Hello,

I have several k8s clusters in a production environment that were set up with kops, including v1.9.8, 1.10.3, etc.

Because of the recent bug (CVE-2018-1002105), I plan to upgrade them with kops. My question is: will the upgrade process break the services running on them?

What is the upgrade policy?

Thanks!

wwyhy

Most helpful comment

@wwyhy
Upgrades are generally no problem. I have one production cluster with currently ~100 active customer websites deployed. To keep your "services" reachable, make sure you deploy them with at least two replicas. By default, kops updates one node at a time, so the workloads are moved off (the node is drained) before the upgrade happens. Normally, most pods are redeployed on another node after a few seconds of downtime (if not already replicated).

To perform a cluster upgrade I normally do the following:

Backup

Create a backup with ark (ark backup create BACKUPNAME)
Wait for it to finish (ark backup describe BACKUPNAME, and watch the EBS snapshots)
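
The backup step above could be wrapped in a small shell function. This is only a minimal sketch, assuming Ark is already installed and configured for the cluster; the names `run`, `backup_cluster`, and the `DRY_RUN` switch are hypothetical, and by default the commands are printed rather than executed.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Create an Ark backup, then show its status.
# $1 is the backup name (BACKUPNAME in the steps above).
backup_cluster() {
  name="$1"
  run ark backup create "$name" || return 1
  # Re-run the describe step until the backup is reported as finished.
  # EBS snapshot progress has to be watched separately in AWS.
  run ark backup describe "$name"
}
```

With `DRY_RUN=0 backup_cluster pre-upgrade` the two commands are actually run; `pre-upgrade` is just an example name.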

Update state

Run kops upgrade cluster followed by kops update cluster; if needed, manually edit the Kubernetes version with kops edit cluster

Apply changes to State-store / AWS

kops update cluster --yes
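
The two steps above (update the state, then apply it) might look like this as a script. It is a sketch under assumptions: the cluster name is passed as an argument, `KOPS_STATE_STORE` is already exported, and the `run` helper, `upgrade_state` function, and `DRY_RUN` switch are hypothetical names. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 is the cluster name. KOPS_STATE_STORE must point at the state store.
upgrade_state() {
  cluster="$1"
  # Preview the version change kops proposes, then write it to the cluster spec.
  run kops upgrade cluster --name "$cluster" || return 1
  run kops upgrade cluster --name "$cluster" --yes || return 1
  # To pin a specific version instead, set kubernetesVersion with
  # `kops edit cluster` before continuing.
  # Preview the resulting cloud changes, then apply them.
  run kops update cluster --name "$cluster" || return 1
  run kops update cluster --name "$cluster" --yes
}
```

Running each command once without `--yes` first gives a chance to review what kops intends to change before anything is applied.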

Rolling Update

I currently have my own strategy for the rolling update, used for roughly the last 10 updates of my production cluster. I first update only the master instance group, watch the cluster validation, and fix any errors. If the upgrade does not work, I can roll back (kops edit cluster, change the version, etc.) and test again.

The benefit of this approach is that the complete workload (all running nodes and pods) is not affected by the master upgrade. So I can be sure that the master is running correctly before I upgrade the nodes.

kops rolling-update cluster --instance-group=MASTER-IG

Apply:

kops rolling-update cluster --instance-group=MASTER-IG --yes

Update Nodes

Now it is safe to run a "regular" rolling-update without a specified instance group
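
The master-first sequence described above, followed by the regular rolling update, could be scripted roughly as follows. This is a sketch, not a tested procedure: the cluster name and master instance group name are arguments you supply, the validation step uses `kops validate cluster`, and `run`, `rolling_update_masters_first`, and `DRY_RUN` are hypothetical names. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# execute it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 is the cluster name, $2 the master instance group (MASTER-IG above).
rolling_update_masters_first() {
  cluster="$1"
  master_ig="$2"
  # Preview, then roll only the master instance group.
  run kops rolling-update cluster --name "$cluster" --instance-group "$master_ig" || return 1
  run kops rolling-update cluster --name "$cluster" --instance-group "$master_ig" --yes || return 1
  # Stop here if the cluster does not validate with the new master.
  run kops validate cluster --name "$cluster" || return 1
  # Roll the remaining instance groups.
  run kops rolling-update cluster --name "$cluster" --yes
}
```

Because every step returns early on failure, the nodes are only touched once the master update and the validation have both succeeded.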

dhemeier on 6 Dec 2018

👍3

All 7 comments

@wwyhy
It depends on what kind of services are running in your cluster.
I think the most dangerous part is moving a stateful deployment with an EBS volume attached from one node to another.

The rolling-update policy is simply to shut down the old instance and start a new one with the new configuration.

cychiang on 6 Dec 2018

Is it one by one (nodes and masters), or does it shut everything down and start everything again?

wwyhy on 6 Dec 2018

@wwyhy
It's one by one, starting with the masters.

cychiang on 6 Dec 2018

I have upgraded a cluster from 1.9 to 1.10 and the services were not affected.
But you must make sure your replica count is at least 2.
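
One way to check the replica advice before upgrading is to list deployments that run fewer than two replicas. The filter below is a minimal sketch; the function name `single_replica_deployments` is hypothetical, and it expects lines of the form "namespace name replicas" on stdin.

```shell
#!/bin/sh
# Reads "namespace name replicas" lines on stdin and prints the
# deployments that have fewer than two replicas.
single_replica_deployments() {
  awk '$3 < 2 { print $1 "/" $2 " replicas=" $3 }'
}

# Against a live cluster (assumes kubectl is configured for it):
# kubectl get deployments --all-namespaces --no-headers \
#   -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,REPLICAS:.spec.replicas \
#   | single_replica_deployments
```

Anything it prints is a candidate for a short outage while its node is drained.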

rj03hou on 6 Dec 2018

Great responses everyone! I'm going to close this as it seems resolved. @wwyhy please feel free to ping any of us if you have any questions!

/close

mikesplain on 10 Dec 2018
