Kubeadm: e2e test the kubeadm v1.6->v1.7 upgrade route

Created on 17 Jun 2017 · 17 comments · Source: kubernetes/kubeadm

The title says it all.
This is probably a slightly harder one to figure out with kubernetes-anywhere, but I'd love to get it sorted out.

  1. Spawn a release-1.6 cluster
  2. Make sure all nodes joined correctly (all nodes in the Ready state)
  3. Upgrade to release-1.7
  4. Bonus: Test out kubeadm join again if we can
  5. Run all e2e tests
  6. Verify that the Node Authorizer is working after upgrade as well: https://github.com/kubernetes/kubeadm/issues/310
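
For concreteness, here is a minimal sketch of what steps 1-6 could look like from a driver script with access to the master. The version pins, the readiness check, and the `--skip-preflight-checks` flag are assumptions for illustration, not the final design:

```bash
#!/usr/bin/env bash
# Hypothetical driver for the plan above; versions and flags are placeholders.
set -euo pipefail

OLD_VERSION="v1.6.7"   # some release-1.6 build
NEW_VERSION="v1.7.0"   # some release-1.7 build

# 1. Spawn a release-1.6 cluster (master side; nodes join out of band).
kubeadm init --kubernetes-version "${OLD_VERSION}"

# 2. Make sure all nodes joined correctly (all nodes Ready).
if kubectl get nodes --no-headers | awk '$2 != "Ready" {bad=1} END {exit bad}'; then
  echo "all nodes Ready"
else
  echo "some nodes are not Ready" >&2
  exit 1
fi

# 3. Upgrade the control plane to release-1.7 ("init twice", discussed below;
#    --skip-preflight-checks because the master is already running).
kubeadm init --kubernetes-version "${NEW_VERSION}" --skip-preflight-checks

# 4. Bonus: re-run kubeadm join from a fresh node here, if we can.
# 5. Run the e2e suite against the upgraded cluster.
# 6. Verify the Node Authorizer still works (kubernetes/kubeadm#310).
```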

@pipejakob How does this sound? Easy/hard to implement? How are the other upgrade suites working?

area/testing help wanted priority/important-soon

All 17 comments

There are a wide variety of upgrade test suites: some upgrade just the master and run e2es, some run new e2es against old clusters (to verify compatibility), some run old e2es against newer clusters (testing compatibility in the other direction), and the newest ones (that @krousey added) run tests while upgrading the cluster.

I think that the last set will be the most difficult to port to kubernetes-anywhere, as I believe that they rely on the fact that we drain nodes before removing them from the cluster during an upgrade.

As an aside, if we could implement this automated test, it would pave the way for us to start replacing kube-up with kubernetes-anywhere+kubeadm in our upgrade test suites, which would be awesome.

@roberthbailey The testing logic that relies on draining is guarded by cloud provider. That is, it's only enabled for GKE and GCE right now.

Do the tests run and/or pass on other cloud providers? I'm guessing that we implicitly rely on the draining behavior for successful test runs.

I can't speak for how else we rely on GKE or GCE in the upgrade tests, but the upgrade tests do have a mode where they can run on a cluster upgrade that doesn't drain nodes.

I can certainly think of dirty ways to implement this in kubernetes-anywhere, but nothing particularly elegant comes to mind. The most flexible thing I can think of is adding a pseudo-phase or subcommand that is capable of running arbitrary commands on the master or nodes, which would have different implementations based on the phase1 provider. For GCE (the only cloud provider used by our current e2es), this would just run gcloud compute ssh $HOST --command $COMMAND, and I assume equivalents exist for AWS/Azure.

Then, we can run through the normal installation as-is, but after verification finishes, invoke the subcommand to push CLI invocations to the hosts to perform the upgrade steps. I don't think we'll have time in the 1.7 timeframe (unless there are volunteers?), and I would be fine relying on manual upgrade testing this cycle, but it would be nice to see this automated for 1.8.
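
For concreteness, a rough sketch of that pseudo-phase, assuming a hypothetical run_on_host helper that dispatches on the phase1 provider (only the GCE arm is shown; AWS/Azure equivalents would slot in alongside):

```bash
# Hypothetical pseudo-phase: run an arbitrary command on a cluster host.
# run_on_host, PHASE1_PROVIDER, and MASTER_HOST are illustrative names,
# not existing kubernetes-anywhere interfaces.
run_on_host() {
  local host="$1"; shift
  case "${PHASE1_PROVIDER:-gce}" in
    gce)
      gcloud compute ssh "${host}" --command "$*"
      ;;
    *)
      echo "run_on_host: provider '${PHASE1_PROVIDER}' not implemented" >&2
      return 1
      ;;
  esac
}

# After verification finishes, push the upgrade steps to the master:
run_on_host "${MASTER_HOST}" \
  "sudo kubeadm init --kubernetes-version ${K8S_UPGRADE_VERSION} --skip-preflight-checks"
```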

@pipejakob Agreed that we might want to target v1.8
Do we have docs on how GCE/GKE do this?

GCE does it via upgrade.sh. GKE does something similar (but probably more resilient to GCE API call failures).
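
For reference, the GCE flow is roughly the following (flags recalled from the script's usage text; double-check them against the version of cluster/gce/upgrade.sh you have checked out):

```bash
# In a kubernetes repo checkout: -M upgrades the master, -N the nodes.
cluster/gce/upgrade.sh -M v1.7.0
cluster/gce/upgrade.sh -N v1.7.0
```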

@fabriziopandini will work on this issue this cycle. @pipejakob If you could give any hints and help Fabrizio understand how the test-infra/kubernetes-anywhere integration works, that would be great.
(I'd also be very interested in hearing that)

I think this could be implemented by basically running kubeadm init twice, just with different versions:

```bash
kubeadm init --kubernetes-version ${K8S_VERSION} ..foo
if [[ -n "${K8S_UPGRADE_VERSION}" ]]; then
    kubeadm init --kubernetes-version ${K8S_UPGRADE_VERSION} ..foo
fi
```
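
One practical wrinkle worth calling out (an assumption to verify): re-running kubeadm init against a live master will trip the preflight checks, since the ports are already bound and the manifests already exist, so the second invocation would presumably need --skip-preflight-checks or equivalent.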

@pipejakob, @luxas looking forward to helping here.
I started writing down some ideas and a first list of things to be clarified in this doc;
any comments/help will be more than welcome

Thanks @fabriziopandini! I'll take a look. I also pinged some sig-testing folks....

Unfortunately we didn't have enough resources to tackle this in the v1.7 timeframe, and now, during the v1.8 cycle, we've been focusing on the v1.7->v1.8 upgrade route.
Since our v1.6->v1.7 upgrades were alpha, that was kind of okay anyway, but far from ideal.

We'll do better for the v1.7->v1.8 upgrade.

Is there a new issue for automated tests for 1.7 -> 1.8 upgrades? If not, I think it's worth keeping this one open. @jessicaochen is working on a solution to automate the kubeadm init way of upgrading clusters, with an eye on parameterizing the source/destination versions so that we can easily set up jobs to test 1.6.x -> 1.7.x and 1.7.x -> 1.8.x.
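
Assuming the versions are read from the environment as in the snippet above, parameterizing the source/destination could be as simple as the following (run-upgrade-test.sh is a hypothetical wrapper name):

```bash
# Hypothetical job matrix; the wrapper name and exact versions are made up.
K8S_VERSION=v1.6.7 K8S_UPGRADE_VERSION=v1.7.3 ./run-upgrade-test.sh   # 1.6.x -> 1.7.x
K8S_VERSION=v1.7.3 K8S_UPGRADE_VERSION=v1.8.0 ./run-upgrade-test.sh   # 1.7.x -> 1.8.x
```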

Yeah, I've now seen the proposal for automated upgrades @jessicaochen has been working on, and it seems like we have an owner for this :tada:

Is there a new issue for automated tests for 1.7 -> 1.8 upgrades?

Not yet, we should create one

Assigning you @jessicaochen, but please work on https://github.com/kubernetes/kubeadm/issues/402 primarily as that is more important.

Have PR to get the upgrade tests working: https://github.com/kubernetes/test-infra/pull/4732

Meanwhile, I will manually test the route to ensure we get some feedback for 1.8 release. Tracked by: https://github.com/kubernetes/kubeadm/issues/466

This work is now done AFAICS. Thanks @jessicaochen!
