Kubeadm: e2e test the kubeadm v1.6->v1.7 upgrade route

Created on 17 Jun 2017 · 17 comments · Source: kubernetes/kubeadm

The title says it all.
This is probably a slightly harder one to figure out with kubernetes-anywhere, but I'd love to get it sorted out.

  1. Spawn a release-1.6 cluster
  2. Make sure all nodes joined correctly (all nodes in the Ready state)
  3. Upgrade to release-1.7
  4. Bonus: Test out kubeadm join again if we can
  5. Run all e2e tests
  6. Verify that the Node Authorizer is working after upgrade as well: https://github.com/kubernetes/kubeadm/issues/310
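
For concreteness, here is a minimal sketch of what steps 1-6 could look like from a driver script with access to the master. The version pins, the readiness check, and the `--skip-preflight-checks` flag are assumptions for illustration, not the final design:

```bash
#!/usr/bin/env bash
# Hypothetical driver for the plan above; versions and flags are placeholders.
set -euo pipefail

OLD_VERSION="v1.6.7"   # some release-1.6 build
NEW_VERSION="v1.7.0"   # some release-1.7 build

# 1. Spawn a release-1.6 cluster (master side; nodes join out of band).
kubeadm init --kubernetes-version "${OLD_VERSION}"

# 2. Make sure all nodes joined correctly (all nodes Ready).
if kubectl get nodes --no-headers | awk '$2 != "Ready" {bad=1} END {exit bad}'; then
  echo "all nodes Ready"
else
  echo "some nodes are not Ready" >&2
  exit 1
fi

# 3. Upgrade the control plane to release-1.7 ("init twice", discussed below;
#    --skip-preflight-checks because the master is already running).
kubeadm init --kubernetes-version "${NEW_VERSION}" --skip-preflight-checks

# 4. Bonus: re-run kubeadm join from a fresh node here, if we can.
# 5. Run the e2e suite against the upgraded cluster.
# 6. Verify the Node Authorizer still works (kubernetes/kubeadm#310).
```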

@pipejakob How does this sound? Easy/hard to implement? How are the other upgrade suites working?

area/testing help wanted priority/important-soon

All 17 comments

There are a wide variety of upgrade test suites: some upgrade just the master and run e2es, some run new e2es against old clusters (to verify compatibility), some run old e2es against newer clusters (testing compatibility in the other direction), and the newest ones (that @krousey added) run tests while upgrading the cluster.

I think that the last set will be the most difficult to port to kubernetes-anywhere, as I believe that they rely on the fact that we drain nodes before removing them from the cluster during an upgrade.

As an aside, if we could implement this automated test, it would pave the way for us to start replacing kube-up with kubernetes-anywhere+kubeadm in our upgrade test suites, which would be awesome.

@roberthbailey The testing logic that relies on draining is guarded by cloud provider. That is, it's only enabled for GKE and GCE right now.

Do the tests run and/or pass on other cloud providers? I'm guessing that we implicitly rely on the draining behavior for successful test runs.

I can't speak for how else we rely on GKE or GCE in the upgrade tests, but the upgrade tests do have a mode where they can run on a cluster upgrade that doesn't drain nodes.

I can certainly think of dirty ways to implement this in kubernetes-anywhere, but nothing particularly elegant comes to mind. The most flexible thing I can think of is adding a pseudo-phase or subcommand that is capable of running arbitrary commands on the master or nodes, which would have different implementations based on the phase1 provider. For GCE (the only cloud provider used by our current e2es), this would just run gcloud compute ssh $HOST --command $COMMAND, and I assume equivalents exist for AWS/Azure.

Then, we can run through the normal installation as-is, but after verification finishes, invoke the subcommand to push CLI invocations to the hosts to perform the upgrade steps. I don't think we'll have time in the 1.7 timeframe (unless there are volunteers?), and I would be fine relying on manual upgrade testing this cycle, but it would be nice to see this automated for 1.8.
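
For concreteness, a rough sketch of that pseudo-phase, assuming a hypothetical run_on_host helper that dispatches on the phase1 provider (only the GCE arm is shown; AWS/Azure equivalents would slot in alongside):

```bash
# Hypothetical pseudo-phase: run an arbitrary command on a cluster host.
# run_on_host, PHASE1_PROVIDER, and MASTER_HOST are illustrative names,
# not existing kubernetes-anywhere interfaces.
run_on_host() {
  local host="$1"; shift
  case "${PHASE1_PROVIDER:-gce}" in
    gce)
      gcloud compute ssh "${host}" --command "$*"
      ;;
    *)
      echo "run_on_host: provider '${PHASE1_PROVIDER}' not implemented" >&2
      return 1
      ;;
  esac
}

# After verification finishes, push the upgrade steps to the master:
run_on_host "${MASTER_HOST}" \
  "sudo kubeadm init --kubernetes-version ${K8S_UPGRADE_VERSION} --skip-preflight-checks"
```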

@pipejakob Agreed that we might want to target v1.8
Do we have docs on how GCE/GKE do this?

GCE does it via upgrade.sh. GKE does something similar (but probably more resilient to GCE API call failures).
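
For reference, the GCE flow is roughly the following (flags recalled from the script's usage text; double-check them against the version of cluster/gce/upgrade.sh you have checked out):

```bash
# In a kubernetes repo checkout: -M upgrades the master, -N the nodes.
cluster/gce/upgrade.sh -M v1.7.0
cluster/gce/upgrade.sh -N v1.7.0
```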

@fabriziopandini will work on this issue this cycle. @pipejakob If you could give any hints and help Fabrizio understand how the test-infra/kubernetes-anywhere integration works, that would be great.
(I'd also be very interested in hearing that)

I think this could be implemented by basically running kubeadm init twice, just with different versions:

```bash
kubeadm init --kubernetes-version ${K8S_VERSION} ..foo
if [[ -n "${K8S_UPGRADE_VERSION}" ]]; then
    kubeadm init --kubernetes-version ${K8S_UPGRADE_VERSION} ..foo
fi
```
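
One practical wrinkle worth calling out (an assumption to verify): re-running kubeadm init against a live master will trip the preflight checks, since the ports are already bound and the manifests already exist, so the second invocation would presumably need --skip-preflight-checks or equivalent.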

@pipejakob, @luxas looking forward to helping here.
I started writing down some ideas and a first list of things to be clarified in this doc;
any comments/help will be more than welcome

Thanks @fabriziopandini! I'll take a look. I also pinged some sig-testing folks....

Unfortunately we didn't have enough resources to tackle this in the v1.7 timeframe, and now, during the v1.8 cycle, we've been focusing on the v1.7->v1.8 upgrade route.
Since our v1.6->v1.7 upgrades were alpha, that was kind of okay anyway, but far from ideal.

We'll do better for the v1.7->v1.8 upgrade.

Is there a new issue for automated tests for 1.7 -> 1.8 upgrades? If not, I think it's worth keeping this one open. @jessicaochen is working on a solution to automate the kubeadm init way of upgrading clusters, with an eye on parameterizing the source/destination versions so that we can easily set up jobs to test 1.6.x -> 1.7.x and 1.7.x -> 1.8.x.
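
Assuming the versions are read from the environment as in the snippet above, parameterizing the source/destination could be as simple as the following (run-upgrade-test.sh is a hypothetical wrapper name):

```bash
# Hypothetical job matrix; the wrapper name and exact versions are made up.
K8S_VERSION=v1.6.7 K8S_UPGRADE_VERSION=v1.7.3 ./run-upgrade-test.sh   # 1.6.x -> 1.7.x
K8S_VERSION=v1.7.3 K8S_UPGRADE_VERSION=v1.8.0 ./run-upgrade-test.sh   # 1.7.x -> 1.8.x
```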

Yeah, I've now seen the proposal for automated upgrades @jessicaochen has been working on, and it seems like we have an owner for this :tada:

Is there a new issue for automated tests for 1.7 -> 1.8 upgrades?

Not yet, we should create one

Assigning you @jessicaochen, but please work on https://github.com/kubernetes/kubeadm/issues/402 primarily as that is more important.

Have PR to get the upgrade tests working: https://github.com/kubernetes/test-infra/pull/4732

Meanwhile, I will manually test the route to ensure we get some feedback for 1.8 release. Tracked by: https://github.com/kubernetes/kubeadm/issues/466

This work is now done AFAICS. Thanks @jessicaochen!
