Cluster-api: New release is breaking old clusterctl

Created on 14 Jul 2020 · 22 Comments · Source: kubernetes-sigs/cluster-api

What steps did you take and what happened:
Ran clusterctl init --infrastructure aws to install CAPI onto my kind cluster. Since the init succeeded, I ran clusterctl config ... | k apply -f- to create a cluster, but then got errors from the webhook controllers like:

Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "default.cluster.cluster.x-k8s.io": Post https://capi-webhook-service.capi-webhook-system.svc:443/mutate-cluster-x-k8s-io-v1alpha3-cluster?timeout=30s: dial tcp 10.102.79.72:443: connect: connection refused
Error from server (InternalError): error when creating "STDIN": Internal error occurred: failed calling webhook "default.machinedeployment.cluster.x-k8s.io": Post https://capi-webhook-service.capi-webhook-system.svc:443/mutate-cluster-x-k8s-io-v1alpha3-machinedeployment?timeout=30s: dial tcp 10.102.79.72:443: connect: connection refused

Saw that the capi-controller-manager deployment was failing in capi-webhook-system; these are the logs of the pod:

invalid argument "MachinePool=${EXP_MACHINE_POOL:=false},ClusterResourceSet=${EXP_CLUSTER_RESOURSE_SET:=false}" for "--feature-gates" flag: invalid value of MachinePool=${EXP_MACHINE_POOL:=false}, err: strconv.ParseBool: parsing "${EXP_MACHINE_POOL:=false}": invalid syntax
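The error in the log above can be reproduced outside the manager. The following is a minimal sketch, assuming the flag value is split on the first "=" and the right-hand side is handed to strconv.ParseBool, as the error text suggests. The parseGate helper is hypothetical and is not the manager's actual flag-parsing code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGate is a hypothetical helper: it splits a "Name=value" feature-gate
// entry on the first "=" and parses the value as a boolean.
func parseGate(entry string) (string, bool, error) {
	parts := strings.SplitN(entry, "=", 2)
	if len(parts) != 2 {
		return "", false, fmt.Errorf("missing '=' in %q", entry)
	}
	enabled, err := strconv.ParseBool(parts[1])
	if err != nil {
		return parts[0], false, err
	}
	return parts[0], enabled, nil
}

func main() {
	// The unexpanded template reference is not a valid boolean literal.
	_, _, err := parseGate("MachinePool=${EXP_MACHINE_POOL:=false}")
	fmt.Println("unexpanded:", err)

	// Once the reference has been replaced by its default, parsing succeeds.
	name, enabled, err := parseGate("MachinePool=false")
	fmt.Println("expanded:", name, enabled, err)
}
```

The point of the sketch is that the manager never sees a boolean: the ${...} reference reaches the flag parser verbatim, so the problem lies in whatever was supposed to expand the template before the yaml was applied.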

What did you expect to happen:
Components come up and work as before.

Anything else you would like to add:

clusterctl version: &version.Info{Major:"0", Minor:"3", GitVersion:"v0.3.6", GitCommit:"beff1e1df8e80d3a4fdb996c86409793ee491d47", GitTreeState:"clean", BuildDate:"2020-05-15T16:00:09Z", GoVersion:"go1.13.8", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Minikube/KIND version: kind v0.8.1 go1.14.2 linux/amd64
  • OS (e.g. from /etc/os-release): Linux

/kind bug
/area clusterctl

area/clusterctl kind/bug

All 22 comments

@benmoss: The label(s) area/ cannot be applied, because the repository doesn't have them

@fabriziopandini @wfernandes

We either have to revert the variable defaulting in the new yaml, or we say clusterctl v0.3.6 can't work with v0.3.7+ yaml. The former is probably better in the immediate term while we continue to think about ways to avoid this.

It might be worthwhile to add a check that disallows installing newer versions of Cluster API with an older clusterctl.
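A check like the one suggested above could look roughly like this. It is only a sketch under assumptions: parseVersion and checkCompatible are hypothetical names, versions are assumed to follow the vMAJOR.MINOR.PATCH form seen in this thread (with an optional suffix such as -dirty), and nothing here reflects how clusterctl actually handles versions.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion turns "v0.3.6" or "v0.3.2-dirty" into [major, minor, patch].
// Hypothetical helper; any pre-release or build suffix is ignored.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	v = strings.TrimPrefix(v, "v")
	if i := strings.IndexAny(v, "-+"); i >= 0 {
		v = v[:i]
	}
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("expected major.minor.patch, got %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid version component %q: %w", p, err)
		}
		out[i] = n
	}
	return out, nil
}

// checkCompatible returns an error when the provider release is newer than
// the CLI, comparing major, then minor, then patch.
func checkCompatible(cliVersion, providerVersion string) error {
	cli, err := parseVersion(cliVersion)
	if err != nil {
		return err
	}
	prov, err := parseVersion(providerVersion)
	if err != nil {
		return err
	}
	for i := range cli {
		if prov[i] > cli[i] {
			return fmt.Errorf("provider %s is newer than clusterctl %s; upgrade clusterctl first",
				providerVersion, cliVersion)
		}
		if prov[i] < cli[i] {
			return nil
		}
	}
	return nil
}

func main() {
	fmt.Println(checkCompatible("v0.3.6", "v0.3.7"))
	fmt.Println(checkCompatible("v0.3.8", "v0.3.8"))
}
```

A strict "CLI must be at least as new as the provider" rule is the simplest policy to sketch; a real check might instead allow a defined compatibility window.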

At some point (be it now or in the future), we'll want to lock the clusterctl feature set and expect it to work with future versions of the yaml (up until some point where we clearly define when we're allowed to make breaking changes).

Do we need to do a backport fix for clusterctl v0.3.6? Or should we update the docs?

I've updated the release notes for now. I'm not sure if there is any other action item that we want to take.

Looks like the yaml does need to be patched with https://github.com/kubernetes-sigs/cluster-api/pull/3341

Let's merge that in, I'll update the release components in the v0.3.7 release for now

The published artifacts have been fixed, ptal

In case other people run into this issue, adding this here for visibility:
From v0.3.7 release notes

Update to the latest released version of clusterctl before to install or upgrade component files, kubectl apply individual yaml files is not supported or suggested. Starting with this release (v0.3.7), clusterctl supports extended variable template substitutions, if you encounter any issues related to templating or environment variables update to the latest version or file an issue.
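To illustrate what "extended variable template substitutions" means for the ${EXP_MACHINE_POOL:=false} reference in this issue, here is a minimal sketch of an expander that understands both ${NAME} and ${NAME:=default}. The expand function and varPattern are hypothetical and written only for illustration; this is not clusterctl's implementation and covers only the one defaulting form that appears in this thread.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// varPattern matches ${NAME} and ${NAME:=default}.
var varPattern = regexp.MustCompile(`\$\{([A-Za-z_][A-Za-z0-9_]*)(?::=([^}]*))?\}`)

// expand replaces each reference with the value from vars. When the variable
// is unset or empty it falls back to the inline default; a reference with no
// default and no value is left untouched.
func expand(tmpl string, vars map[string]string) string {
	return varPattern.ReplaceAllStringFunc(tmpl, func(ref string) string {
		m := varPattern.FindStringSubmatch(ref)
		name, def := m[1], m[2]
		if v, ok := vars[name]; ok && v != "" {
			return v
		}
		if strings.Contains(ref, ":=") {
			return def
		}
		return ref
	})
}

func main() {
	gates := "MachinePool=${EXP_MACHINE_POOL:=false}"

	// No value supplied: the inline default is used.
	fmt.Println(expand(gates, map[string]string{}))

	// Value supplied: it overrides the default.
	fmt.Println(expand(gates, map[string]string{"EXP_MACHINE_POOL": "true"}))
}
```

A tool that only knows the plain ${NAME} form would pass the := reference through unchanged, which is consistent with the unexpanded string seen in the pod logs.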

Sounds like this is documented now, should we close this issue?

Sounds good, thanks @benmoss @wfernandes @CecileRobertMichon for bringing this up. A follow-up item would be how to better agree on changes like this in the future and avoid breaking changes between patch releases. Thankfully we're an alpha project and we'll use this experience to learn how to do better in the future. As @ncdc mentioned above, we need to lock in clusterctl features at some point, or make sure that the required dependencies are very explicit.

/close

@vincepri: Closing this issue.

/milestone v0.3.x

Can anyone confirm whether this should still be an issue with CAPI for kind? I'm still getting this error on my first run with:
clusterctl init --core cluster-api:v0.3.8 --bootstrap kubeadm:v0.3.8 --control-plane kubeadm:v0.3.8 --infrastructure docker:v0.3.0

Does anyone have any ideas? (First-timer here.)

@Emc1992 what are you seeing in the CAPI pod logs?

Looks to be the same as above:
invalid argument "MachinePool=${EXP_MACHINE_POOL:=false},ClusterResourceSet=${EXP_CLUSTER_RESOURCE_SET:=false}" for "--feature-gates" flag: invalid value of MachinePool=${EXP_MACHINE_POOL:=false}, err: strconv.ParseBool: parsing "${EXP_MACHINE_POOL:=false}": invalid syntax
Usage of /manager:

What does clusterctl version show? Is it also v0.3.8?

clusterctl version

Dur - think you've cracked it - must have been following an old guide and copy-pasted the wrong version. It's clusterctl version: &version.Info{Major:"0", Minor:"3", GitVersion:"v0.3.2-dirty" from:
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/download/v0.3.2/clusterctl-linux-amd64 -o clusterctl
I'll retry tomorrow - thanks

@Emc1992
From v0.3.7 release notes

Update to the latest released version of clusterctl before to install or upgrade component files, kubectl apply individual yaml files is not supported or suggested. Starting with this release (v0.3.7), clusterctl supports extended variable template substitutions, if you encounter any issues related to templating or environment variables update to the latest version or file an issue.

Yep, got past this issue, sorry for not updating here. Dealing with a vSphere-specific option now, but will check when I get a chance.
