Cluster-api: kubeadm bootstrap provider does not take into account Cluster APIServerPort

Created on 29 Dec 2019  ·  22 comments  ·  Source: kubernetes-sigs/cluster-api

What steps did you take and what happened:

  • Create a cluster with a non-default value for cluster.Spec.ClusterNetwork.APIServerPort (e.g. 6444 instead of 6443)
  • Create a control plane Node using the kubeadm bootstrap provider
  • The generated cloud-init ignores the APIServerPort setting and uses the default 6443 (or whatever is specified in that Node's KubeadmConfig).
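For reference, a Cluster manifest with a non-default API server port looks roughly like this (sketch only; names and the CIDR block are hypothetical, assuming the v1alpha2 API in use at the time):

```yaml
apiVersion: cluster.x-k8s.io/v1alpha2
kind: Cluster
metadata:
  name: example            # hypothetical name
spec:
  clusterNetwork:
    apiServerPort: 6444    # non-default port; currently ignored by the kubeadm bootstrap provider
    pods:
      cidrBlocks: ["192.168.0.0/16"]
```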

What did you expect to happen:

For the specified apiServer port to be used.

Anything else you would like to add:

I am not sure this is a bug, but it seems like strange behaviour: as an infrastructure provider I would need to parse the resulting cloud-init (or similar) to work out which port to forward connections to or health check. I would therefore expect the kubeadm bootstrap provider to override the default (or whatever is configured in the KubeadmConfig) with the value given by the cluster.

Environment:

  • Cluster-api version: 0.2.8
  • kubeadm bootstrap provider version: 0.1.5
  • Minikube/KIND version: None (using kubernetes infrastructure provider on GKE)
  • Kubernetes version: (use kubectl version): 1.16.3
  • OS (e.g. from /etc/os-release): Container-Optimized OS for the GKE cluster (kind images for cluster-api Nodes)

/kind bug

area/bootstrap kind/bug lifecycle/frozen

Most helpful comment

This should also take into account the spec.livenessProbe.port field. It defaults to 6443, and if you change APIServerPort you need to ensure spec.livenessProbe.port is updated as well or your container will just get endlessly restarted.

In fact, I just hit this bug and have had to resort to the hackiest of hacks to fix this:

spec:
  kubeadmConfigSpec:
    postKubeadmCommands:
    - chmod +x /tmp/kube-apiserver-quickfix.sh && /tmp/kube-apiserver-quickfix.sh
    files:
    - path: /tmp/kube-apiserver-overlay.yaml
      content: |
        apiVersion: v1
        kind: Pod
        spec:
          livenessProbe:
            httpGet:
              port: 443
    - path: /tmp/kube-apiserver-quickfix.sh
      content: |
        # Stop kube-apiserver and edit YAML
        mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/kube-apiserver.yaml
        # Patch the YAML, super hacky :).
        echo "Started patching kube-apiserver"
        cd /tmp && docker run --rm -v "${PWD}":/workdir mikefarah/yq yq m -i --overwrite --autocreate=false /tmp/kube-apiserver.yaml /tmp/kube-apiserver-overlay.yaml
        # Sleep to let container download and run
        sleep 30 && mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/kube-apiserver.yaml
        echo "Finished patching kube-apiserver"

All 22 comments

@dippynark It is generally expected that the infrastructure provider provides some type of persistent API endpoint for the cluster (for example, the AWS provider uses an ELB). This is the reason why the bootstrap provider doesn't currently take into account the value of cluster.Spec.ClusterNetwork.APIServerPort.

@detiber so APIServerPort should be the port exposed by the ELB in that scenario? cluster_types.go is saying it's the bind port.

Or you mean it doesn't matter that the bootstrap provider ignores it because the persistent API endpoint can use a different port if necessary?

We should probably take a closer look at this after the holidays.
/cc @ncdc @vincepri

/cc

Going forward, Cluster.Spec.ControlPlaneEndpoint.Port is what users should use to contact their cluster. This essentially means it is the port for the cluster's load balancer, assuming you have one.

I wonder if we still need Cluster.Spec.ClusterNetwork.APIServerPort?
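In manifest form, the endpoint described above (populated by a controller, not by the user) would look something like this sketch, with hypothetical values:

```yaml
spec:
  controlPlaneEndpoint:
    host: 203.0.113.10   # load balancer address, copied in from the infrastructure provider
    port: 6443           # port users should dial to reach the cluster
```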

Cluster.Spec.ControlPlaneEndpoint.Port is meant to be set only by a controller, given that we copy it from the infrastructure provider on each reconciliation loop.

The other variable was meant to give users a way to specify which port they wanted [the load balancer] to expose. That conversation happened a long time ago, but we can probably pin down the meaning in this release.

Cluster.Spec.ControlPlaneEndpoint.Port is meant to be set only by a controller, given that we copy it from the infrastructure provider on each reconciliation loop.

Long-term, my vision is that this doesn't come from a cluster infra provider, but instead comes from the MachineLoadBalancer. And there will be a place in the MLB spec for the user to specify the desired port. Which means we wouldn't need Cluster.Spec.ClusterNetwork.APIServerPort.

That's fair; until then, though, I'd not remove functionality that's being used / that we added after a bug report, unless there is an alternative ready. wdyt?

I'd generally agree, although I can't find any code that actually uses Cluster.Spec.ClusterNetwork.APIServerPort?

Nevermind, bad searching

So infrastructure providers _may_ use Cluster.Spec.ClusterNetwork.APIServerPort to inform which port to use for the provider's load balancer. Otherwise, this field is 100% unused by core cluster-api.

Correct, that's the current intent for this field

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale
@dippynark any updates or action items on this issue?

/area bootstrap

@vincepri IMO it makes sense to either remove the cluster.Spec.ClusterNetwork.APIServerPort field and always use the default 6443 serving port, or change the kubeadm bootstrap provider to respect this field (i.e. configuring the apiserver to serve on whatever is specified) as well as requiring all infrastructure providers to respect it (i.e. configuring the target port of the apiserver loadbalancer to be what is specified).

I'd prefer to add support for respecting the field, but that requires a contract amendment. We should have clearer documentation on what we do today.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/lifecycle frozen

/assign

To re-triage

/milestone v0.4.0
