Original discussion: https://kubernetes.slack.com/archives/C8TSNPY4T/p1601405307103200
Following https://github.com/kubernetes-sigs/cluster-api/issues/168, ControlPlaneEndpoint was added as a required Spec field in InfrastructureCluster for v1alpha3: https://cluster-api.sigs.k8s.io/developer/architecture/controllers/cluster.html#infrastructure-provider.
However, multiple infrastructure providers treat this field as status. For example, in CAPA, ControlPlaneEndpoint.Host is set to match the LB DNS name in the Status.Network field (https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/master/controllers/awscluster_controller.go#L206), which means that any value you set as a user is ignored and overwritten.
Furthermore, it is difficult for an infrastructure provider to use controlPlaneEndpoint to support configuring the API Server LB, or even BYO control plane IP, because the field doesn't carry enough infra details. That means we end up with a very awkward user API where either the DNS name info is duplicated (like https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/652/files#diff-61d5bf70dea2fc23b9a923ab3bbd84daR156), or the DNS name is in a different place than the Public IP / LB spec.
The proposal is to change the infrastructure provider contract in v1alpha4 to make controlPlaneEndpoint a required _status_ field in the InfrastructureCluster object. This allows infrastructure providers to report the Cluster's control plane endpoint to Cluster API while leaving it up to each infra provider to decide if/how to make the API Server endpoint configurable by the user, support BYO LB/IP, etc.
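To make the proposed shape concrete, here is a rough sketch of what an infrastructure provider's types could look like with the endpoint reported in status. `FooCluster`, `FooClusterSpec`, `FooClusterStatus`, the `APIServerLBDNSName` field and the `reportEndpoint` helper are all hypothetical names made up for illustration. They are not part of any existing provider or of the Cluster API contract.

```go
package main

import "fmt"

// APIEndpoint mirrors the existing Cluster API type; it is repeated here
// only so that the sketch compiles on its own.
type APIEndpoint struct {
	Host string `json:"host"`
	Port int32  `json:"port"`
}

// FooClusterSpec is a hypothetical provider spec. Each provider would be
// free to decide which load balancer or BYO settings it exposes here.
type FooClusterSpec struct {
	// APIServerLBDNSName is a hypothetical, provider-specific setting.
	APIServerLBDNSName string `json:"apiServerLBDNSName,omitempty"`
}

// FooClusterStatus carries the endpoint as an observed value, which is
// the change suggested in this issue.
type FooClusterStatus struct {
	Ready                bool        `json:"ready"`
	ControlPlaneEndpoint APIEndpoint `json:"controlPlaneEndpoint"`
}

// FooCluster is a hypothetical InfrastructureCluster implementation.
type FooCluster struct {
	Spec   FooClusterSpec   `json:"spec,omitempty"`
	Status FooClusterStatus `json:"status,omitempty"`
}

// reportEndpoint is a hypothetical helper: once the provider knows the
// address of the API server load balancer, it records it in status.
func reportEndpoint(c *FooCluster, host string, port int32) {
	c.Status.ControlPlaneEndpoint = APIEndpoint{Host: host, Port: port}
	c.Status.Ready = true
}

func main() {
	c := &FooCluster{Spec: FooClusterSpec{APIServerLBDNSName: "api.example.com"}}
	// In this sketch the provider simply echoes the user-supplied name;
	// a real provider might instead use the DNS name of an LB it created.
	reportEndpoint(c, c.Spec.APIServerLBDNSName, 6443)
	fmt.Printf("%s:%d\n", c.Status.ControlPlaneEndpoint.Host, c.Status.ControlPlaneEndpoint.Port)
}
```

With this split, user intent lives in whatever spec fields the provider chooses, and Cluster API only ever reads the status value.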
The API type of ControlPlaneEndpoint should not need to change, that is:
```go
// APIEndpoint represents a reachable Kubernetes API endpoint.
type APIEndpoint struct {
	// The hostname on which the API server is serving.
	Host string `json:"host"`
	// The port on which the API server is serving.
	Port int32 `json:"port"`
}
```
The same requirements should still apply, that is:
- MUST route API requests to a live kube-apiserver replica, if one is available.
- MUST be defined before the first control plane replica is created, so that it is added to the SubjectAltNames of the kube-apiserver server certificate.
- MUST stay the same for the entire life of the cluster.
- MUST be a DNS record, or an IP.
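Some of these requirements could be checked mechanically. The sketch below is only an illustration: `validateEndpoint`, `validateUnchanged` and `isDNSName` are hypothetical helpers, not existing Cluster API functions, and the DNS check is a simplified hand-rolled approximation. The first requirement (routing to a live kube-apiserver) is a runtime property and is not covered.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"strings"
)

// APIEndpoint mirrors the type quoted above so the sketch is self-contained.
type APIEndpoint struct {
	Host string `json:"host"`
	Port int32  `json:"port"`
}

// isDNSName is a simplified hostname check: dot-separated labels of
// 1 to 63 letters, digits or hyphens, not starting or ending with a hyphen.
func isDNSName(host string) bool {
	if len(host) == 0 || len(host) > 253 {
		return false
	}
	for _, label := range strings.Split(strings.ToLower(host), ".") {
		if len(label) == 0 || len(label) > 63 {
			return false
		}
		if label[0] == '-' || label[len(label)-1] == '-' {
			return false
		}
		for _, r := range label {
			if !((r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') || r == '-') {
				return false
			}
		}
	}
	return true
}

// validateEndpoint checks that the endpoint is defined and that the host
// is an IP or looks like a DNS name.
func validateEndpoint(e APIEndpoint) error {
	if e.Host == "" {
		return errors.New("host must be defined before the first control plane replica is created")
	}
	if e.Port < 1 || e.Port > 65535 {
		return fmt.Errorf("port %d is out of range", e.Port)
	}
	if net.ParseIP(e.Host) != nil || isDNSName(e.Host) {
		return nil
	}
	return fmt.Errorf("host %q is neither an IP nor a DNS name", e.Host)
}

// validateUnchanged rejects a change to an endpoint that was already set.
func validateUnchanged(old, updated APIEndpoint) error {
	if old != (APIEndpoint{}) && old != updated {
		return errors.New("control plane endpoint must stay the same for the life of the cluster")
	}
	return nil
}

func main() {
	first := APIEndpoint{Host: "api.example.com", Port: 6443}
	fmt.Println(validateEndpoint(first))
	fmt.Println(validateUnchanged(first, APIEndpoint{Host: "10.0.0.1", Port: 6443}))
}
```

Where such checks would run (a webhook, the provider controller, or Cluster API itself) is left open here.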
/kind feature
/kind proposal
/milestone v0.4.0
It may be worthwhile to revisit #1250 when considering this
cc @alexeldeib
@ncdc I'm not sure how this proposal affects a Load Balancer provider, I think that would be an implementation detail. The controlPlaneEndpoint type would not change and CAPI shouldn't care where that host name/IP is coming from, correct?
I was thinking bigger picture, like potentially removing the control plane endpoint entirely from InfraCluster, and replacing it with a load balancer (referenced by the Cluster itself). Cluster/Machine infra providers could also implement load balancing, but you could mix & match, so the assumption that an InfraCluster always has a control plane endpoint might not hold in the long run. But this is a much bigger effort than moving the InfraCluster controlPlaneEndpoint from spec to status.
cc @detiber since you're looking into Load Balancer Provider. Not sure if we should include this as part of the LB provider work or do both separately. Probably the former?
/lifecycle frozen
given that this is related to the LB work