What steps did you take and what happened:
What did you expect to happen:
I expect the KCP status to update, but I cannot find the method for external etcd, or my understanding is wrong.
Anything else you would like to add:
Environment:
kubectl version: v1.17.4

/kind bug
Hi @zanghao2, and thanks for filing this issue. The original implementation and proposal for KubeadmControlPlane was meant to be used only with stacked etcd.
From the proposal (it's listed as a non-goal):
"To manage etcd clusters in any topology other than stacked etcd (externally managed etcd clusters can still be leveraged)."
This requirement was meant to reduce the support scope for KCP and to allow the controller to fully manage a control plane's whole lifecycle. For example, today KCP during an upgrade makes sure to properly remove, add, and upgrade etcd members as new control plane machines join the cluster.
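For illustration, this is roughly what that member management looks like against the public etcd clientv3 API. This is a hedged sketch, not the actual KCP code; the endpoint is a placeholder and TLS setup is omitted:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	// Placeholder endpoint; KCP would dial the etcd members running on the
	// control plane machines (TLS configuration omitted for brevity).
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://etcd01.demo-cluster.test:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// A rollout has to reconcile membership directly: list the members,
	// then remove the one on the outgoing machine before deleting it.
	resp, err := cli.MemberList(ctx)
	if err != nil {
		log.Fatal(err)
	}
	for _, m := range resp.Members {
		fmt.Printf("member %x: %s\n", m.ID, m.Name)
	}
	// cli.MemberRemove(ctx, outgoingMemberID) would then drop that member.
}
```

It's exactly this kind of direct membership management that KCP avoids taking on for etcd clusters it doesn't own.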
I'm not sure if we have any plans to support external etcd deployments in the future; we've talked about using etcdadm in the past to avoid managing etcd directly, but we'll need an extensive design to start the discussion.
Hope this helps, cc @detiber and @randomvariable as well
There may be a bug in KCP attempting to check the etcd status even when it shouldn't.
@zanghao2, could you provide a sample of your KubeadmControlPlane configuration? Ideally, if you can also get the logs of the Cluster API controllers, that would help.
@vincepri we never intended to remove support for external etcd when it is specified in the kubeadm configuration that is passed to KCP; we likely just need to short-circuit some of the health checking that we do when external etcd is configured.
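A minimal sketch of that short-circuit, with stand-in types (the real ones live in the kubeadm API packages) and a hypothetical helper name:

```go
package main

import "fmt"

// Stand-ins for the kubeadm v1beta1 API types; illustrative only.
type ExternalEtcd struct{ Endpoints []string }
type Etcd struct{ External *ExternalEtcd }
type ClusterConfiguration struct{ Etcd Etcd }

// usingExternalEtcd is a hypothetical helper: when the kubeadm config
// declares external etcd, KCP should skip its stacked-etcd member and
// pod health checks, since it does not manage that cluster's lifecycle.
func usingExternalEtcd(cfg *ClusterConfiguration) bool {
	return cfg != nil && cfg.Etcd.External != nil
}

func main() {
	cfg := &ClusterConfiguration{
		Etcd: Etcd{External: &ExternalEtcd{
			Endpoints: []string{"https://etcd01.demo-cluster.test:2379"},
		}},
	}
	if usingExternalEtcd(cfg) {
		fmt.Println("external etcd configured: skipping etcd health checks")
	}
}
```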
Might be worthwhile to amend the proposal with some clarification; it wasn't immediately clear that this is/should be supported.
@vincepri It's called out specifically in the goals section: "To support pre-existing, user-managed, external etcd clusters", and the scenario is also called out in the behavioral sections as it applies to the various operations
Got it, the non-goal threw me off, but I guess it does say managed :D
/help
/milestone v0.3.x
@randomvariable My KCP configuration is as follows:
```yaml
kind: KubeadmControlPlane
metadata:
  name: account-test-control-plane
  namespace: cluster-test
spec:
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSMachineTemplate
    name: account-test-control-plane
  kubeadmConfigSpec:
    clusterConfiguration:
      etcd:
        external:
          endpoints:
            - https://etcd01.demo-cluster.test:2379
            - https://etcd02.demo-cluster.test:2379
            - https://etcd03.demo-cluster.test:2379
          caFile: /etc/kubernetes/pki/ca.pem
          certFile: /etc/kubernetes/pki/etcd.pem
          keyFile: /etc/kubernetes/pki/etcd-key.pem
      certificatesDir: /etc/kubernetes/pki
      imageRepository: google_containers
      dns:
        type: CoreDNS
      apiServer:
        extraArgs:
          service-account-signing-key-file: /etc/kubernetes/pki/sa.key
          service-account-issuer: kubernetes.default.svc
          service-account-key-file: /etc/kubernetes/pki/sa.pub
          cloud-provider: aws
          feature-gates: "APIResponseCompression=true,DynamicAuditing=true,LocalStorageCapacityIsolationFSQuotaMonitoring=true,QOSReserved=true,SCTPSupport=true,ServiceNodeExclusion=true,BoundServiceAccountTokenVolume=true,NonPreemptingPriority=true,BalanceAttachedNodeVolumes=true,APIPriorityAndFairness=true"
          authorization-mode: "Node,RBAC"
          runtime-config: authentication.k8s.io/v1beta1=true
      controllerManager:
        extraArgs:
          feature-gates: "APIResponseCompression=true,DynamicAuditing=true,LocalStorageCapacityIsolationFSQuotaMonitoring=true,QOSReserved=true,SCTPSupport=true,ServiceNodeExclusion=true,BoundServiceAccountTokenVolume=true,NonPreemptingPriority=true,BalanceAttachedNodeVolumes=true,APIPriorityAndFairness=true"
          cloud-provider: aws
          service-account-private-key-file: /etc/kubernetes/pki/sa.key
          flex-volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
      scheduler:
        extraArgs:
          feature-gates: "APIResponseCompression=true,DynamicAuditing=true,LocalStorageCapacityIsolationFSQuotaMonitoring=true,QOSReserved=true,SCTPSupport=true,ServiceNodeExclusion=true,BoundServiceAccountTokenVolume=true,NonPreemptingPriority=true,BalanceAttachedNodeVolumes=true,APIPriorityAndFairness=true"
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: aws
          volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
          feature-gates: "APIResponseCompression=true,DynamicAuditing=true,LocalStorageCapacityIsolationFSQuotaMonitoring=true,QOSReserved=true,SCTPSupport=true,ServiceNodeExclusion=true,BoundServiceAccountTokenVolume=true,NonPreemptingPriority=true,BalanceAttachedNodeVolumes=true,APIPriorityAndFairness=true"
    joinConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: aws
          volume-plugin-dir: "/opt/libexec/kubernetes/kubelet-plugins/volume/exec/"
          feature-gates: "APIResponseCompression=true,DynamicAuditing=true,LocalStorageCapacityIsolationFSQuotaMonitoring=true,QOSReserved=true,SCTPSupport=true,ServiceNodeExclusion=true,BoundServiceAccountTokenVolume=true,NonPreemptingPriority=true,BalanceAttachedNodeVolumes=true,APIPriorityAndFairness=true"
  replicas: 1
  version: v1.15.3
```
The CLUSTER_NAME-etcd secret only contains tls.crt. When KCP checks its status I get the log "etcd tls key does not exist for cluster cluster-demo cluster-test". I checked the code in GetWorkloadCluster: it tries to treat external etcd like stacked etcd. Even if I add the corresponding tls.key, when I change the replicas of the KCP I get an error, because the method scaleUpControlPlane uses the stacked-etcd logic to check the status of etcd and tries to get the etcd pods. I think it may be necessary to add new methods to handle the logic of external etcd.
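To make the failure concrete, this is a sketch (hypothetical helper, not the actual KCP code) of the kind of key-pair requirement that trips when the secret carries only the certificate:

```go
package main

import "fmt"

// hasKeyPair reports whether a certificate secret carries both halves of a
// key pair. For external etcd, users typically only have the CA certificate
// (tls.crt), not its private key, so a hard requirement on tls.key fails.
func hasKeyPair(secretData map[string][]byte) bool {
	_, hasCrt := secretData["tls.crt"]
	_, hasKey := secretData["tls.key"]
	return hasCrt && hasKey
}

func main() {
	// Shape of the reported <cluster-name>-etcd secret: certificate only.
	etcdSecret := map[string][]byte{"tls.crt": []byte("PEM...")}
	if !hasKeyPair(etcdSecret) {
		fmt.Println("etcd tls key does not exist") // the error reported above
	}
}
```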
@zanghao2 I'm still making up my mind about this issue, but I was expecting the etcd keys to be passed as Files in the kubeadmConfigSpec; otherwise, I don't see how those files get placed onto the machine.
Could you kindly help me clarify this point?
Ignore my previous comment; the bootstrap provider takes care of injecting the files onto the machines, starting from the secrets.
/assign
/lifecycle active
@fabriziopandini Thanks for your reply. I have fixed the code for external etcd, but I do not know the plan for external etcd. In my opinion, we should not care about or check the status of external etcd; I just check ca.crt, client.crt, and client.key, but this idea may not be in line with the idea of cluster-api.
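Sketched out, that idea replaces the etcd member/pod checks with a presence check on the certificate material. The function below is hypothetical, and it assumes the CA lives in the <cluster-name>-etcd secret and the client pair in a companion secret:

```go
package main

import (
	"errors"
	"fmt"
)

// checkExternalEtcdPrereqs is a hypothetical take on the approach above:
// instead of contacting etcd or listing etcd pods, only verify that the CA
// and client certificate material needed by the API server is present.
func checkExternalEtcdPrereqs(caSecret, clientSecret map[string][]byte) error {
	if _, ok := caSecret["tls.crt"]; !ok {
		return errors.New("external etcd CA certificate is missing")
	}
	// The client pair is what the kube-apiserver uses to reach etcd.
	if _, ok := clientSecret["tls.crt"]; !ok {
		return errors.New("external etcd client certificate is missing")
	}
	if _, ok := clientSecret["tls.key"]; !ok {
		return errors.New("external etcd client key is missing")
	}
	return nil
}

func main() {
	ca := map[string][]byte{"tls.crt": []byte("PEM...")}
	client := map[string][]byte{
		"tls.crt": []byte("PEM..."),
		"tls.key": []byte("PEM..."),
	}
	if err := checkExternalEtcdPrereqs(ca, client); err != nil {
		fmt.Println("prereq check failed:", err)
		return
	}
	fmt.Println("external etcd prerequisites satisfied")
}
```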
@zanghao2 PTAL at the related PR; it would be great to have your opinion on the proposed changes.
@fabriziopandini Thanks, your code logic is consistent with mine. I hope this PR can be merged.