/kind bug
What steps did you take and what happened:
I supplied the contents of the apiserver-etcd-client.{crt,key} files via the files section of the kubeadmConfigSpec field of the KubeadmControlPlane object in a manifest for a workload cluster. Instead of writing those files, the kubeadm bootstrap controller crashed with a nil pointer dereference (see the logs below).
What did you expect to happen:
I expected the Kubeadm bootstrap controller to create the files with the specified content at the specified location.
Anything else you would like to add:
Here is the kubeadmConfigSpec field I used (note the private key material has been redacted):
```yaml
spec:
  infrastructureTemplate:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSMachineTemplate
    name: capi-etcd-control-plane
  kubeadmConfigSpec:
    clusterConfiguration:
      apiServer:
        extraArgs:
          cloud-provider: aws
      controllerManager:
        extraArgs:
          cloud-provider: aws
          configure-cloud-routes: "false"
      etcd:
        external:
          endpoints:
          - https://ip-10-45-16-154.us-west-2.compute.internal:2379
          - https://ip-10-45-53-148.us-west-2.compute.internal:2379
          - https://ip-10-45-84-141.us-west-2.compute.internal:2379
          caFile: /etc/kubernetes/pki/etcd/ca.crt
          certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
          keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: aws
        name: '{{ ds.meta_data.local_hostname }}'
    joinConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          cloud-provider: aws
        name: '{{ ds.meta_data.local_hostname }}'
    files:
    - path: /etc/kubernetes/pki/apiserver-etcd-client.crt
      content: |
        -----BEGIN CERTIFICATE-----
        MIIC+TCCAeGgAwIBAgIIR8jHKnG7IGcwDQYJKoZIhvcNAQELBQAwEjEQMA4GA1UE
        AxMHZXRjZC1jYTAeFw0yMDA0MDEyMzI0MjdaFw0yMTA0MDEyMzI1MDdaMD4xFzAV
        BgNVBAoTDnN5c3RlbTptYXN0ZXJzMSMwIQYDVQQDExprdWJlLWFwaXNlcnZlci1l
        dGNkLWNsaWVudDCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBANIe194P
        oqtyvXnM7a3cpTzzWVWL+EEBCvznBRNVgjQ4jMWYKR3Coq71x11zk/dox/6FGS0S
        wwaCBMIvnc7Mh5l9Tr9w4ZQdn5WXed0eqhNq/Eo7L0KZx0W7EtYH5t+ogQ8tVsIl
        4kU3zVAmeBFuSpXz3/JOq9Tbx9qW2NewbYBosiWrm+r+BEyAD9iwY0Melm7wJzMe
        T+vYKwRXhUbPxdFod6x9dHp/bHoxXaQLeVlwjHauvFnc1Q7B4AmoGjpGJ4KR85pM
        z+EAilatQ8yAt57e05yME7hOgO6MFA0CLkOQjmqhFzlLPknBz32oLc/cCftFKlnN
        5YfAutmbtBLYNAsCAwEAAaMnMCUwDgYDVR0PAQH/BAQDAgWgMBMGA1UdJQQMMAoG
        CCsGAQUFBwMCMA0GCSqGSIb3DQEBCwUAA4IBAQCQ20D/Z0+1yMyoqweetcY9j7gw
        CVj5NI477I/g+TBqIUb47+0VxPKlimKGH9yzcNNU41EAOVX+tbAhORot4YOe5zp0
        VSkJyHt7npI+K+sAtkdUQuC9K1730jCM1XjReuu65vc6dKxdagFAFi0m3EzrjHwb
        /aI4rhL8upszNh6UtQlP9EAoJMSwC8VSNdc0nE3Ta/otQNd+8TJui3MsSa2gIRaA
        zL9Ztl/yH/Gj/2u4nQuXE1iCE/aWZEfguJwb7756GMaVuDSywdB0oY+HTZhbuJyE
        wWKFQ2NF+P3UXMaLlxjMrkttDxENrx47Fh9R1q/hXjr2KA9lE+2N2+HbqEYP
        -----END CERTIFICATE-----
    - path: /etc/kubernetes/pki/apiserver-etcd-client.key
      content: |
        <redacted>
  replicas: 3
  version: v1.17.3
```
Here is the output of kubectl logs against the bootstrap controller:
I0402 18:47:06.669455 1 listener.go:44] controller-runtime/metrics "msg"="metrics server is starting to listen" "addr"="127.0.0.1:8080"
I0402 18:47:06.669770 1 main.go:132] setup "msg"="starting manager"
I0402 18:47:06.670001 1 leaderelection.go:242] attempting to acquire leader lease capi-kubeadm-bootstrap-system/kubeadm-bootstrap-manager-leader-election-capi...
I0402 18:47:06.670140 1 internal.go:356] controller-runtime/manager "msg"="starting metrics server" "path"="/metrics"
I0402 18:47:24.066941 1 leaderelection.go:252] successfully acquired lease capi-kubeadm-bootstrap-system/kubeadm-bootstrap-manager-leader-election-capi
I0402 18:47:24.068022 1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource" "controller"="kubeadmconfig" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{},"status":{}}}
I0402 18:47:24.168701 1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource" "controller"="kubeadmconfig" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"clusterName":"","bootstrap":{},"infrastructureRef":{}},"status":{"bootstrapReady":false,"infrastructureReady":false}}}
I0402 18:47:24.269613 1 controller.go:164] controller-runtime/controller "msg"="Starting EventSource" "controller"="kubeadmconfig" "source"={"Type":{"metadata":{"creationTimestamp":null},"spec":{"controlPlaneEndpoint":{"host":"","port":0}},"status":{"infrastructureReady":false,"controlPlaneInitialized":false}}}
I0402 18:47:24.370272 1 controller.go:171] controller-runtime/controller "msg"="Starting Controller" "controller"="kubeadmconfig"
I0402 18:47:24.370663 1 controller.go:190] controller-runtime/controller "msg"="Starting workers" "controller"="kubeadmconfig" "worker count"=10
I0402 18:47:24.472482 1 kubeadmconfig_controller.go:267] controllers/KubeadmConfig "msg"="ConfigOwner is not a control plane Machine. If it should be a control plane, add the label `cluster.x-k8s.io/control-plane: \"\"` to the Machine" "kind"="Machine" "kubeadmconfig"={"Namespace":"default","Name":"capi-etcd-md-0-dtmjt"} "name"="capi-etcd-md-0-7d49f97f5c-ph98v" "version"="6204997"
I0402 18:47:24.472816 1 kubeadmconfig_controller.go:267] controllers/KubeadmConfig "msg"="ConfigOwner is not a control plane Machine. If it should be a control plane, add the label `cluster.x-k8s.io/control-plane: \"\"` to the Machine" "kind"="Machine" "kubeadmconfig"={"Namespace":"default","Name":"capi-etcd-md-0-c9w2d"} "name"="capi-etcd-md-0-7d49f97f5c-bnqvb" "version"="6205014"
I0402 18:47:24.474533 1 kubeadmconfig_controller.go:267] controllers/KubeadmConfig "msg"="ConfigOwner is not a control plane Machine. If it should be a control plane, add the label `cluster.x-k8s.io/control-plane: \"\"` to the Machine" "kind"="Machine" "kubeadmconfig"={"Namespace":"default","Name":"capi-etcd-md-0-fcwxx"} "name"="capi-etcd-md-0-7d49f97f5c-9fpvk" "version"="6205016"
I0402 18:47:24.579069 1 kubeadmconfig_controller.go:298] controllers/KubeadmConfig "msg"="Creating BootstrapData for the init control plane" "kind"="Machine" "kubeadmconfig"={"Namespace":"default","Name":"capi-etcd-control-plane-2sqs7"} "name"="capi-etcd-control-plane-7jgjz" "version"="6204332"
E0402 18:47:24.681536 1 runtime.go:78] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 267 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x14f8c80, 0x245b2f0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:74 +0xa3
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:48 +0x82
panic(0x14f8c80, 0x245b2f0)
/usr/local/go/src/runtime/panic.go:679 +0x1b2
sigs.k8s.io/cluster-api/util/secret.(*Certificate).AsFiles(...)
/workspace/util/secret/certificates.go:318
sigs.k8s.io/cluster-api/util/secret.Certificates.AsFiles(0xc000467c50, 0x5, 0x6, 0x7, 0xc0005aaf30, 0xc000170e00)
/workspace/util/secret/certificates.go:361 +0x52d
sigs.k8s.io/cluster-api/bootstrap/kubeadm/internal/cloudinit.NewInitControlPlane(0xc0001b4460, 0xc0001b4460, 0x6, 0x18d9280, 0xc0000ac060, 0x18f3480)
/workspace/bootstrap/kubeadm/internal/cloudinit/controlplane_init.go:55 +0x7e
sigs.k8s.io/cluster-api/bootstrap/kubeadm/controllers.(*KubeadmConfigReconciler).handleClusterNotInitialized(0xc0000fd700, 0x18d9280, 0xc0000ac060, 0xc0006d9c38, 0xc000114200, 0x0, 0x0, 0x0)
/workspace/bootstrap/kubeadm/controllers/kubeadmconfig_controller.go:355 +0x79f
sigs.k8s.io/cluster-api/bootstrap/kubeadm/controllers.(*KubeadmConfigReconciler).Reconcile(0xc0000fd700, 0xc00049d039, 0x7, 0xc00049f000, 0x1d, 0xc000406c00, 0x0, 0x0, 0x0)
/workspace/bootstrap/kubeadm/controllers/kubeadmconfig_controller.go:240 +0x126a
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0001bc540, 0x1548d80, 0xc000529b00, 0xc000325d00)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x162
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0001bc540, 0x0)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xcb
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0001bc540)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0003dfb50)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0003dfb50, 0x3b9aca00, 0x0, 0x45ed01, 0xc0004c23c0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc0003dfb50, 0x3b9aca00, 0xc0004c23c0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x328
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x139466d]
goroutine 267 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/runtime/runtime.go:55 +0x105
panic(0x14f8c80, 0x245b2f0)
/usr/local/go/src/runtime/panic.go:679 +0x1b2
sigs.k8s.io/cluster-api/util/secret.(*Certificate).AsFiles(...)
/workspace/util/secret/certificates.go:318
sigs.k8s.io/cluster-api/util/secret.Certificates.AsFiles(0xc000467c50, 0x5, 0x6, 0x7, 0xc0005aaf30, 0xc000170e00)
/workspace/util/secret/certificates.go:361 +0x52d
sigs.k8s.io/cluster-api/bootstrap/kubeadm/internal/cloudinit.NewInitControlPlane(0xc0001b4460, 0xc0001b4460, 0x6, 0x18d9280, 0xc0000ac060, 0x18f3480)
/workspace/bootstrap/kubeadm/internal/cloudinit/controlplane_init.go:55 +0x7e
sigs.k8s.io/cluster-api/bootstrap/kubeadm/controllers.(*KubeadmConfigReconciler).handleClusterNotInitialized(0xc0000fd700, 0x18d9280, 0xc0000ac060, 0xc0006d9c38, 0xc000114200, 0x0, 0x0, 0x0)
/workspace/bootstrap/kubeadm/controllers/kubeadmconfig_controller.go:355 +0x79f
sigs.k8s.io/cluster-api/bootstrap/kubeadm/controllers.(*KubeadmConfigReconciler).Reconcile(0xc0000fd700, 0xc00049d039, 0x7, 0xc00049f000, 0x1d, 0xc000406c00, 0x0, 0x0, 0x0)
/workspace/bootstrap/kubeadm/controllers/kubeadmconfig_controller.go:240 +0x126a
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0001bc540, 0x1548d80, 0xc000529b00, 0xc000325d00)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:256 +0x162
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0001bc540, 0x0)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232 +0xcb
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker(0xc0001bc540)
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211 +0x2b
k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0003dfb50)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152 +0x5e
k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0003dfb50, 0x3b9aca00, 0x0, 0x45ed01, 0xc0004c23c0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153 +0xf8
k8s.io/apimachinery/pkg/util/wait.Until(0xc0003dfb50, 0x3b9aca00, 0xc0004c23c0)
/go/pkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88 +0x4d
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func1
/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:193 +0x328
Environment:
- Kubernetes version (use kubectl version): 1.17.4 (client), 1.16.4 (server)
- OS (e.g. from /etc/os-release): Ubuntu 18.04.4 (using CAPA AMI)
/kind documentation
/priority important-soon
/milestone v0.5.x
I just tested this again using CAPA 0.5.2 and CAPI 0.3.3 (latest releases of both). I also modified the files section to write out the API server etcd client certificate, the API server etcd client key, and the etcd CA certificate (as well as having the <clustername>-etcd Secret defined in the management cluster). It appears that the error still persists (I didn't compare the logs from this latest iteration against the previous logs above, but they looked similar).
Let me know if there are additional tests I should/could run, or if there is additional information it would be helpful for me to gather.
Moving to cluster-api repo, since this is an error in the KubeadmConfig controller.
/milestone v0.3.x
What's the expected behavior here? It seems like this feature is not very well defined. According to https://github.com/kubernetes-sigs/cluster-api/blob/master/bootstrap/kubeadm/docs/external-etcd.md we only support creating secrets with specific names. Do we just want to fix the NPE and return an error?
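For illustration, here is a rough, self-contained sketch (with simplified stand-in types, not the actual util/secret code) of the kind of nil guard that would avoid the panic seen above at certificates.go:318:

```go
package main

import "fmt"

// File and KeyPair are simplified stand-ins for the cluster-api types; this is
// only meant to illustrate the shape of a guard, not the real implementation.
type File struct {
	Path    string
	Content string
}

type KeyPair struct {
	Cert []byte
	Key  []byte
}

// Certificate loosely mirrors util/secret.Certificate: KeyPair stays nil when
// the backing Secret was neither found nor generated.
type Certificate struct {
	Purpose string
	KeyPair *KeyPair
}

// AsFiles returns nothing (instead of dereferencing a nil pointer) when the
// key pair was never populated; the caller could also turn this into an error.
func (c *Certificate) AsFiles() []File {
	if c == nil || c.KeyPair == nil {
		return nil
	}
	return []File{
		{Path: "/etc/kubernetes/pki/" + c.Purpose + ".crt", Content: string(c.KeyPair.Cert)},
		{Path: "/etc/kubernetes/pki/" + c.Purpose + ".key", Content: string(c.KeyPair.Key)},
	}
}

func main() {
	missing := &Certificate{Purpose: "apiserver-etcd-client"} // Secret never created
	fmt.Println(len(missing.AsFiles()))                       // prints 0 instead of panicking
}
```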
In the general case where the keypair is pre-defined in the config, we don't actually need to look up or create the secret at all; we should skip that step, but only for the certs/keys that are injected via config.
In the more specific case of specifying the use of an external etcd cluster, we should likely also skip the creation of the etcd CA keypair and the associated secrets and should likely throw an error if the etcd client cert/key are not defined in the provided config, since we have no way to generate keypairs for external etcd clusters.
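As a sketch of that flow (hypothetical names only; none of these functions or fields exist in the controller):

```go
package main

import (
	"errors"
	"fmt"
)

// etcdConfig is a made-up stand-in for the relevant bits of the KubeadmConfig
// spec, used only to illustrate the proposed behaviour.
type etcdConfig struct {
	External          bool // clusterConfiguration.etcd.external is set
	ClientCertDefined bool // client cert/key supplied via Secrets or the files section
}

// ensureEtcdCertificates sketches the proposal: for external etcd, never
// generate an etcd CA, and fail clearly when the client key pair is missing,
// since we cannot generate key pairs for an etcd cluster we don't manage.
func ensureEtcdCertificates(cfg etcdConfig) error {
	if !cfg.External {
		// Stacked etcd: look up or generate the full set of certificates as today.
		return nil
	}
	if !cfg.ClientCertDefined {
		return errors.New("external etcd configured but no apiserver-etcd-client cert/key provided")
	}
	// Skip creating the etcd CA key pair and its associated Secret entirely.
	return nil
}

func main() {
	fmt.Println(ensureEtcdCertificates(etcdConfig{External: true}))
}
```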
Not sure I understand; maybe I can explain my understanding better. The workflow as I understand it from the external etcd doc is:
1. Create the my-cluster-apiserver-etcd-client and my-cluster-etcd Secrets in the management cluster.
2. Point clusterConfiguration.etcd.external.caFile, certFile, and keyFile at the paths where those certificates will be written on the control plane nodes.
When you say "keypair is pre-defined/injected in the config", do you mean that it is provided in the files section?
Let me test this again; I may have missed some things (thanks for that link, @benmoss!). I'll work on this again today/tomorrow and report back. It looks as if all the certificates should be provided via ~~ConfigMaps~~ Secrets on the management cluster, and not via the files directive. Does that sound correct?
Yes, the way it was built it seems the intention was that users would put them in secrets (not ConfigMaps) and the controller would handle mounting those as files. It's probably possible for us to support a user putting them as files as well, but it's not supported now.
@benmoss Yes, sorry, I should have said Secrets (my mistake, thank you for correcting me).
OK, I was able to make this work using Secrets instead of injecting the certificate data in the configuration. I did find some errors in the external etcd doc, for which I'll open an issue (and for which I'm happy to work on a PR). That being said, I think this issue is still valid, but feel free to correct me.
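For reference, a minimal client-go sketch of the Secrets-based approach (the capi-etcd-* names follow the <clustername>-<purpose> convention mentioned above, and the tls.crt/tls.key data keys are my assumption about what the controller reads, so please double-check against the external etcd doc):

```go
package main

import (
	"io/ioutil"
	"log"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func mustRead(path string) []byte {
	b, err := ioutil.ReadFile(path)
	if err != nil {
		log.Fatalf("reading %s: %v", path, err)
	}
	return b
}

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	cs := kubernetes.NewForConfigOrDie(cfg)

	secrets := []*corev1.Secret{
		{
			// External etcd CA certificate; the name assumes the cluster is called capi-etcd.
			ObjectMeta: metav1.ObjectMeta{Name: "capi-etcd-etcd", Namespace: "default"},
			Data:       map[string][]byte{"tls.crt": mustRead("etcd-ca.crt")},
		},
		{
			// API server etcd client certificate and key.
			ObjectMeta: metav1.ObjectMeta{Name: "capi-etcd-apiserver-etcd-client", Namespace: "default"},
			Data: map[string][]byte{
				"tls.crt": mustRead("apiserver-etcd-client.crt"),
				"tls.key": mustRead("apiserver-etcd-client.key"),
			},
		},
	}

	for _, s := range secrets {
		// client-go v0.17-style call (this API version takes no context argument).
		if _, err := cs.CoreV1().Secrets(s.Namespace).Create(s); err != nil {
			log.Fatal(err)
		}
	}
}
```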
Yup, we certainly shouldn't crash. I was hoping to clarify whether we want to support this use case or return an error to the user that the required secrets couldn't be found. I'm still not sure; @vincepri, did you have any opinion on this?
We definitely shouldn't crash or panic. I haven't had time to go into the code details yet, though; can you do an investigation and come up with a plan of action / PR?
/assign