Cluster-api: [cabpk] Etcd data directory shouldn't be required

Created on 26 Sep 2019 · 11 comments · Source: kubernetes-sigs/cluster-api

/kind bug

What steps did you take and what happened:
The following KubeadmConfig resource illustrates using a custom image repository for a local etcd configuration. Please note only the relevant fields are shown below:

apiVersion: bootstrap.cluster.x-k8s.io/v1alpha2
kind: KubeadmConfig
spec:
  clusterConfiguration:
    etcd:
      local:
        imageRepository: my.custom.repo.io
        imageTag: v0.0.0

However, when submitting this resource via kubectl, the following error occurs:

$ kubectl apply -f kubeadmconfig.yaml
error: error validating "kubeadmconfig.yaml": error validating data: ValidationError(KubeadmConfig.spec.clusterConfiguration.etcd.local): missing required field "dataDir" in io.x-k8s.cluster.bootstrap.v1alpha2.KubeadmConfig.spec.clusterConfiguration.etcd.local; if you choose to ignore these errors, turn validation off with --validate=false

Disabling validation and submitting the same resource to the API server results in a long-form error with the same root cause -- the etcd field dataDir is required.

I encountered this last week and worked around it by setting dataDir to its default value. However, @figo recently hit the same issue, so I decided to track down why this was happening, since it seemed like it shouldn't. Here's what I discovered:

  • It looks like the kubeadm API types have been copied into the kubeadm bootstrapper rather than having kustomize point to an external repository for the kubeadm CRDs.
  • Taking a look at the etcd type, the dataDir is in fact required.
  • Then it hit me -- when run as part of kubeadm, the provided configuration is processed to ensure proper defaults are in place.
  • That’s why the field dataDir may be marked as required and still have documentation that states:

    DataDir is the directory etcd will place its data. Defaults to “/var/lib/etcd”.

  • When empty and run as part of kubeadm, the field is defaulted to /var/lib/etcd

  • However, since the API server prior to 1.15 did not support defaulters except via webhooks, submitting a config with an etcd configuration and an empty dataDir field would result in a validation error.
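To make the defaulting behavior concrete, here is a minimal sketch of the kind of client-side defaulting kubeadm performs before validation. The type and function here are illustrative stand-ins, not the actual kubeadm identifiers:

```go
package main

import "fmt"

// defaultEtcdDataDir mirrors the documented kubeadm default.
const defaultEtcdDataDir = "/var/lib/etcd"

// LocalEtcd is a trimmed-down stand-in for the kubeadm type discussed above.
type LocalEtcd struct {
	DataDir string
}

// SetDefaults_LocalEtcd sketches the defaulting step kubeadm runs on a
// provided configuration: an empty DataDir is filled in with the
// documented default before validation ever sees it.
func SetDefaults_LocalEtcd(e *LocalEtcd) {
	if e.DataDir == "" {
		e.DataDir = defaultEtcdDataDir
	}
}

func main() {
	etcd := &LocalEtcd{}
	SetDefaults_LocalEtcd(etcd)
	fmt.Println(etcd.DataDir) // prints "/var/lib/etcd"
}
```

When submitted straight to the API server as part of a CRD, no such defaulting step runs, which is why the required-field validation fires.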

My suggestion is this: as long as the bootstrapper is using copied kubeadm types anyway, why not add a +optional marker and an omitempty JSON tag? For example, take the following LocalEtcd structure:

// LocalEtcd describes that kubeadm should run an etcd cluster locally
type LocalEtcd struct {
    // ImageMeta allows to customize the container used for etcd
    ImageMeta `json:",inline"`

    // DataDir is the directory etcd will place its data.
    // Defaults to "/var/lib/etcd".
    DataDir string `json:"dataDir"`

    // ExtraArgs are extra arguments provided to the etcd binary
    // when run inside a static pod.
    ExtraArgs map[string]string `json:"extraArgs,omitempty"`

    // ServerCertSANs sets extra Subject Alternative Names for the etcd server signing cert.
    ServerCertSANs []string `json:"serverCertSANs,omitempty"`
    // PeerCertSANs sets extra Subject Alternative Names for the etcd peer signing cert.
    PeerCertSANs []string `json:"peerCertSANs,omitempty"`
}

The DataDir field could be updated to:

    // DataDir is the directory etcd will place its data.
    // Defaults to "/var/lib/etcd".
    // +optional
    DataDir string `json:"dataDir,omitempty"`

Then, the kubeadm controller would ensure that the documented default value /var/lib/etcd is still applied to any KubeadmConfig resource that includes spec.clusterConfiguration.etcd.local with an empty dataDir field. This change would allow specifying a custom etcd image repository or version without having to worry about the unrelated data directory.

What did you expect to happen:
The resource should have been accepted with an omitted etcd.dataDir since the documentation states the field has a default value of /var/lib/etcd.

cc @chuckha @ncdc @figo

kind/bug priority/important-soon

All 11 comments

+1. The main reason we copied the kubeadm types into this repo was to make some adjustments around fields we wanted users to be able to omit.

it was more than minor adjustments, the controller-runtime generator did not work with default kubeadm types.

However it's been useful to keep them in-tree to mark some fields as optional to improve user experience without the defaulter, as mentioned in the issue.

This sounds good to me. We should audit the in-tree types to see if there are any other fields that don't default but say they will.

it was more than minor adjustments, the controller-runtime generator did not work with default kubeadm types.

i forgot the reasons for that. was it only the lack of ObjectMeta?

tracking issue for the kubeadm v1beta3 config, which might land in 1.17:
https://github.com/kubernetes/kubeadm/issues/1796

let's discuss any adjustments that this project requires.

no, it was the lack of json struct tags. I opened and fixed the issue in v1beta1 and v1beta2 upstream types already

https://github.com/kubernetes/kubernetes/commit/74ba11b0cd28f3aa5dcf9a79e0293eeca77c5f12

can the upstream types be used now after this change?

I wouldn't migrate to upstream types that rely on defaults until K8s 1.18 when server-side defaulters will have been available for three versions at that point. I agree with @chuckha that having the types locally enables behaviors not considered for client-side only types.

Note from @timothysc - kubeadm v1beta3 types will make a similar change

I moved this from CABPK to CAPI

/assign
/active
