Kubeadm: [Roadmap] Improve kubeadm support for declarative approaches/git-ops

Created on 30 Sep 2020 · 5 Comments · Source: kubernetes/kubeadm

Kubeadm, being a CLI, does not play well with declarative approaches/git-ops workflows.

Assume that kubeadm is divided into two main parts:

  1. bootstrapping a node (transforming a machine into a node: init, join)
  2. managing an existing node (e.g. upgrades, renewing certs, changing a node)

This issue is about collecting ideas and defining a viable path for making 2 possible using declarative approaches, sometimes also referred to as in-place mutations.

For this first iteration, I consider 1 out of scope, mainly because bootstrapping nodes with a declarative approach is already covered by Cluster API and it is clearly out of the scope of kubeadm.
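To make "declarative" concrete for part 2, here is a minimal sketch of the idea: compare a node's observed state with a desired state and derive the in-place mutations needed to converge. The `NodeState` fields, the `plan_in_place_mutations` function, the 30-day certificate threshold, and the version strings are all hypothetical and chosen only for illustration; nothing here is an existing kubeadm API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeState:
    """Hypothetical observed state of one node (illustrative fields only)."""
    kubernetes_version: str
    certs_expire_in_days: int


def plan_in_place_mutations(observed: NodeState,
                            desired_version: str,
                            min_cert_days: int = 30) -> List[str]:
    """Return the ordered actions needed to converge a node to the desired state."""
    actions = []
    if observed.kubernetes_version != desired_version:
        actions.append(f"upgrade to {desired_version}")
    if observed.certs_expire_in_days < min_cert_days:
        actions.append("renew certificates")
    return actions


node = NodeState(kubernetes_version="v1.19.2", certs_expire_in_days=12)
print(plan_in_place_mutations(node, desired_version="v1.19.3"))
```

An empty plan means the node already matches the declared state, which is what makes the approach a fit for git-ops: the description in git stays the source of truth and re-running the comparison is harmless.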

kind/design priority/important-longterm

All 5 comments

Prior discussion from https://github.com/kubernetes/kubeadm/issues/1698

@timothysc

As a Kubernetes Operator I would like to be able to declaratively control configuration changes and upgrades in a systematic fashion.

@fabriziopandini

IMO the kubeadm operator should be responsible for two things

  • In place mutations of kubeadm generated artifacts
  • Orchestration of such mutations across nodes

Instead, I think that we should consider out of scope everything that falls under the management of infrastructure or is related to the management of "immutable" nodes (where "immutable" = any operation done by deploying a new node and removing the old one).
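As a rough illustration of the second responsibility, orchestration, the sketch below applies one mutation to a single node at a time, control-plane nodes first, and stops at the first failure so a bad change does not spread. The `orchestrate` function, the `(name, role)` tuples, and the `mutate` callback are assumptions made for this example, not part of kubeadm or of any existing operator.

```python
from typing import Callable, Iterable, List, Tuple


def orchestrate(nodes: Iterable[Tuple[str, str]],
                mutate: Callable[[str], bool]) -> List[str]:
    """Apply `mutate` to one node at a time, control-plane nodes first.

    `nodes` holds (name, role) pairs and `mutate` returns True on success.
    Stops at the first failure and returns the names mutated successfully.
    """
    # False sorts before True, so control-plane nodes come first; ties by name.
    ordered = sorted(nodes, key=lambda n: (n[1] != "control-plane", n[0]))
    done = []
    for name, _role in ordered:
        if not mutate(name):
            break
        done.append(name)
    return done


cluster = [("worker-b", "worker"), ("cp-1", "control-plane"), ("worker-a", "worker")]
# Simulate a mutation that fails on worker-b.
print(orchestrate(cluster, mutate=lambda name: name != "worker-b"))
```

Halting on the first failure is only one possible policy; a real operator would also need to decide whether to retry or roll back.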

@neolit123

my other top question, this can end up being not-so-secure.

For the kubeadm operator, I think we should focus on the first use case, "declaratively control configuration changes", given that this is not supported by kubeadm now and it was a top priority in the recent survey.

Interesting proposal. What I've struggled with over the past year using kubeadm is specifically what I addressed with my own scripts that (procedurally) build clusters (on GitHub: k8smaker). The best practices for configuring a bare metal cluster are pretty complex, and the same goes for AWS. The preconditioning script depends strongly on the underlying OS involved. But having built it modularly, I can see how the cluster construction (init, join) can be made completely extensible while providing a fully declarative interface to the user.

It requires:

  • ssh credentials with sudo privileges for any new node, usable from wherever kubeadm or the proposed operator executes
  • a preconditioning script that configures the OS from a clean state
  • a decommissioning script that resets the OS to an unused state
  • a configuration CRD that describes the nodes that should be part of the cluster
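A minimal sketch of how the last item could drive the other two: diff the node list declared in the configuration against the nodes currently in the cluster, and derive which machines need preconditioning plus a join and which need decommissioning. The `diff_membership` function, the action names, and the host names are hypothetical and exist only for this example.

```python
from typing import Dict, List, Set


def diff_membership(desired: Set[str], current: Set[str]) -> Dict[str, List[str]]:
    """Compare the declared node list with the cluster's current members."""
    return {
        # Declared but not yet in the cluster: precondition the OS, then join.
        "precondition_and_join": sorted(desired - current),
        # In the cluster but no longer declared: drain, reset, decommission.
        "decommission": sorted(current - desired),
    }


declared = {"node-a", "node-b", "node-d"}
members = {"node-a", "node-b", "node-c"}
print(diff_membership(declared, members))
```

Because the result depends only on the two sets, running it repeatedly against an unchanged description produces no further actions.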

I offer an opinion: I realize this was specifically stated as out-of-scope for this proposal. I'm suggesting it should be the focus instead of day 2 operations. It seems like a lot of k8s admins have a procedure where upgrading an existing production cluster tends to be (much) more dangerous than building a new one. Automating more of the upgrades adds a "magicalness" to that process, which makes the inevitable breakage more severe rather than less. Whereas automating the construction process drives towards a very desirable workflow for automating and simplifying the process: simply remove nodes from an existing production cluster description and add them to the new cluster.

Thanks for the consideration.

It seems like a lot of k8s admins have a procedure where upgrading an existing production cluster tends to be (much) more dangerous than building a new one.

i think no matter what we do with kubernetes upgrades we will not be able to fully guarantee zero failures to the users, unless this is fully managed by some high level tooling that understands everything that the user has and wants - including node host details, infrastructure availability and all caveats of the current and next k8s version.

kubeadm or the operator can encode some details about the next k8s version or the node host, but that's all.

the so-called "blue / green" cluster upgrades may seem like the better option in the eyes of the user, since the user has the control to scrap the old cluster only once the new cluster is fully working. but they also require infrastructure that some users on self-hosted bare metal simply don't have.

Whereas automating the construction process drives towards a very desirable workflow for automating and simplifying the process: simply remove nodes from an existing production cluster description and add them to the new cluster.

we call these node replacement upgrades, and the Cluster API is doing them. your project may have recreated parts of Cluster API, kubespray or kops, which are tools that are higher level than kubeadm.
