Kubeadm: RFE: kubeadm operator

Created on 31 Jul 2019 · 22 Comments · Source: kubernetes/kubeadm

As a Kubernetes operator, I would like to be able to declaratively control configuration changes and upgrades in a systematic fashion.

This will require a KEP.
Entire feature-set is TBD.

WIP KEP: https://hackmd.io/@QlB2bmbhS-aeuDlwOCH9Yw/HkidAVXlS
presentation: https://docs.google.com/presentation/d/1ckEbp_4-9Q90UNV_UwvQQ7MDF6J9Jpl1EdRHUo5pWn0/edit#slide=id.g633cabb4a3_0_5

https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/20190916-kubeadm-operator.md

/kind feature

kind/design kind/feature priority/important-longterm

All 22 comments

questions on scope:

What is the operator's main function / deployment topology?
Is it deployed to a cluster after kubeadm init so that it can self-manage the cluster?
Does it live in a parent management cluster?
Do you run it as a command-line tool? (Does it read config from a file and/or the cluster's API?)

Do we intend for this to be consumed by a Cluster API bootstrap provider?
Is the config for the operator a ConfigMap / CRD?

How does the operator mutate the cluster?
( Perhaps it could schedule privileged Pods, Jobs, or a DaemonSet to mutate each Node )
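
To make the question above concrete, here is a minimal sketch of what "a DaemonSet that mutates each Node" could look like, built as a plain Python dictionary and printed as JSON. The image name, the labels, the namespace, and the choice of `/etc/kubernetes` as the mounted host path are all assumptions for illustration; nothing in this thread says the operator works this way.

```python
import json


def build_mutation_daemonset(name, image, host_path="/etc/kubernetes"):
    """Return an apps/v1 DaemonSet manifest that runs one privileged Pod per node.

    The image and host path are hypothetical placeholders, not taken from
    the kubeadm operator.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "DaemonSet",
        "metadata": {"name": name, "namespace": "kube-system"},
        "spec": {
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Tolerate every taint so control-plane nodes are covered too.
                    "tolerations": [{"operator": "Exists"}],
                    "containers": [
                        {
                            "name": "mutator",
                            "image": image,
                            # Privileged access plus a writable host mount is
                            # what makes this approach a security concern.
                            "securityContext": {"privileged": True},
                            "volumeMounts": [
                                {"name": "host-dir", "mountPath": "/host"}
                            ],
                        }
                    ],
                    "volumes": [
                        {"name": "host-dir", "hostPath": {"path": host_path}}
                    ],
                },
            },
        },
    }


manifest = build_mutation_daemonset(
    "kubeadm-mutator", "example.invalid/kubeadm-mutator:dev"
)
print(json.dumps(manifest, indent=2))
```

The privileged container with a writable `hostPath` mount is exactly the part that the later comments in this thread flag as a security risk.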

WRT scope, IMO the kubeadm operator should be responsible for two things:

  • In-place mutations of kubeadm-generated artifacts
  • Orchestration of such mutations across nodes

Instead, I think we should consider out of scope everything that falls under the management of infrastructure or is related to the management of "immutable" nodes.

I would divide the use cases for the operator into two groups:

  1. Improve the UX for cluster lifecycle activities already supported by kubeadm, e.g.
  • upgrades
  • client certificate renewal

  2. Enable cluster lifecycle activities not yet supported by kubeadm, e.g.
  • certificate rotation
  • "change the cluster"
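
The second responsibility above, orchestration of mutations across nodes, can be sketched roughly as follows. This is only an illustration under assumptions: the `Node` type, the `orchestrate` function, the control-plane-first ordering, and the stop-on-first-failure policy are hypothetical and are not taken from the KEP or from any kubeadm operator code.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    control_plane: bool


def orchestrate(nodes, mutate):
    """Apply `mutate` to one node at a time, control-plane nodes first.

    Stops at the first failure so a bad change does not spread to every
    node. Returns the names of the nodes that were mutated successfully.
    """
    # sorted() is stable, so nodes keep their relative order within each group.
    ordered = sorted(nodes, key=lambda n: not n.control_plane)
    done = []
    for node in ordered:
        if not mutate(node):
            break
        done.append(node.name)
    return done


def fake_upgrade(node):
    # Hypothetical stand-in for an in-place change such as an upgrade.
    print(f"mutating {node.name}")
    return node.name != "worker-2"  # simulate a failure on one worker


nodes = [
    Node("worker-1", False),
    Node("cp-1", True),
    Node("worker-2", False),
    Node("worker-3", False),
]
result = orchestrate(nodes, fake_upgrade)
print(result)
```

In this run the simulated failure on `worker-2` halts the rollout, so `worker-3` is never touched.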

@stealthybox and @fabriziopandini covered some of the questions and comments that i have.

but this overlaps with actions that CAPI performs, so i need to understand more about the demand.
who demands this?

@timothysc is your plan to enable some actions that CAPI now performs on the side of this operator, and at the same time allow non-CAPI users to use the same actions?

How does the operator mutate the cluster?
( Perhaps it could schedule privileged Pods, Jobs, or a DaemonSet to mutate each Node )

^ my other top question, this can end up being not-so-secure.
mutating host paths from a priv-Pod is a no-no.

cc @dlipovetsky
for perhaps an interesting topic.

mutating host paths from a priv-Pod is a no-no.

@fabriziopandini @timothysc do you remember my partially silly proposal to transfer kubeadm control-plane certs over encrypted sockets? i did it to showcase what may be a way to manage a kubeadm cluster using a socket protocol.

i'd argue that it will be more secure than any host action we try to perform from a privileged Pod.

Working on a KEP + POC
/lifecycle active

Instead, I think we should consider out of scope everything that falls under the management of infrastructure or is related to the management of "immutable" nodes.

I would divide the use cases for the operator into two groups:

  1. Improve the UX for cluster lifecycle activities already supported by kubeadm e.g.
  • upgrades
  • client certificate renewal

If nodes are "immutable," does it follow that:

  • Certificates on the node can be rotated only by deploying a new node and removing the old one. (I realize there is some nuance here, because kubelet itself renews its client certificate.)
  • A node must be upgraded by deploying a new node and removing the old one.

@dlipovetsky agreed.

"Immutable" means that any operation is done by deploying a new node and removing the old one, while "mutable" means that any operation is done via in-place mutations.

kubeadm-operator is meant to support the "mutable" approach, while "immutable" operations IMO are out of scope.

but you are right, there is nuance here 😉, e.g. nothing prevents an administrator from mixing "immutable" and "mutable" operations on the same cluster
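
The distinction drawn above can be shown with a small sketch that lists the steps each approach would take for the same node upgrade. The function name and the step descriptions are hypothetical and only illustrate the two definitions given in this comment; they are not commands from kubeadm or Cluster API.

```python
def plan_upgrade(node, target_version, approach):
    """Return the ordered steps to bring `node` to `target_version`.

    `approach` is "mutable" (change the node in place) or "immutable"
    (deploy a replacement, then remove the old node).
    """
    if approach == "mutable":
        return [
            f"drain {node}",
            f"upgrade {node} in place to {target_version}",
            f"uncordon {node}",
        ]
    if approach == "immutable":
        replacement = f"{node}-replacement"
        return [
            f"deploy {replacement} at {target_version}",
            f"drain {node}",
            f"remove {node}",
        ]
    raise ValueError(f"unknown approach: {approach}")


for approach in ("mutable", "immutable"):
    print(approach, plan_upgrade("worker-1", "v1.17.0", approach))
```

Only the "mutable" plan keeps the original node, which is the case the kubeadm operator is meant to cover.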

kubeadm-operator is meant to support the "mutable" approach

Thanks a lot for clarifying @fabriziopandini! I wasn't aware of kubeadm-operator before seeing this issue, so I didn't have the right context. So kubeadm-operator is _not_ for a CAPI-managed cluster, which requires the "immutable" approach.

@timothysc: The label(s) kind/ cannot be applied. These labels are supported: api-review, community/discussion, community/maintenance, community/question, cuj/build-train-deploy, cuj/multi-user, platform/aws, platform/azure, platform/gcp, platform/minikube, platform/other

In response to this:

As a Kubernetes operator, I would like to be able to declaratively control configuration changes and upgrades in a systematic fashion.

This will require a KEP.
WIP KEP: https://hackmd.io/@QlB2bmbhS-aeuDlwOCH9Yw/HkidAVXlS

Entire feature-set is TBD.

/kind feature
/assign @fabriziopandini @neolit123

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

As per the kubeadm office hours discussion, we are considering certificate rotation to be in scope for the kubeadm operator. I will open an issue to track this properly.

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

/remove-lifecycle stale

@timothysc: The label(s) kind/ cannot be applied, because the repository doesn't have them

In response to this:

As a Kubernetes operator, I would like to be able to declaratively control configuration changes and upgrades in a systematic fashion.

This will require a KEP.
Entire feature-set is TBD.

WIP KEP: https://hackmd.io/@QlB2bmbhS-aeuDlwOCH9Yw/HkidAVXlS
presentation: https://docs.google.com/presentation/d/1ckEbp_4-9Q90UNV_UwvQQ7MDF6J9Jpl1EdRHUo5pWn0/edit#slide=id.g633cabb4a3_0_5

https://github.com/kubernetes/enhancements/blob/master/keps/sig-cluster-lifecycle/kubeadm/20190916-kubeadm-operator.md

/kind feature

Would love this operator with this story:
As a Kubernetes operator, I can declare new hosts where kubeadm is applied.

Background:
With SSH access, create kubeadm templates, get tokens, and run joins.

EDIT: maybe this current "Operator" should be a "pure kubeadm" implementation of Cluster API?

@jseguillon
IMO bootstrapping a new node is out of scope for the kubeadm operator.
Also, I strongly believe that kubeadm is and should be a low-level tool that can be used by Cluster API or by other tools, not a Cluster API implementation.

/close
in favor of https://github.com/kubernetes/kubeadm/issues/2317
the initial work on the operator helped in exploring this field; however, we should now focus on defining a clean API surface, and this requires some modeling work and more detailed use cases

@fabriziopandini: Closing this issue.