Detailed Description
I have been working on an infrastructure provider implementation using Kubernetes Pods as Nodes and was wondering if it would be useful to add to the list of out-of-the-box providers: https://github.com/dippynark/cluster-api-provider-kubernetes
The provider would be useful for testing and experimentation rather than production so potentially it isn't the right fit.
/kind feature
As far as I understand, this provider is based on kind, so it falls into the same category as CAPD: developer tools.
Currently, CAPD is not included in the list of out-of-the-box providers, but there is documentation about how to use it in the Cluster API books under developer tools. What about using the same approach for this provider as well?
Also, should we rename this cluster-api-provider-pods? (Kubernetes on AWS -> CAPAws, Kubernetes on pods -> CAPPod)
sounds good, will write something up
Each Node is its own Pod so a CAPKubernetes cluster is formed of multiple Pods - I'd say CAPod would correspond with something like CAPEc2 rather than CAPAws, so I feel saying Kubernetes is the infrastructure provider makes more sense.
cc @elmiko
this sounds like a cool project. i am curious though, is the intent to create a more robust development/debug provider or is there an intention to have this be production grade at some point?
@elmiko it started as just a good way to learn about cluster api, but I was hoping it'd be useful for something like testing controllers or distributed applications by allowing pipelines to spin up temporary clusters on existing clusters - there was no intention of making it production grade though.
One potential future use case would be to spin up just the control plane with this provider but connect real cloud provider nodes for testing, but I don't think this type of thing is supported by cluster api.
It could be, if there is a control plane provider that's pod-based folks can use it as object reference in Cluster.ControlPlaneRef, and then infrastructure references are actual infrastructure providers
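For example, a minimal sketch of such a Cluster could look like the following - the pod-based control plane group/kind here is purely hypothetical, and AWSCluster just stands in for any 'real' infrastructure provider:

```yaml
apiVersion: cluster.x-k8s.io/v1alpha3
kind: Cluster
metadata:
  name: mixed-cluster
spec:
  # hypothetical pod-based control plane provider resource
  controlPlaneRef:
    apiVersion: controlplane.example.io/v1alpha1
    kind: PodControlPlane
    name: mixed-cluster-control-plane
  # actual infrastructure provider for the rest of the cluster, e.g. CAPA
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
    kind: AWSCluster
    name: mixed-cluster
```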
Potentially that's already possible then - I've only tried with Pod-based worker machines.
I can't really find a good place to put this provider in the developer docs to fit in with what is there for docker, unless a separate page would be good? Either way the docker provider would probably always be more appropriate for local development anyway.
I think the list of infrastructure providers would be a good place to put a link, not sure if that list is reserved for 'real' infrastructure providers though.
I can't really find a good place to put this provider in the developer docs to fit in with what is there for docker, unless a separate page would be good?
i am trying to sync up with folks about re-doing the docker provider page and perhaps it would fit best as a parallel document?
i am still working on the pull request, but i am proposing removing docker from the quickstart guide and creating a single document instruction in the developer section for docker. it might be nice to create a section that would fit this provider as well. i would like to make it clear to our users where docker fits in the ecosystem, and by extension kubernetes sounds like it would fit this as well.
my one hesitation is that we currently use the docker provider for testing and @chuckha has created an issue to focus on improving the docker provider (#2738). i wouldn't want to create a mixed message for users, but it would be nice to have a section for these more experimental type providers, if only as examples for others.
one final point, i see some tests in the kubernetes provider repo, we would need to make sure that these continue to get run as part of our automation to make it easier for future maintenance.
edit:
there is also this proposal to improve the doc structure in general, #2121
Makes sense, especially around the potential for mixed messages for users, as I don't think the Kubernetes provider would be great for testing compared to the docker one.
I can hold off on any changes related to this issue then until the ones you linked are merged - if that new docker provider section gets created I can potentially copy the format to create a parallel one for the Kubernetes provider.
The e2e tests are using a recent version of the testing framework and all run locally on kind, so I don't think there'll be any issues running them as part of automation (although the tests are fairly simple atm).
I really like the simplicity of the service-based load balancer approach taken with your provider. I'm wondering if it would make sense to merge the different approaches taken between CAPD and this provider.
As an aside, I would suggest not using any references to UID, since it would not persist across a clusterctl move operation.
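As I understand it, that approach boils down to a plain Service fronting the control plane Pods, along the lines of this minimal sketch (the selector labels and names are assumptions, not copied from the provider):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-cluster-lb
spec:
  # assumed labels identifying the control plane Node Pods
  selector:
    cluster.example.io/cluster-name: example-cluster
    cluster.example.io/control-plane: "true"
  ports:
  - name: kube-apiserver
    port: 6443
    targetPort: 6443
```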
I'm wondering if it would make sense to merge the different approaches taken between CAPD and this provider.
that's an interesting thought. just so i'm following, this would mean using kind to create a kubernetes cluster then using the kubernetes-provider to spawn clusters within that kind cluster? (not very different from what we do now)
Exactly. At least on the cluster infrastructure side the kubernetes-provider offers a great deal of simplicity compared to what we are doing in CAPD today. I'd have to do a more in-depth comparison between the two on the machine infrastructure side, but I would guess they aren't too different outside of the libraries and scaffolding they are using to do the instance creation and bootstrapping.
makes sense, thanks for the explanation!
@detiber thanks for taking a look, yeah a lot of the behaviour was inspired (or copied) from the docker provider. I think the main difference around machine provisioning is that the kubernetes provider creates a systemd unit on each Node which runs the cloudinit script and uses a single exec just to start the unit, whilst the docker provider runs each cloudinit command as a separate exec.
As an aside, I would suggest not using any references to UID, since it would not persist across a clusterctl move operation.
I think that was intentional, IIRC, because workload clusters currently have to run on the management cluster, so if someone does a move the Nodes need to be recreated in the new management cluster anyway. In general I went with the idea that the Nodes (Pods) should never restart, as that would reset the COW layer, which wouldn't happen on a VM; using the UID as the provider ID and restartPolicy: Never should ensure that once a Node joins the cluster its lifecycle is tied to the exact Pod instance it corresponds to.
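As a rough illustration (the names, node image and provider ID scheme below are assumptions rather than the exact implementation):

```yaml
# Sketch of a Node Pod - names and the node image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-cluster-worker-0
spec:
  restartPolicy: Never   # a restart would reset the COW layer, so the Node Pod is never restarted
  containers:
  - name: node
    image: kindest/node:v1.17.0   # placeholder node image
# Once the Node joins, the Machine's provider ID is derived from this Pod's UID
# (the exact scheme is an assumption here), e.g.:
#   providerID: kubernetes://<pod-uid>
```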
Potentially there could be some changes there to allow the kubernetes provider to manage workload clusters on remote clusters and having it so that a move keeps the existing workload cluster on the old management cluster.
One interesting point about this provider is the possibility to span a test cluster across many nodes.
I second @detiber's comments that it would be interesting to explore a possible CAPD/Kubernetes provider convergence, but supporting move is a hard requirement IMO.
@fabriziopandini I guess I'm just not too sure what a move implementation would look like since currently the controller assumes we're deploying to the local cluster.
The only thing I can think of would be for the KubernetesCluster resource to trigger the creation of a ServiceAccount with enough permissions to manage the local cluster. A corresponding token, CA cert and external apiserver IP (maybe guessed from kubectl get endpoints kubernetes -o yaml or operator specified) would give a kubeconfig which can be moved with clusterctl move so that the controller can continue managing the cluster.
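A minimal sketch of what the resulting kubeconfig could look like (every value below is a placeholder):

```yaml
apiVersion: v1
kind: Config
clusters:
- name: management
  cluster:
    server: https://<external-apiserver-ip>:6443      # guessed or operator specified
    certificate-authority-data: <base64-encoded-ca-cert>
users:
- name: capk-manager                                  # placeholder ServiceAccount identity
  user:
    token: <serviceaccount-token>
contexts:
- name: management
  context:
    cluster: management
    user: capk-manager
current-context: management
```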
I can't see many use cases where this would be valuable though, and it would add quite a lot of complexity and in some cases would not work (e.g. if the source and target clusters can't reach each other), although I guess it's more a case of making this provider fully compatible with cluster-api.
I can't see many use cases where this would be valuable
@dippynark supporting move is part of a common workflow (from bootstrap to a self-hosted cluster) and we should ensure this is tested by our E2E. So this is a requirement if we want to merge CAPD and the Kubernetes provider
the controller assumes we're deploying to the local cluster.
I think that, as other providers do, this provider should accept some parameters specifying where the infrastructure lives. In this case, it should be a config map with the kubeconfig for the hosting cluster (possibly resolving to the local cluster if this config map does not exist).
it would add quite a lot of complexity
I hope the change can be scoped to the connections step...
in some cases would not work (e.g. if source and target cluster can't reach each other)
I guess this is fine; it applies to all the other providers as well.
@fabriziopandini a user-provided kubeconfig sounds a lot nicer - would it be a problem if the default behaviour remained the current behaviour, where a move results in cluster recreation since all the local Nodes/Pods have disappeared?
And then if a secret ref to a kubeconfig is provided, it is used instead of the in-cluster config, and it would succeed if the source apiserver is still reachable?
I guess I can find some inspiration in the cluster-api core code for watching with multiple clients.
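Something like this minimal sketch is what I have in mind - the kubeconfigSecretRef field and the API group below are hypothetical and don't exist in the provider today:

```yaml
apiVersion: infrastructure.example.io/v1alpha1   # placeholder group, not the provider's real one
kind: KubernetesCluster
metadata:
  name: example
spec:
  # Hypothetical field: when set, the controller would load this kubeconfig
  # instead of the in-cluster config and manage the workload cluster's Pods
  # on the referenced (possibly remote) hosting cluster.
  kubeconfigSecretRef:
    name: example-hosting-cluster-kubeconfig
    key: kubeconfig
```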
Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
@fejta-bot: Closing this issue.
In response to this:
Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.
Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.