One concept we discussed a few times is to have the global operator responsible for managing namespace operators.
It would need to:
Some questions we need to tackle:
Some aspects to consider (notes from discussions with @nkvoll):
ClusterRole or would a set of Roles for the namespaces the 'global' operator is supposed to manage suffice?
Some additional thoughts on global vs. namespace operator RBAC permissions, with 2 use cases.
Based on those 2 use cases, I think we should move away from the global/namespace operator split, to consider deploying N operators with different permissions and configuration options.
Use case: I belong to organization X, and I have access to a single namespace of organization X's k8s cluster. I need to do everything in that single namespace.
I think we should be able to deploy the operator(s) in the namespace, to handle resources in that same namespace, with restricted permissions to that namespace only (service account + roles + role bindings).
In this particular case:
Potential solution:
Here, I think we want a single operator with all controllers baked in, restricted to that single namespace.
Use case: I belong to organization X, we have several namespaces for different purposes, we want to perform CCR/CCS across multiple ES clusters in multiple namespaces. But there is no way I'm going to run your operator with super-large cluster-wide permissions on all namespaces.
I think this is a valid use case for any production-grade large k8s cluster.
In this particular case:
So far, the license controller that we intended to run in the global operator requires read/write access to secrets in all namespaces.
Potential solutions:
I think we should drop the "global/namespace" operator concept, because it's not good enough to represent what we want to achieve. Instead, move to a single operator concept with multiple configuration options:
Potential large-scale company setup we could achieve here:
By having this fully configurable operator, I think it's then easier to extract a few "common" deployment setups that users may like to use (we can still do something similar to our original 1 global + N namespace operators here).
Also, I don't think we need an operator to manage operators. I think it's fine to just have people deploying the operators they need, without having them watch each other.
(e.g. `./elastic-operator-cli generate operator --controllers=elasticsearch,licensing --namespace=teamX,teamY`). It would still be the responsibility of a human to then run `kubectl apply -f generated-operator.yaml`.
I'm sorry for the super long post here 😄
If this (or part of this) makes sense, I'll be happy to turn it into a formal design proposal.
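The manifest-generation idea above (`elastic-operator-cli generate operator`) could look roughly like the following. This is only a minimal sketch: the `CONTROLLER_RULES` table, the `generate_operator_rbac` function, the `operators` namespace and the `elastic-operator` resource name are all hypothetical, and the RBAC rules are placeholders rather than the operator's real requirements. The namespaces are written `team-x`/`team-y` because Kubernetes namespace names must be lowercase.

```python
import json

# Hypothetical mapping from controller name to the RBAC rules it needs.
# These rules are illustrative placeholders, not the operator's real requirements.
CONTROLLER_RULES = {
    "elasticsearch": [
        {"apiGroups": [""], "resources": ["pods", "services", "secrets"], "verbs": ["*"]},
    ],
    "licensing": [
        {"apiGroups": [""], "resources": ["secrets"], "verbs": ["get", "list", "watch", "update"]},
    ],
}


def generate_operator_rbac(controllers, namespaces, operator_namespace, name="elastic-operator"):
    """Return one Role and one RoleBinding per managed namespace."""
    # Union of the rules required by every enabled controller.
    rules = [rule for c in controllers for rule in CONTROLLER_RULES[c]]
    manifests = []
    for ns in namespaces:
        manifests.append({
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "Role",
            "metadata": {"name": name, "namespace": ns},
            "rules": rules,
        })
        manifests.append({
            "apiVersion": "rbac.authorization.k8s.io/v1",
            "kind": "RoleBinding",
            "metadata": {"name": name, "namespace": ns},
            "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "Role", "name": name},
            # The operator's service account lives in its own namespace.
            "subjects": [{"kind": "ServiceAccount", "name": name, "namespace": operator_namespace}],
        })
    return manifests


manifests = generate_operator_rbac(["elasticsearch", "licensing"], ["team-x", "team-y"], "operators")
# kubectl accepts JSON as well as YAML; a v1 List wraps all objects in one file.
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

The output would be saved to a file and applied by a human, so nothing here needs cluster-wide permissions at generation time.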
@sebgl awesome analysis! Reading this, though, I think there are only small semantic differences between a global operator as we understood it up to now (access to all namespaces via ClusterRole) and the more restricted notion of global that we seem to be converging on in this discussion.
I think the concepts outlined in the global operator proposal are still valid, but we may need the ability to restrict its domain to a set of namespaces.
I also think that we always said that a single operator deployment should be a supported mode of operation (with all controllers in one process).
Maybe global is just a misleading name at this point and we are really talking about a super-operator, parent or multi-namespace operator (all bad names, I know) in order to express the fact that we want to have the ability to run certain control loops just once for multiple namespaces (licensing) and solve cross-namespace concerns (CCR/CCS).
What other operators out there seem to be doing:
tl;dr either:
(but they don't have cross-cluster features)
etcd-operator
RBAC guide: https://github.com/coreos/etcd-operator/blob/master/doc/user/rbac.md
They suggest 2 ways of deploying the operator:
prometheus-operator
RBAC guide: https://github.com/coreos/prometheus-operator/blob/master/Documentation/rbac.md
Design doc: https://github.com/coreos/prometheus-operator/blob/master/Documentation/design.md
It seems limited to watching and handling resources in a single namespace (the one it's deployed in), but it needs cluster-wide permissions to deploy the CRD (not super clear, to be honest).
airflow-operator
Seems to be using kubebuilder's default cluster-wide permission (one operator per cluster): https://github.com/GoogleCloudPlatform/airflow-operator/tree/master/config/rbac
kube-db databases operators
Install doc: https://kubedb.com/docs/0.9.0/setup/install/
Runs in the kube-system namespace by default, with cluster-wide access.
MongoDB operator
Seems to be running in its own namespace, and manages resources in its own namespace: https://github.com/mongodb/mongodb-enterprise-kubernetes
Oracle MySQL operator
Install doc: https://github.com/oracle/mysql-operator/blob/master/docs/tutorial.md#configuration
Can be installed either cluster-wide or per-namespace (manages resources in its own namespace).
NATS operator
Install doc: https://github.com/nats-io/nats-operator
Can be installed either cluster-wide or per-namespace (manages resources in its own namespace).
GCP Spark operator
Helm chart doc: https://github.com/helm/charts/tree/master/incubator/sparkoperator
A single operator with cluster-wide permissions.
Vault Operator
Link: https://github.com/coreos/vault-operator
One operator per namespace, manages resources in its own namespace.
I'll make sure to turn what we discussed today and this post into a formal "configurable-operator" proposal.
@sebgl 👍
Also just to document the outcome of our meeting:
Regarding multi-namespaces watches in the controller-runtime:
There is already an issue open for it, as a follow-up for the one-namespace restriction: https://github.com/kubernetes-sigs/controller-runtime/issues/218
Looks like it's planned for the long term 👍
operator-sdk folks seem to want that feature as well, and might contribute to the controller-runtime: https://github.com/operator-framework/operator-sdk/issues/767
Meanwhile, the issue above suggests an interesting workaround: implement our own Manager that embeds the controller-runtime Manager, but override the cache to support something like prometheus-operator MultiListWatcher.
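To illustrate the workaround idea, here is a language-neutral sketch in Python of what a multi-namespace list/watch wrapper does: it fans out to one source per namespace, concatenates the list results, and merges the watch streams into one. It does not use the controller-runtime or prometheus-operator APIs; `MultiNamespaceListWatcher` and `FakeNamespaceSource` are hypothetical names made up for this example.

```python
import queue
import threading


class FakeNamespaceSource:
    """Hypothetical stand-in for a per-namespace API client."""

    def __init__(self, namespace, names):
        self.namespace = namespace
        self.names = names

    def list(self):
        return [(self.namespace, n) for n in self.names]

    def watch(self):
        for n in self.names:
            yield ("ADDED", self.namespace, n)


class MultiNamespaceListWatcher:
    """Presents several single-namespace sources as one list/watch source."""

    def __init__(self, sources):
        self.sources = sources  # dict: namespace -> source with list() and watch()

    def list(self):
        # Concatenate the per-namespace lists.
        items = []
        for source in self.sources.values():
            items.extend(source.list())
        return items

    def watch(self):
        # One thread per namespace pushes events into a shared queue;
        # the generator yields them until every per-namespace stream has ended.
        events = queue.Queue()
        done = object()

        def pump(source):
            for event in source.watch():
                events.put(event)
            events.put(done)

        threads = [threading.Thread(target=pump, args=(s,), daemon=True)
                   for s in self.sources.values()]
        for t in threads:
            t.start()
        remaining = len(threads)
        while remaining:
            event = events.get()
            if event is done:
                remaining -= 1
            else:
                yield event


lw = MultiNamespaceListWatcher({
    "team-x": FakeNamespaceSource("team-x", ["es-1"]),
    "team-y": FakeNamespaceSource("team-y", ["es-2", "es-3"]),
})
print(lw.list())
for event in lw.watch():
    print(event)
```

Events from different namespaces may interleave in any order, which is why a real implementation also has to think about resource versions and resync; that part is left out here.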
My take on it would be to:
Let's keep this issue open as a meta-issue for the configurable operator. ARD design proposal.
2 child issues:
Admission control via webhooks slightly complicates the picture here.
Assumption: the user wants to lock down the RBAC permissions for the operator as much as possible
Hi all, in our OpenShift production environment we manage more than 6K namespaces in a multitenant way. From our security point of view, it is critical to have operators that support per-namespace deployment (with Roles and RoleBindings in that namespace).
Hey @roldancer, ECK supports two modes of deployment: at a global level, watching all namespaces, and at a per-namespace level. Does this satisfy your requirement for deploying in OpenShift?
We discussed offline how our initial "global" and "namespace" operator concepts are a bit irrelevant and confusing in practice, and decided to remove the existing pre-built manifests via https://github.com/elastic/cloud-on-k8s/issues/2254 to only keep the all-in-one version.
The same global vs. namespace configuration can still be achieved by customizing the operator manifests. The operator can:
This goes in the direction of the above discussion. https://github.com/elastic/cloud-on-k8s/blob/master/docs/design/0005-configurable-operator.md discusses the need for an easier way to configure the operator YAML manifests.
I'm closing this issue now.
@sebgl Hi, I landed here searching for how to deploy the operator and CRDs inside a single namespace. It seems like this discussion is more about how to provide the option of global vs namespace and implies both are possible currently?
So does that mean it is currently possible to deploy the CRDs and the operator to a single namespace (requiring nothing outside the namespace) with some tweaks to the all-in-one as an interim solution for those of us limited to a single namespace on a multi-tenant cluster?
If it is possible, is there an example? I imagine the ClusterRoles and ClusterRoleBindings need to become Roles and RoleBindings and I imagine there are a few more tweaks?
Copy-paste of my answer from https://discuss.elastic.co/t/deploy-eck-to-a-single-namespace/211495/8:
I think (not tested in a while) this is possible but requires some tweaking.
Basically you can change the ClusterRole and ClusterRoleBinding from the all-in-one manifests to their corresponding Role and RoleBinding translations, with your desired namespace.
Then, you can patch the operator StatefulSet manifest to be deployed in the desired namespace. Also make sure the `--namespaces` flag of the operator cmd matches the namespace you want the operator to work with (probably the same one it's deployed in).
However, you can only deploy the CRDs cluster-wide. AFAIK it is not possible to limit a CustomResourceDefinition resource to a particular namespace, and it looks like it's not going to be supported anytime soon.
I understand this feels a bit hacky. I'm opening an issue to track this so we can come up with an easier way to generate your own flavor of the manifests: https://github.com/elastic/cloud-on-k8s/issues/2406
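As a rough illustration of the ClusterRole to Role translation described above, the sketch below rewrites manifests that have already been loaded as Python dicts. The `to_namespaced` helper and the sample manifests are hypothetical (they are not taken from the all-in-one file), and the sketch assumes that rules on non-resource URLs should simply be dropped since they only make sense cluster-wide. Rules for cluster-scoped resources are left in place; they would have no effect inside a Role.

```python
import copy

KIND_MAP = {"ClusterRole": "Role", "ClusterRoleBinding": "RoleBinding"}


def to_namespaced(manifest, namespace):
    """Return a namespaced copy of a ClusterRole/ClusterRoleBinding manifest."""
    out = copy.deepcopy(manifest)
    kind = manifest.get("kind")
    if kind not in KIND_MAP:
        return out  # other kinds (e.g. the CRDs) are left untouched
    out["kind"] = KIND_MAP[kind]
    out.setdefault("metadata", {})["namespace"] = namespace
    if kind == "ClusterRole":
        # Assumption: drop non-resource URL rules, which are not namespaced.
        out["rules"] = [r for r in out.get("rules", []) if "nonResourceURLs" not in r]
    elif out.get("roleRef", {}).get("kind") == "ClusterRole":
        # The binding must now reference the translated Role.
        out["roleRef"]["kind"] = "Role"
    return out


# Hypothetical input, shaped like a cluster-wide RBAC manifest.
cluster_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "elastic-operator"},
    "rules": [
        {"apiGroups": [""], "resources": ["secrets"], "verbs": ["get", "list", "watch"]},
        {"nonResourceURLs": ["/metrics"], "verbs": ["get"]},
    ],
}
cluster_role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRoleBinding",
    "metadata": {"name": "elastic-operator"},
    "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "ClusterRole", "name": "elastic-operator"},
    "subjects": [{"kind": "ServiceAccount", "name": "elastic-operator", "namespace": "team-x"}],
}

role = to_namespaced(cluster_role, "team-x")
binding = to_namespaced(cluster_role_binding, "team-x")
print(role["kind"], binding["kind"], binding["roleRef"]["kind"])
```

The CRDs themselves still have to be installed cluster-wide by someone with the right permissions, as noted above.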