Describe the bug
I deployed the Kiali Operator v1.23.0 using the Helm chart. For extra context: I created a Helm chart that uses the kiali-operator Helm chart as a dependency and let ArgoCD deploy it on one of our internal K8s clusters. Below is the values.yaml for the Kiali Operator:
kiali-operator:
  env:
  - name: HTTP_PROXY
    value: <...>
  - name: HTTPS_PROXY
    value: <...>
  - name: NO_PROXY
    value: <...>
  watchNamespace: istio-system
  clusterRoleCreator: true
  cr:
    create: false
    spec:
      deployment:
        accessible_namespaces:
        - '**'
The operator was deployed in the kiali-operator namespace and started without any errors:
{"level":"info","ts":1600429174.008365,"logger":"cmd","msg":"Go Version: go1.13.10"}
{"level":"info","ts":1600429174.0084157,"logger":"cmd","msg":"Go OS/Arch: linux/amd64"}
{"level":"info","ts":1600429174.0084248,"logger":"cmd","msg":"Version of operator-sdk: v0.17.0"}
{"level":"info","ts":1600429174.0090594,"logger":"cmd","msg":"Watching all namespaces.","Namespace":""}
{"level":"info","ts":1600429175.667903,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":"0.0.0.0:8383"}
{"level":"info","ts":1600429175.6685896,"logger":"watches","msg":"Environment variable not set; using default value","envVar":"WORKER_KIALI_KIALI_IO","default":1}
{"level":"info","ts":1600429175.6688023,"logger":"ansible-controller","msg":"Watching resource","Options.Group":"kiali.io","Options.Version":"v1alpha1","Options.Kind":"Kiali"}
{"level":"info","ts":1600429175.6689615,"logger":"leader","msg":"Trying to become the leader."}
{"level":"info","ts":1600429177.353149,"logger":"leader","msg":"No pre-existing lock was found."}
{"level":"info","ts":1600429177.3674543,"logger":"leader","msg":"Became the leader."}
{"level":"info","ts":1600429180.767992,"logger":"metrics","msg":"Metrics Service object created","Service.Name":"kiali-kiali-operator-metrics","Service.Namespace":"kiali-operator"}
{"level":"info","ts":1600429182.4208694,"logger":"cmd","msg":"Could not create ServiceMonitor object","Namespace":"","error":"no ServiceMonitor registered with the API"}
{"level":"info","ts":1600429182.4210007,"logger":"cmd","msg":"Install prometheus-operator in your cluster to create ServiceMonitor objects","Namespace":"","error":"no ServiceMonitor registered with the API"}
{"level":"info","ts":1600429182.4220192,"logger":"proxy","msg":"Starting to serve","Address":"127.0.0.1:8888"}
{"level":"info","ts":1600429182.4223576,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
{"level":"info","ts":1600429182.422396,"logger":"controller-runtime.controller","msg":"Starting EventSource","controller":"kiali-controller","source":"kind source: kiali.io/v1alpha1, Kind=Kiali"}
{"level":"info","ts":1600429182.5228758,"logger":"controller-runtime.controller","msg":"Starting Controller","controller":"kiali-controller"}
{"level":"info","ts":1600429182.522922,"logger":"controller-runtime.controller","msg":"Starting workers","controller":"kiali-controller","worker count":1}
And then I created the following Kiali CR in the istio-system namespace:
apiVersion: kiali.io/v1alpha1
kind: Kiali
metadata:
  name: kiali
  annotations:
    ansible.operator-sdk/verbosity: "1"
spec:
  istio_component_namespaces:
    prometheus: monitoring-system
    grafana: monitoring-system
  istio_namespace: istio-system
  api:
    namespaces:
      exclude:
      - "*-system"
  auth:
    strategy: "anonymous"
  deployment:
    accessible_namespaces: "**"
    ingress_enabled: false
    service_type: ClusterIP
    view_only_mode: true
This resulted in the following error in the Kiali CR's status, and the deployment failed:
The operator cannot support deployment.accessible_namespaces set to '**'
because it does not have permissions to create clusterroles or clusterrolebindings
A similar error can be seen in the operator log:
TASK [default/kiali-deploy : debug] ********************************************
ok: [localhost] => {
"msg": "IMAGE_NAME=quay.io/kiali/kiali; IMAGE VERSION=v1.23.0; VERSION LABEL=v1.23.0"
}
TASK [default/kiali-deploy : Determine the Role and RoleBinding kinds that the operator will create and that the role templates will use] ***
ok: [localhost] => {"ansible_facts": {"role_binding_kind": "ClusterRoleBinding", "role_kind": "ClusterRole"}, "changed": false}
TASK [default/kiali-deploy : Determine if the operator can support accessible_namespaces=**] ***
fatal: [localhost]: FAILED! => {"changed": false, "msg": "The operator cannot support deployment.accessible_namespaces set to '**' because it does not have permissions to create clusterroles or clusterrolebindings"}
PLAY RECAP *********************************************************************
localhost : ok=37 changed=1 unreachable=0 failed=1 skipped=34 rescued=0 ignored=1
-------------------------------------------------------------------------------
{"level":"error","ts":1600430437.0781507,"logger":"controller-runtime.controller","msg":"Reconciler error","controller":"kiali-controller","request":"istio-system/kiali","error":"event runner on failed","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\tpkg/mod/github.com/go-logr/[email protected]/zapr.go:128\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tpkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:258\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tpkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\tpkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\tpkg/mod/k8s.io/[email protected]/pkg/util/wait/wait.go:88"}
Versions used
Kiali: 1.23.0
Istio: 1.7.1
Kubernetes flavour and version:
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.7", GitCommit:"b4455102ef392bf7d594ef96b97a4caa79d729d9", GitTreeState:"clean", BuildDate:"2020-06-17T11:32:20Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
I can confirm that the following resources are present on the cluster:
$ k get clusterrole kiali-kiali-operator -n kiali-operator
NAME AGE
kiali-kiali-operator 46m
$ k get clusterrolebindings kiali-kiali-operator -n kiali-operator
NAME AGE
kiali-kiali-operator 46m
$ k get sa -n kiali-operator
NAME SECRETS AGE
default 1 46m
kiali-kiali-operator 1 46m
The kiali-kiali-operator clusterrole has rights to create both clusterroles and clusterrolebindings:
...
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - clusterrolebindings
  - clusterroles
  - rolebindings
  - roles
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
...
The clusterrolebinding binds the correct serviceaccount to the correct clusterrole:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  ...
  labels:
    app: kiali-operator
    app.kubernetes.io/instance: kiali
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kiali-operator
    app.kubernetes.io/version: v1.23.0
    helm.sh/chart: kiali-operator-1.23.0
    version: v1.23.0
  name: kiali-kiali-operator
  ...
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kiali-kiali-operator
subjects:
- kind: ServiceAccount
  name: kiali-kiali-operator
  namespace: kiali-operator
Also, the following checks give the expected result:
$ k auth can-i create clusterrolebindings --as=system:serviceaccount:kiali-operator:kiali-kiali-operator
Warning: resource 'clusterrolebindings' is not namespace scoped in group 'rbac.authorization.k8s.io'
yes
$ k auth can-i create clusterroles --as=system:serviceaccount:kiali-operator:kiali-kiali-operator
Warning: resource 'clusterroles' is not namespace scoped in group 'rbac.authorization.k8s.io'
yes
Maybe I am still misinterpreting the error message ("The operator cannot support deployment.accessible_namespaces set to '**' because it does not have permissions to create clusterroles or clusterrolebindings"), but if not, then I think this is a bug.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I would have expected that the Operator could deploy Kiali on the istio-system namespace.
The Operator is looking for a clusterrole named kiali-operator.
This is probably not good. I would prefer the operator not look for a specifically named role because the name could be different under certain circumstances (I think if you install via helm-chart, you might be able to change the name of the clusterrole). So I'm thinking I should probably change that query to look for a specific label rather than a specific name (I'll have to check the code - the labels might not be specific either).
What was the actual command you used to install the operator? (I'm thinking you changed the value nameOverride or fullnameOverride?)
ArgoCD is actually deploying it, so I overrode the release name with releaseName: kiali-operator in the ArgoCD Application CR, and that fixed my problem.
I'm going to reopen this issue - I would like to see if I could get the operator to not key on a specific name of the resource, but instead figure out another way to determine if it has the proper permissions.
Maybe you can imitate kubectl auth can-i as the service account of the operator using the K8s client?
This code needs to be refactored so it doesn't look for a specifically named role. Instead, we should do a "can-i" type query: https://github.com/kiali/kiali-operator/blob/master/roles/v1.24/kiali-deploy/tasks/main.yml#L433-L442
If I can refactor this, we can close this PR https://github.com/kiali/kiali-operator/pull/133 since it is not needed.
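For illustration, here is a minimal sketch of what a "can-i" style check could look like. It is not the operator's actual code (the operator is Ansible-based); it only shows the shape of the Kubernetes SelfSubjectAccessReview request that kubectl auth can-i relies on. The function names and the post_review callable are hypothetical; post_review stands in for whatever client POSTs the body to the API server and returns the parsed response.

```python
# Sketch only: builds SelfSubjectAccessReview bodies and interprets the reply.
# The helper names and the `post_review` callable are hypothetical.

RBAC_GROUP = "rbac.authorization.k8s.io"


def build_access_review(verb, resource, group=RBAC_GROUP):
    """Return a SelfSubjectAccessReview body asking: may I <verb> <resource>?"""
    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SelfSubjectAccessReview",
        "spec": {
            "resourceAttributes": {
                "group": group,
                "resource": resource,
                "verb": verb,
            }
        },
    }


def is_allowed(review_response):
    """The API server answers in status.allowed; treat a missing field as 'no'."""
    return bool(review_response.get("status", {}).get("allowed", False))


def can_support_all_namespaces(post_review):
    """Check the two permissions needed for accessible_namespaces=['**'].

    `post_review` is a hypothetical callable that submits the review body
    to the API server and returns the parsed JSON response as a dict.
    """
    needed = [("create", "clusterroles"), ("create", "clusterrolebindings")]
    return all(
        is_allowed(post_review(build_access_review(verb, resource)))
        for verb, resource in needed
    )
```

Because the review is evaluated for whoever submits it, running this with the operator's service account credentials answers the permission question directly, regardless of what the clusterrole or Helm release is named. The review body is submitted with a POST to /apis/authorization.k8s.io/v1/selfsubjectaccessreviews.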