Operator-sdk: Golang example on OpenShift 4.3 cannot produce deployments (finalizer RBAC error)

Created on 21 Jul 2020 · 12 Comments · Source: operator-framework/operator-sdk

Bug Report

What did you do?
Followed https://sdk.operatorframework.io/docs/golang/quickstart/

What did you expect to see?

Memcached pods appearing

What did you see instead? Under which circumstances?

2020-07-21T08:47:07.870Z        INFO    controllers.Memcached   Creating a new Deployment       {"memcached": "default/memcached-sample", "Deployment.Namespace": "default", "Deployment.Name": "memcached-sample"}
2020-07-21T08:47:07.877Z        ERROR   controllers.Memcached   Failed to create new Deployment {"memcached": "default/memcached-sample", "Deployment.Namespace": "default", "Deployment.Name": "memcached-sample", "error": "deployments.apps \"memcached-sample\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}

No new pods. Above error comes from the logs of my operator.

Environment

  • operator-sdk version:
    0.19
  • go version:
    1.14.5
  • Kubernetes version information:
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"16+", GitVersion:"v1.16.2", GitCommit:"a02f27a", GitTreeState:"clean", BuildDate:"2020-04-13T12:04:13Z", GoVersion:"go1.12.12", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes cluster kind:

OpenShift 4.3

  • Are you writing your operator in ansible, helm, or go?

Go

Possible Solution

Presumably an RBAC rule is missing from role.yaml. Is there a marker missing somewhere that would grant the finalizer permissions?

Here's my generated role.yaml:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  creationTimestamp: null
  name: manager-role
rules:
- apiGroups:
  - apps
  resources:
  - deployments
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - cache.example.com
  resources:
  - memcacheds
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - cache.example.com
  resources:
  - memcacheds/status
  verbs:
  - get
  - patch
  - update
- apiGroups:
  - ""
  resources:
  - pods
  verbs:
  - get
  - list

My reconcile method as per the example:

// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/status,verbs=get;update;patch
// +kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
// +kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;

func (r *MemcachedReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
Labels: kind/documentation, triage/needs-information

All 12 comments

@a-roberts can you try adding the following marker and running make deploy again?

// +kubebuilder:rbac:groups=apps,resources=deployments/finalizers,verbs=get;create;update;patch;delete

Hi @a-roberts,

This looks like a duplicate of https://github.com/operator-framework/operator-sdk/issues/3590. See my comment in https://github.com/operator-framework/operator-sdk/issues/3590#issuecomment-665645542.

I am closing this one, and I'd like to ask you to follow up in https://github.com/operator-framework/operator-sdk/issues/3590 if possible. However, if you believe it should be reopened for any reason, please feel free to ping us and let us know.

Reopening after I hit the same error while testing on OpenShift 4.5.6:

ERROR   controllers.Memcached   Failed to create new Deployment {"memcached": "default/memcached-sample", "Deployment.Namespace": "default", "Deployment.Name": "memcached-sample", "error": "deployments.apps \"memcached-sample\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}

The fix for me was to give the operator's ClusterRole permissions to update the memcached CR's finalizers, with the following marker in the controller:

// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/finalizers,verbs=get;update;patch

which corresponds to this rule in role.yaml:

- apiGroups:
  - cache.example.com
  resources:
  - memcacheds/finalizers
  verbs:
  - get
  - patch
  - update

@estroz @camilamacedo86 @joelanford Sorry if I missed it, but did we discuss why we don't have the above in our docs and sample memcached controller, so that the quickstart guide works by default on OpenShift (or any cluster with OwnerReferencesPermissionEnforcement enabled)?

https://sdk.operatorframework.io/docs/building-operators/golang/tutorial/#specify-permissions-and-generate-rbac-manifests

I know there's similar discussion for the same bug for Helm operators in https://github.com/operator-framework/operator-sdk/issues/3767#issuecomment-679123829 which I think is for the same reason.

I wouldn't call this a bug, but it is definitely something to fix in the docs so that our quickstart example works on OpenShift by default.

This came up for the Helm operator as well in https://github.com/operator-framework/operator-sdk/issues/3767

What is it about OpenShift that causes this error that doesn't happen in vanilla Kubernetes?

I'm curious whether this is an OpenShift-specific API server customization, or whether it's an extra admission plugin that is not enabled by default in vanilla Kubernetes.

The reason I ask: Should we push to get this permission added to the default scaffold in Kubebuilder?

Found it: https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#ownerreferencespermissionenforcement

Given that this is built into upstream Kubernetes, I think we could make a case for including this in Kubebuilder's scaffolding. WDYT?
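
As a rough illustration of what that admission plugin enforces, the Go sketch below models the check described in the linked Kubernetes docs: setting blockOwnerDeletion on an ownerReference is only allowed if the requester has update permission on the finalizers subresource of the owner. The PolicyRule type and the helper functions are hypothetical simplifications (exact string matching, no wildcard handling), not the real API server code. Because the owner in this thread is the Memcached CR, the permission that matters is on memcacheds/finalizers.

```go
package main

import "fmt"

// PolicyRule is a simplified, hypothetical stand-in for an RBAC rule.
type PolicyRule struct {
	APIGroups []string
	Resources []string
	Verbs     []string
}

// contains reports whether want is present in list (exact match only;
// wildcards such as "*" are deliberately not modelled in this sketch).
func contains(list []string, want string) bool {
	for _, s := range list {
		if s == want {
			return true
		}
	}
	return false
}

// allows reports whether any rule grants verb on group/resource.
func allows(rules []PolicyRule, group, resource, verb string) bool {
	for _, r := range rules {
		if contains(r.APIGroups, group) && contains(r.Resources, resource) && contains(r.Verbs, verb) {
			return true
		}
	}
	return false
}

// canSetBlockOwnerDeletion models the admission check: the requester needs
// "update" on the owner's finalizers subresource.
func canSetBlockOwnerDeletion(rules []PolicyRule, ownerGroup, ownerResource string) bool {
	return allows(rules, ownerGroup, ownerResource+"/finalizers", "update")
}

func main() {
	// Rules taken from the generated role.yaml in the original report.
	generated := []PolicyRule{
		{APIGroups: []string{"apps"}, Resources: []string{"deployments"},
			Verbs: []string{"create", "delete", "get", "list", "patch", "update", "watch"}},
		{APIGroups: []string{"cache.example.com"}, Resources: []string{"memcacheds"},
			Verbs: []string{"create", "delete", "get", "list", "patch", "update", "watch"}},
		{APIGroups: []string{"cache.example.com"}, Resources: []string{"memcacheds/status"},
			Verbs: []string{"get", "patch", "update"}},
	}
	fmt.Println("generated role allowed:", canSetBlockOwnerDeletion(generated, "cache.example.com", "memcacheds"))

	// The same rules plus the memcacheds/finalizers rule from the fix.
	fixed := append(generated, PolicyRule{
		APIGroups: []string{"cache.example.com"},
		Resources: []string{"memcacheds/finalizers"},
		Verbs:     []string{"get", "patch", "update"},
	})
	fmt.Println("fixed role allowed:", canSetBlockOwnerDeletion(fixed, "cache.example.com", "memcacheds"))
}
```

Under this simplified model, full permissions on memcacheds itself are not enough; the finalizers subresource has to be listed separately.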

Yeah that admission controller seems to be a reasonable default to have in any non-vanilla cluster.
So, not unreasonable to have the kubebuilder controller scaffold have the <kind>/finalizers marker comment as well.

But in the meantime we should probably update our docs and sample controller to include a note on this, so that our example works on OpenShift or any non-vanilla cluster.

Hello,

FYI, I just ran into the same problem generating/installing the Go MemcachedStatus operator provided in https://github.com/operator-framework/operator-sdk/tree/master/testdata, with:

operator-sdk version: "v1.1.0", commit: "9d27e224efac78fcc9354ece4e43a50eb30ea968", kubernetes version: "v1.18.2", go version: "go1.15 linux/amd64", GOOS: "linux", GOARCH: "amd64"

and using OCP:

oc version
Client Version: 4.5.17
Server Version: 4.5.17
Kubernetes Version: v1.18.3+45b9524

Hi @fckbo,

The testdata/go/memcached operator does not have the finalizer permission it needs to work in OCP:

// +kubebuilder:rbac:groups=cache.example.com,resources=memcacheds/finalizers,verbs=get;update;patch

See:

https://github.com/operator-framework/operator-sdk/blob/master/testdata/go/memcached-operator/controllers/memcached_controller.go#L45-L48

We added this permission upstream in https://github.com/kubernetes-sigs/kubebuilder/pull/1688, but only for the v3+ plugin, which means it will ONLY be available in the SDK once that plugin version is supported here, which is not currently the case.

So, you can add the permission, run make manifests to generate the RBAC, build the project, and test it in OCP. However, I think we can mock the permission in the example as well. See: https://github.com/operator-framework/operator-sdk/pull/4162/files.

Hi @camilamacedo86,

thx for your answer, this is what I had done....it worked... and sorry, I actually did not realise that the fix would be included only in a future release. Thx for clarifying.

No problem, and thank you very much for your collaboration.
