Operator-sdk: Failed to list *v1.ServiceMonitor: v1.ListOptions is not suitable for converting to "monitoring.coreos.com/v1"

Created on 17 Sep 2019 · 8 Comments · Source: operator-framework/operator-sdk

Bug Report

When trying to Get an instance of a ServiceMonitor, the following appears in the logs every second:

E0917 14:23:29.341637    6838 reflector.go:134] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:95: Failed to list *v1.ServiceMonitor: v1.ListOptions is not suitable for converting to "monitoring.coreos.com/v1" in scheme "pkg/runtime/scheme.go:101"

What did you do?

Install the Prometheus Operator
Install a Prometheus CR, like:

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus

Run make run-debug from the branch: https://github.com/jpkrohling/opentelemetry-operator/tree/Add-Service-Monitor
Run kubectl apply -f https://raw.githubusercontent.com/jpkrohling/opentelemetry-operator/Add-Service-Monitor/examples/simplest.yaml

This is the code that attempts to retrieve the service monitors and that triggers the error:

existing := &monitoringv1.ServiceMonitor{}
err := r.client.Get(ctx, types.NamespacedName{Name: desired.Name, Namespace: desired.Namespace}, existing)

Note that the code never returns from the Get call, even when explicitly setting a context.WithTimeout(time.Second).

Environment

operator-sdk version: v0.10.0, commit: ff80b17737a6a0aade663e4827e8af3ab5a21170
go version: go1.12.9 linux/amd64
Kubernetes version information:

Client Version: version.Info{Major:"1", Minor:"10+", GitVersion:"v1.10.0+d4cacc0", GitCommit:"d4cacc0", GitTreeState:"clean", BuildDate:"2018-12-06T15:15:06Z", GoVersion:"go1.11.2", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.2", GitCommit:"f6278300bebbb750328ac16ee6dd3aa7d3549568", GitTreeState:"clean", BuildDate:"2019-08-05T09:15:22Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}

Kubernetes cluster kind: minikube
Are you writing your operator in ~ansible, helm,~ or go?

Possible Solution
Using the REST client from the monitoring package (github.com/coreos/prometheus-operator/pkg/client/versioned/typed/monitoring/v1) instead of the client from the controller-runtime package (sigs.k8s.io/controller-runtime/pkg/client) sounds like a possible solution, but I have yet to try it. If that's the official workaround, the question then is: should we start using rest.Client (k8s.io/client-go/rest) everywhere instead of controller-runtime's client?

kind/bug lifecycle/rotten triage/needs-information

Source

jpkrohling

All 8 comments

https://github.com/kubernetes-sigs/controller-runtime/pull/580

I am actively attempting to fix the timeout issue.

shawn-hurley on 17 Sep 2019

👍2

Any news on this? I wanted to clarify whether a fix for this is planned for the next release, or if I should invest some time in a workaround.

jpkrohling on 24 Sep 2019

Is there any recommendation at all about a workaround?

jpkrohling on 2 Oct 2019

I decided not to use controller-runtime's client and I'm moving to the client-go REST client. Therefore, feel free to close this.

It would have been nice to get official guidance from the Operator SDK team, though.

jpkrohling on 11 Oct 2019

Hi @jpkrohling,

I am not able to create a POC to help you right now (it would take a while), but these are the steps I would perform to check it.

Install the Prometheus Operator: https://github.com/coreos/kube-prometheus
Create the Memcached Getting Started POC example: https://github.com/operator-framework/getting-started using the latest version 0.11.0
NOTE: Ensure that you apply the RBAC roles and files with the admin user
Ensure that main.go in the project has the required implementation to create the Service and ServiceMonitor for the operator, as in the example here
Then, install everything in Minikube and ensure that it is all running successfully.
Get the name of the ServiceMonitor that was created with kubectl get ServiceMonitor -n <name>
Then check it with the following Get example in the Reconcile:

    import (
        "context"

        monitoringv1 "github.com/coreos/prometheus-operator/pkg/apis/monitoring/v1"
        "k8s.io/apimachinery/pkg/types"
    )

    ...
    // Set Name to the ServiceMonitor name reported by `kubectl get ServiceMonitor -n <name>`.
    serviceMonitor := &monitoringv1.ServiceMonitor{}
    err = r.client.Get(context.TODO(), types.NamespacedName{Name: "<service monitor name>", Namespace: memcached.Namespace}, serviceMonitor)
    ....
    } else if err != nil {
        reqLogger.Error(err, "Failed to get ServiceMonitor.")
        return reconcile.Result{}, err
    }

So, it will probably not work and you may hit the issue described here, since the ServiceMonitor scheme was not added to the project; that would confirm that the missing scheme is the cause. See here how to add the scheme.

So, I would add the ServiceMonitor scheme in main.go and check whether that solves it.
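To illustrate why a missing registration could matter, here is a self-contained toy model of a type registry. It is only a sketch of the idea: `Scheme`, `AddKnownType`, `KindFor`, and the `ServiceMonitor` struct below are simplified stand-ins defined for this example, not the apimachinery or prometheus-operator types. In a real operator, the equivalent step would presumably be a call along the lines of `monitoringv1.AddToScheme(mgr.GetScheme())` in main.go, assuming the monitoring package exposes `AddToScheme`.

```go
package main

import (
	"fmt"
	"reflect"
)

// GroupVersionKind is a simplified stand-in for the Kubernetes type of the same name.
type GroupVersionKind struct {
	Group, Version, Kind string
}

// Scheme is a toy registry mapping Go types to API kinds. It only models the
// idea behind a client scheme; it is not the apimachinery implementation.
type Scheme struct {
	kinds map[reflect.Type]GroupVersionKind
}

// NewScheme returns an empty registry.
func NewScheme() *Scheme {
	return &Scheme{kinds: make(map[reflect.Type]GroupVersionKind)}
}

// AddKnownType registers the Go type of obj under gvk.
func (s *Scheme) AddKnownType(gvk GroupVersionKind, obj interface{}) {
	s.kinds[reflect.TypeOf(obj)] = gvk
}

// KindFor returns the kind registered for obj's Go type, or an error if the
// type was never added to the scheme.
func (s *Scheme) KindFor(obj interface{}) (GroupVersionKind, error) {
	gvk, ok := s.kinds[reflect.TypeOf(obj)]
	if !ok {
		return GroupVersionKind{}, fmt.Errorf("no kind registered for %T", obj)
	}
	return gvk, nil
}

// ServiceMonitor is a placeholder type for this sketch only.
type ServiceMonitor struct{ Name string }

func main() {
	scheme := NewScheme()

	// Before registration, the lookup a client would need fails.
	if _, err := scheme.KindFor(&ServiceMonitor{}); err != nil {
		fmt.Println("before registration:", err)
	}

	// Registering the type models the step suggested for main.go.
	scheme.AddKnownType(
		GroupVersionKind{Group: "monitoring.coreos.com", Version: "v1", Kind: "ServiceMonitor"},
		&ServiceMonitor{},
	)

	gvk, err := scheme.KindFor(&ServiceMonitor{})
	fmt.Println("after registration:", gvk.Kind, err)
}
```

The point of the model is only that a client which resolves Go types through a registry cannot serve a type that was never registered. Whether that is the actual cause of the error in this issue was not confirmed in the thread.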

Note that SDK 0.11.0 uses a newer version of controller-runtime, which may also include related bug fixes. If you are still facing the issue after all of this, the solution may be to help @shawn-hurley move forward with https://github.com/kubernetes-sigs/controller-runtime/pull/580, or to avoid the controller-runtime client as a workaround for now.

I hope this helps.

camilamacedo86 on 15 Oct 2019

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-bot on 13 Jan 2020

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

openshift-bot on 12 Feb 2020

Hi @jpkrohling,

I am closing this one since we were unable to reproduce this scenario, as you can see in the comment https://github.com/operator-framework/operator-sdk/issues/1927#issuecomment-542166779 made on 15 Oct 2019, and no further information was sent that would let us check it.

However, please feel free to raise a new issue if you still need help with this.

camilamacedo86 on 10 Mar 2020
