Describe the bug
After removing the prometheus-operator Helm chart from a Kubernetes cluster, the kubelet scraper Service objects still exist. This leads to duplicate/triplicate scraped values on subsequent operator installations and causes a lot of confusion.
Version of Helm and Kubernetes:
$ helm version
Client: &version.Version{SemVer:"v2.14.3", GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}
Server: &version.Version{SemVer:"v2.14.3", GitCommit:"0e7f3b6637f7af8fcfddb3d2941fcc7cbebb0085", GitTreeState:"clean"}
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.3", GitCommit:"2d3c76f9091b6bec110a5e63777c332469e0cba2", GitTreeState:"clean", BuildDate:"2019-08-19T11:13:54Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.6", GitCommit:"96fac5cd13a5dc064f7d9f4f23030a6aeface6cc", GitTreeState:"clean", BuildDate:"2019-08-19T11:05:16Z", GoVersion:"go1.12.9", Compiler:"gc", Platform:"linux/amd64"}
$ helm ls prometheus-operator
NAME                 REVISION  UPDATED                   STATUS    CHART                       APP VERSION  NAMESPACE
prometheus-operator  2         Fri Oct 18 09:24:07 2019  DEPLOYED  prometheus-operator-6.18.0  0.32.0       monitoring
$ kubectl get svc -n kube-system -l k8s-app=kubelet
NAME                                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE
halting-stoat-prometheus-o-kubelet   ClusterIP   None         <none>        10250/TCP   48d
monitoring-prometheus-oper-kubelet   ClusterIP   None         <none>        10250/TCP   48d
prometheus-operator-kubelet          ClusterIP   None         <none>        10250/TCP   48d
$ helm install stable/prometheus-operator --name test-install
$ helm delete test-install --purge
$ kubectl get service -n kube-system -l k8s-app=kubelet
NAME                                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE
test-install-prometheus-op-kubelet   ClusterIP   None         <none>        10250/TCP   8m15s
Anything else we need to know:
Already reported before: https://github.com/helm/charts/issues/14595
and mentioned here: https://github.com/helm/charts/issues/13669
This service is created dynamically by coreos/prometheus-operator and not managed by the helm chart. Unfortunately there is nothing that can be done to clean this up from the helm perspective.
OK, I see: this service is not managed by the Helm chart like the others are, but there should be some way to address it.
Can the creation procedure be improved to check if there's already an existing service?
Can there be Helm 3 improvements that would help to deal with this better?
If there turns out to be no way to handle this, can you please adjust the 'Removal' procedure in the README and add instructions for removing it manually?
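For reference, manual removal would look something like this (a sketch; "halting-stoat" is an example leftover release name taken from the listing above, substitute your own):

```shell
# Sketch: delete the kubelet Service left behind by a removed release.
# RELEASE is an example leftover release name; substitute your own.
RELEASE=halting-stoat
cmd="kubectl -n kube-system delete service ${RELEASE}-prometheus-o-kubelet"
echo "$cmd"   # printed as a dry run; run the command itself to actually delete
```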
Will see what can be done!
I've experimented with a cleanup process, but the more I look at this, the more it appears that Helm 2 does not allow a post-delete cleanup of the kind necessary to do this properly. I'm afraid documentation is the only option in this case.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
This issue is being automatically closed due to inactivity.
I have solved it.
Example:
kubectl get svc -n kube-system -l k8s-app=kubelet
NAME                                 TYPE        CLUSTER-IP
halting-stoat-prometheus-o-kubelet   ClusterIP   None
monitoring-prometheus-oper-kubelet   ClusterIP   None
prometheus-operator-kubelet          ClusterIP   None
Keep the service matching your most recent installation and delete the others.
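That "keep one, delete the rest" step can be sketched like this (the service names come from the listing above; that "monitoring-prometheus-oper-kubelet" belongs to the current release is an assumption, adjust for your cluster):

```shell
# Sketch: keep the kubelet Service of the current release and print delete
# commands for the leftovers. CURRENT is an assumption; set it to the service
# created by your most recent installation.
CURRENT=monitoring-prometheus-oper-kubelet
leftovers=""
for svc in halting-stoat-prometheus-o-kubelet monitoring-prometheus-oper-kubelet prometheus-operator-kubelet; do
  [ "$svc" = "$CURRENT" ] || leftovers="$leftovers $svc"
done
for svc in $leftovers; do
  echo "kubectl -n kube-system delete svc $svc"   # dry run; drop the echo to delete
done
```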
@vsliouniaev Documentation will be really helpful on this. We burnt several hours trying to figure out a double-counting issue caused by a kubelet service left over from a previously enabled Prometheus deployment.
@sameerbhadouria I tried deleting ALL the kubelet services and endpoints (hoping that prometheus-operator would recreate them), but that doesn't happen, and now I don't have them in the cluster at all.
The operator's input parameters are:
Args:
  --kubelet-service=kube-system/kubelet
  --logtostderr=true
  --config-reloader-image=quay.io/coreos/configmap-reload:v0.0.1
  --prometheus-config-reloader=quay.io/coreos/prometheus-config-reloader:v0.29.0
  --log-level=all
Do you have any tips about that?
@pie-r What version of the chart are you using?
I didn't face any issues recreating the services. The creation of the kubelet service is enabled by default; make sure you have it enabled. Also, make sure you are looking in the right namespace (kube-system) for the service. Some dashboards (e.g. GKE) have a filter to hide system objects (Is system object: False), and you won't see kube-system services while that filter is enabled.
Next I would try to upgrade the release version with helm upgrade. Worst case, if you are still testing this setup, maybe just delete the Helm release. Make sure to also delete the CRDs:
kubectl delete crd prometheuses.monitoring.coreos.com
kubectl delete crd prometheusrules.monitoring.coreos.com
kubectl delete crd servicemonitors.monitoring.coreos.com
kubectl delete crd podmonitors.monitoring.coreos.com
kubectl delete crd alertmanagers.monitoring.coreos.com
kubectl delete crd thanosrulers.monitoring.coreos.com
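The six deletions above can also be generated in a loop (sketch; printed as a dry run so nothing is deleted by accident):

```shell
# The CRD deletions above, generated in a loop. Each command is printed as a
# dry run; pipe the output to sh to actually delete the CRDs.
crd_cmds=$(for kind in prometheuses prometheusrules servicemonitors podmonitors alertmanagers thanosrulers; do
  echo "kubectl delete crd ${kind}.monitoring.coreos.com"
done)
echo "$crd_cmds"
```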
Hope this helps!
Thank you for your help and fast reply!
I'm not using Helm, but I came here because my issue is very similar. I tried a lot of things:
kubectl -n kube-system edit endpoints kubelet
kubectl -n kube-system delete svc kubelet
kubectl -n kube-system delete endpoints kubelet
I also tried both forms of the --kubelet-service argument in the Args:
  --kubelet-service=kube-system::kubelet
  --kubelet-service=kube-system/kubelet
Nothing triggers the recreation of the service and endpoints.
I also read the source code, and it seems quite simple: every 3 minutes it tries to createOrUpdateEndpoints and also the Service. But this only happens if the --kubelet-service flag is set in the Args. It broke after upgrading from GKE 1.15 to 1.16.
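One thing worth double-checking, since both kube-system::kubelet and kube-system/kubelet appear above: the operator expects the --kubelet-service value in <namespace>/<name> form. A quick sanity check (a sketch; check_kubelet_service is a hypothetical helper, not part of prometheus-operator):

```shell
# Sketch: verify a --kubelet-service value uses the "<namespace>/<name>"
# form before passing it to the operator. check_kubelet_service is a
# hypothetical helper, not part of prometheus-operator.
check_kubelet_service() {
  case "$1" in
    */*) return 0 ;;                                          # namespace/name: OK
    *)   echo "expected <namespace>/<name>, got: $1" >&2; return 1 ;;
  esac
}
check_kubelet_service "kube-system/kubelet" && echo "kube-system/kubelet: ok"
check_kubelet_service "kube-system::kubelet" 2>/dev/null || echo "kube-system::kubelet: rejected"
```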
/reopen
Same problem for me; could you please /reopen?