Deleting a PVC fails to recycle the PV
oc v3.6.0+c4dd4cf
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://127.0.0.1:8443
openshift v3.6.0+c4dd4cf
kubernetes v1.6.1+5115d708d7
$ oc login -u system:admin
$ oc get pv
NAME     CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS   CLAIM                       STORAGECLASS   REASON   AGE
pv0005   100Gi      RWO,ROX,RWX   Recycle         Failed   kafka-dev/datadir-kafka-1                           11m
Name: pv0005
Labels: volume=pv0005
Annotations: pv.kubernetes.io/bound-by-controller=yes
StorageClass:
Status: Failed
Claim: kafka-dev/datadir-kafka-1
Reclaim Policy: Recycle
Access Modes: RWO,ROX,RWX
Capacity: 100Gi
Message: Recycle failed: unexpected error creating recycler pod: pods "recycler-for-pv0013" is forbidden: service account openshift-infra/pv-recycler-controller was not found, retry after the service account is created
Source:
Type: HostPath (bare host directory volume)
Path: /var/lib/origin/openshift.local.pv/pv0005
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
5m 5m 1 persistentvolume-controller Warning VolumeFailedRecycle Recycle failed: unexpected error creating recycler pod: pods "recycler-for-pv0013" is forbidden: service account openshift-infra/pv-recycler-controller was not found, retry after the service account is created
I did an oc cluster down, then moved /var/lib/origin to .old and rebooted. Same issue.
However, after oc cluster down/up, all PVs show as fine/Available when the cluster comes back up.
Using Ubuntu 16.04 LTS, up to date with packages.
I would expect Origin to just recycle PVs automatically; there should be no need for me to do anything, right?
Here is the template to create the setup. You can do oc new-app -f <template> to create it:
apiVersion: v1
kind: Template
metadata:
  name: kafka
  annotations:
    # openshift.io/display-name: "Kafka Container Cluster"
    description: "Kafka"
    iconClass: "icon-openjdk"
    tags: "kafka,zookeeper"
objects:
# A headless service to create DNS records
- apiVersion: v1
  kind: Service
  metadata:
    annotations:
      service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
    name: broker
  spec:
    ports:
    - port: 9092
    # [podname].broker.kafka.svc.cluster.local
    clusterIP: None
    selector:
      app: kafka
# The real service
- apiVersion: v1
  kind: Service
  metadata:
    name: kafka
  spec:
    ports:
    - port: 9092
    selector:
      app: kafka
- apiVersion: apps/v1beta1
  kind: StatefulSet
  metadata:
    name: kafka
  spec:
    serviceName: "broker"
    replicas: ${REPLICAS}
    template:
      metadata:
        labels:
          app: kafka
        annotations:
          pod.alpha.kubernetes.io/initialized: "true"
      spec:
        containers:
        - name: broker
          image: ${IMAGE}
          ports:
          - containerPort: 9092
          command:
          - sh
          - -c
          - "./bin/kafka-server-start.sh config/server.properties --override broker.id=$(hostname | awk -F'-' '{print $2}')"
          volumeMounts:
          - name: datadir
            mountPath: /opt/kafka/data
    volumeClaimTemplates:
    - metadata:
        name: datadir
        annotations:
          volume.alpha.kubernetes.io/storage-class: anything
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: ${PVC_SIZE}
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: datadir-kafka-0
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: ${PVC_SIZE}
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: datadir-kafka-1
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: ${PVC_SIZE}
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: datadir-kafka-2
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: ${PVC_SIZE}
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: datadir-kafka-3
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: ${PVC_SIZE}
- apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: datadir-kafka-4
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: ${PVC_SIZE}
parameters:
- description: Number of kafka pods (max 5; only 5 PVCs in the template)
  name: REPLICAS
  value: '3'
- description: datadir-kafka PVC size
  name: PVC_SIZE
  value: 100Mi
- description: Kafka container image
  name: IMAGE
  value: spicysomtam/kafka:0.10
labels:
  template: kafka
I am having this issue as well. Upgrading a v1.5 cluster to v3.6 was fine, but a new v3.6 cluster does not appear to have the correct service account for the PV recycler.
It looks like there was a move of the OpenShift controller roles that might not have been reflected in the recycler. That would mean that upgraded clusters still have the old roles and service accounts.
714f56a3aa75f047a05fd12fe3beb577417b6879
Ran into the same issue after moving from 1.4 to 3.6. Resolved it by creating the missing service account:
$ oc describe pv FAILED_PV
...
Events:
FirstSeen LastSeen Count From SubObjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
13m 13m 1 persistentvolume-controller Warning VolumeFailedRecycle Recycle failed: unexpected error creating recycler pod: pods "recycler-for-pv007" is forbidden: service account openshift-infra/pv-recycler-controller was not found, retry after the service account is created
oc create serviceaccount pv-recycler-controller -n openshift-infra
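If you manage the cluster declaratively, the same one-line fix can be expressed as a manifest. This is a minimal sketch, equivalent to the oc create serviceaccount command above (only the ServiceAccount itself; the name and namespace are the ones the recycler error message complains about):

```yaml
# Recreates the service account the PV recycler expects to find.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: pv-recycler-controller
  namespace: openshift-infra
```

Save it to a file and apply it with oc create -f <file>.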
@alexcreek I did the same thing but my PVs are still in Failed state.
oc get ClusterRoles | grep persistent-
oc get clusterrolebindings | grep pv-recycler-controller
oadm policy add-cluster-role-to-user system:persistent-volume-provisioner pv-recycler-controller
oadm policy add-cluster-role-to-user system:controller:persistent-volume-binder pv-recycler-controller
Still no luck..
Edit:
Those PVs were claimed by PVCs that were all deleted during a project wipeout (oc delete all --all). I don't yet know why pv-recycler-controller didn't clear them, but for anyone trying to get their PVs free again, this is how I made mine Available: I edited each one individually with oc edit pv <pv_name> and deleted the claimRef section from the YAML. They then showed as Available again.
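For clusters with many Failed PVs, the manual oc edit step above can be scripted. A minimal sketch, assuming cluster-admin access and the 3.6 `oc get pv` column layout shown earlier; the `failed_pvs` helper is a name of my own, not an oc command:

```shell
# Print the names of PVs whose STATUS column (5th in the
# `oc get pv` output shown above) reads "Failed".
failed_pvs() {
  awk '$5 == "Failed" { print $1 }'
}

# Drop the stale claimRef from each Failed PV so it returns to
# Available. Guarded so this is a no-op on machines without `oc`.
if command -v oc >/dev/null 2>&1; then
  oc get pv --no-headers | failed_pvs | while read -r pv; do
    oc patch pv "$pv" --type=json -p '[{"op": "remove", "path": "/spec/claimRef"}]'
  done
fi
```

The JSON-Patch remove in oc patch does the same thing as deleting the claimRef block in oc edit, just non-interactively.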
Edit2:
I have created a CASE for this in Red Hat. Will be posting updates here.
+1 for oc create serviceaccount pv-recycler-controller -n openshift-infra resolving the issue. The OpenShift project as a whole seems pretty buggy; the docs were also missing the direction to chmod an NFS PV before using it.
The following command solved my problem:
oc create serviceaccount pv-recycler-controller -n openshift-infra
But shouldn't it be provisioned by the Ansible install?
My Version:
oc v3.6.1+008f2d5
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://centos-51:8443
kubernetes v1.6.1+5115d708d7
It solved my issue too! Thanks.
Using v3.6.0+c4dd4cf
Issues go stale after 90d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle stale
Rotten issues close after 30d of inactivity.
Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.
/close
/reopen
@ginigangadharan: You can't reopen an issue/PR unless you authored it or you are a collaborator.
In response to this:
/reopen
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.