Origin: Failing initialization of containers when creating statefulset zookeeper

Created on 22 Mar 2017 · 25 comments · Source: openshift/origin

When creating the zookeeper statefulset from origin/examples/statefulsets/zookeeper/, the first pod fails to start and is stuck in the init state. I assume it fails in the init containers while installing ZooKeeper.

Version
# openshift version
openshift v1.4.1
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0
Steps To Reproduce
  1. oc new-project zookeeper
  2. oc create -f volume.yaml
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pvc-datadir-zoo-0
  labels:
    type: local
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp"
  3. oc create -f zookeeper.yaml (without volume.alpha.kubernetes.io/storage-class)
Current Result

Container is stuck at:
container "zk" in pod "zoo-0" is waiting to start: PodInitializing

in logs I can see:

installing config scripts into /work-dir
installing zookeeper-3.5.0-alpha into /opt
gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
mv: cannot stat '/opt/zookeeper-3.5.0-alpha': No such file or directory
cp: cannot stat '/opt/zookeeper/conf/zoo_sample.cfg': No such file or directory
zookeeper-3.5.0-alpha supports dynamic reconfiguration, enabling it
/install.sh: line 66: /opt/zookeeper/conf/zoo.cfg: No such file or directory
/install.sh: line 67: /opt/zookeeper/conf/zoo.cfg: No such file or directory
copying nc into /opt
2017/03/22 13:19:18 lookup zk on 192.168.122.254:53: read udp 172.17.0.3:36608->192.168.122.254:53: read: no route to host
2017/03/22 13:19:24 lookup zk on 192.168.122.254:53: read udp 172.17.0.3:47656->192.168.122.254:53: read: no route to host
2017/03/22 13:19:30 lookup zk on 192.168.122.254:53: read udp 172.17.0.3:34550->192.168.122.254:53: read: no route to host
2017/03/22 13:19:36 lookup zk on 192.168.122.254:53: read udp 172.17.0.3:43606->192.168.122.254:53: read: no route to host
2017/03/22 13:19:42 lookup zk on 192.168.122.254:53: read udp 172.17.0.3:49202->192.168.122.254:53: read: no route to host
2017/03/22 13:19:53 lookup zk on 192.168.122.254:53: read udp 172.17.0.3:55644->192.168.122.254:53: i/o timeout
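The repeated lookup failures above are the init container trying to resolve the statefulset's governing headless service (named zk) through cluster DNS. As a point of reference, a minimal sketch of such a headless service, assuming the example's app=zk selector and the standard ZooKeeper peer ports, would look like:

```yaml
# Sketch only -- port names/numbers and the selector are assumptions based on
# common ZooKeeper examples, not copied from the origin repo.
apiVersion: v1
kind: Service
metadata:
  name: zk
spec:
  clusterIP: None        # headless: DNS returns pod records (zoo-0.zk, ...)
  ports:
  - port: 2888
    name: peer
  - port: 3888
    name: leader-election
  selector:
    app: zk
```

If this service is missing, or cluster DNS is unreachable from pods (as the "no route to host" errors suggest), the peer lookup can never succeed.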
# oc get pv
NAME                CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                     REASON    AGE
pvc-datadir-zoo-0   20Gi       RWO           Retain          Bound     zookeeper/datadir-zoo-2             6m
pvc-datadir-zoo-1   20Gi       RWO           Retain          Bound     zookeeper/datadir-zoo-0             6m
pvc-datadir-zoo-2   20Gi       RWO           Retain          Bound     zookeeper/datadir-zoo-1             6m

# oc get pvc
NAME            STATUS    VOLUME              CAPACITY   ACCESSMODES   AGE
datadir-zoo-0   Bound     pvc-datadir-zoo-1   20Gi       RWO           8m
datadir-zoo-1   Bound     pvc-datadir-zoo-2   20Gi       RWO           8m
datadir-zoo-2   Bound     pvc-datadir-zoo-0   20Gi       RWO           8m

Result of #13168

component/kubernetes kind/bug lifecycle/rotten priority/P2

All 25 comments

 hostPath:
      path: "/tmp"

That looks suspicious. I don't know all your PVs, but if two or more such PVs use /tmp and they end up in pods on the same node, they may overwrite each other's data. I'm not sure that's the case here; just stay alert.

I have edited yaml to:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-datadir-zoo-0
  labels:
    type: local
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/zoo-0"

Now all my PVs, if I get it right, have different storage locations.
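The edit above can be done for all replicas at once; a small sketch (hypothetical helper, not from the repo) that generates one PV manifest per replica, each backed by a distinct hostPath:

```shell
# Generate one PersistentVolume manifest per replica, each with its own
# hostPath, so co-located pods cannot clobber each other's data.
for i in 0 1 2; do
  cat > "pv-datadir-zoo-$i.yaml" <<EOF
kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-datadir-zoo-$i
  labels:
    type: local
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/zoo-$i"
EOF
done
# Then create them, e.g.:
# oc create -f pv-datadir-zoo-0.yaml -f pv-datadir-zoo-1.yaml -f pv-datadir-zoo-2.yaml
```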

My issue is still on the table :-(.

I initially thought it was an issue with hostPath mounts; I've tried applying this rule, but that still hasn't solved my problem. Still looking into it...

@Tiboris based on this stateful set doc and after talking with some folks during KubeCon, I'm afraid to say that it's not yet possible to use StatefulSet with hostPath mounts. It's in the plan, but currently only dynamically provisioned volumes are available.

@Tiboris Did you try commenting out the volume.alpha.kubernetes.io/storage-class: anything annotation on the PVC? AFAIU it triggers dynamic provisioning, while you want to use static PVs.

I was successfully using PetSets with hostPath a couple of months ago, so I would be surprised if it no longer works with StatefulSets.

@Tiboris OK, I've re-read the issue and see that you've tried it. Could you try the MongoDB example? If it works, then there is probably a bug related to using init containers with StatefulSets.

@php-coder Thank you, I will try it definitely.

@soltysh I must have somehow missed that line in _'limitations'_.
But I think there is an understandable reason why I tried to use hostPath mounts instead of dynamically provisioned ones. Issue #13168 appeared just after I was unable to create the zookeeper statefulset using:

oc create -f https://raw.githubusercontent.com/openshift/origin/master/examples/statefulsets/zookeeper/zookeeper.yaml

as I was just following README.md:

Create the statefulset in this directory
$ kubectl create -f zookeeper.yaml

it shows:

_SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "datadir-zoo-0", which is unexpected._

and the # oc get pvc command shows:

NAME            STATUS    VOLUME    CAPACITY   ACCESSMODES   AGE
datadir-zoo-0   Pending                                      9m
datadir-zoo-1   Pending                                      9m
datadir-zoo-2   Pending                                      9m

Is there something wrong with my setup, or what exactly is preventing the volumes from being provisioned?

SchedulerPredicates failed due to PersistentVolumeClaim is not bound: "datadir-zoo-0", which is unexpected.

oc describe pvc should tell you what's wrong; most probably alpha dynamic provisioning failed. If you don't want dynamic provisioning, remove the volume.alpha.kubernetes.io/storage-class annotation from the PVC template in your stateful set and OpenShift will try to find an existing PV. Alpha provisioning is kind of counter-intuitive: it always provisions a new volume even when there are existing PVs that could be used.
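Concretely, the change lives in the statefulset's volumeClaimTemplates section. A sketch of what it would look like with the annotation removed (the datadir name and 20Gi size are taken from the reproduction above; the exact field layout of the example's template is an assumption):

```yaml
volumeClaimTemplates:
- metadata:
    name: datadir
    # annotations:
    #   volume.alpha.kubernetes.io/storage-class: anything   # removed to use static PVs
  spec:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 20Gi
```

With the annotation gone, the PV controller binds each claim to any Available PV that satisfies the size and access mode instead of attempting alpha provisioning.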

So # oc describe pvc results in:

Name:       datadir-zoo-0
Namespace:  zoo
StorageClass:   anything
Status:     Pending
Volume:     
Labels:     app=zk
Capacity:   
Access Modes:   
Events:
  FirstSeen LastSeen    Count   From                SubobjectPath   Type        Reason          Message
  --------- --------    -----   ----                -------------   --------    ------          -------
  15s       3s      2   {persistentvolume-controller }          Warning     ProvisioningFailed  cannot find volume plugin for alpha provisioning


Name:       datadir-zoo-1
Namespace:  zoo
StorageClass:   anything
Status:     Pending
Volume:     
Labels:     app=zk
Capacity:   
Access Modes:   
Events:
  FirstSeen LastSeen    Count   From                SubobjectPath   Type        Reason          Message
  --------- --------    -----   ----                -------------   --------    ------          -------
  14s       3s      2   {persistentvolume-controller }          Warning     ProvisioningFailed  cannot find volume plugin for alpha provisioning


Name:       datadir-zoo-2
Namespace:  zoo
StorageClass:   anything
Status:     Pending
Volume:     
Labels:     app=zk
Capacity:   
Access Modes:   
Events:
  FirstSeen LastSeen    Count   From                SubobjectPath   Type        Reason          Message
  --------- --------    -----   ----                -------------   --------    ------          -------
  14s       3s      2   {persistentvolume-controller }          Warning     ProvisioningFailed  cannot find volume plugin for alpha provisioning

@soltysh Obviously something is wrong. Without the volume.alpha.kubernetes.io/storage-class annotation and without created volumes I get:
FailedBinding no persistent volumes available for this claim and no storage class is set
So I assume that is expected because I have no volumes.

It is in the limitations, and you said it too:

it's not yet possible to use StatefulSet with hostPath mounts.

So my next question is:
How do I create volumes without hostPath, in the hope that they will get bound to the claims?

@php-coder I have tried the mongodb example
with these volumes:

kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-datadir-zoo-0 
  labels:
    type: local
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  nfs:
    path: /tmp/zoo0
    server: 127.0.0.1

and


kind: PersistentVolume
apiVersion: v1
metadata:
  name: pv-datadir-zoo-3 
  labels:
    type: local
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/zoo-3"

but I get:

# oc get pv
NAME               CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM     REASON    AGE
pv-datadir-zoo-0   20Gi       RWO           Retain          Available                       1m
pv-datadir-zoo-1   20Gi       RWO           Retain          Available                       1m
pv-datadir-zoo-2   20Gi       RWO           Retain          Available                       1m
pv-datadir-zoo-3   20Gi       RWO           Retain          Available                       1m
#  oc new-app https://raw.githubusercontent.com/sclorg/mongodb-container/master/examples/petset/mongodb-petset-persistent.yaml
--> Deploying template "mongodb/mongodb-petset-replication" for "https://raw.githubusercontent.com/sclorg/mongodb-container/master/examples/petset/mongodb-petset-persistent.yaml" to project mongodb

     mongodb-petset-replication
     ---------
     MongoDB Replication Example (based on StatefulSet). You must have persistent volumes available in your cluster to use this template.

     * With parameters:
        * MongoDB Connection Username=kPE # generated
        * MongoDB Connection Password=ORty1DPq6vEUcGVb # generated
        * MongoDB Database Name=sampledb
        * MongoDB Admin Password=g8qDeQOKBVC6fNuk # generated
        * Replica Set Name=rs0
        * Keyfile Content=6QAogrEVu25bXXeHlkSrawbCwAnJVPdbfTBpqIePoWVhlugBGnPwj6fmeiX7EHhm3Yf861Qc0IccUTdnA4hXDURvtik73ob2q8gH5nWCI0EdMqVFgAXDq0DaDVH8CbnUNUiAVnu5FPKsxVo8c8hf4l7jBngpMVOOTKA0wXc0lYai7ERH53UQjVIT8556WuyS4tNGH5Pm2sxYewvAfqsKc53xM1blEKQI7BHEPgaYFfFHXhgufFrqhtOiqd4FlB2 # generated
        * MongoDB Docker Image=centos/mongodb-32-centos7
        * OpenShift Service Name=mongodb
        * Volume Capacity=1Gi
        * Memory Limit=512Mi

error: object does not implement the Object interfaces

Why won't it take these volumes? Do you have any suggestions?

error: object does not implement the Object interfaces

@Tiboris This looks like a different issue. Could you create one in the origin repo?

@php-coder #13615 Done.

How do I create volumes without hostPath, in the hope that they will get bound to the claims?

Any other type of volume should do the trick, I hope. I talked with the people behind StatefulSet last week during KubeCon EU and they said that supporting host volumes is on the roadmap.

I was successfully using PetSets with hostPath a couple of months ago, so I would be surprised if it no longer works with StatefulSets.

I'm as surprised as you are, @php-coder; I remember playing with hostPath and PetSets myself as well. But if none of the security constraints are the problem, I'm not quite sure what else it might be.

For the volume mount, did you create your secret and link this secret to your default service account with --for=mount?

I'm trying to verify whether anyuid has to be bound to your project's default service account to run the init container. This still needs more investigation and evidence.

Hi @dannyaxa,

I have only added Security Context Constraint to my service account with:

oc login -u system:admin 
oc create serviceaccount useroot
oc adm policy add-scc-to-user anyuid -z useroot

Maybe use local-volume: https://github.com/kubernetes-incubator/external-storage/tree/master/local-volume#option-3-baremetal-environments instead? It is designed to work with StatefulSet.

I have exactly the same issue with a much more recent OpenShift:

oc v3.7.2+282e43f
kubernetes v1.7.6+a08f5eeb62
openshift v3.7.2+26304a3-2

Trying to deploy zookeeper and kafka statefulsets:

$ oc get pod,pvc
NAME             READY     STATUS              RESTARTS   AGE
po/kafka-0       0/1       ContainerCreating   0          23m
po/zookeeper-0   0/1       ContainerCreating   0          23m
NAME                                STATUS    VOLUME    CAPACITY   ACCESSMODES   STORAGECLASS   AGE
pvc/kafka-storage-kafka-0           Pending                                                     23m
pvc/zookeeper-storage-zookeeper-0   Pending                                                     23m

$ oc get event | grep Failed
21m        21m         1         zookeeper-0                     Pod                                 Warning   FailedMount             kubelet, localhost            Unable to mount volumes for pod "zookeeper-0_myproject(d11b0889-2d0c-11e8-b21a-847beb0ecd6b)": timeout expired waiting for volumes to attach/mount for pod "myproject"/"zookeeper-0". list of unattached/unmounted volumes=[zookeeper-storage zookeeper-metrics-config default-token-ztxdx]
24m        39m         63        zookeeper-storage-zookeeper-0   PersistentVolumeClaim               Normal    FailedBinding           persistentvolume-controller   no persistent volumes available for this claim and no storage class is set

Following @Tiboris's suggestion, I have tried the adm policy commands: it didn't change anything.

pvc/kafka-storage-kafka-0 Pending

@ledroide, this looks like a different issue. Please open a new one and double check you have a default storage class. Please include oc get storageclass -o yaml and oc describe pvc there.

@Tiboris were you able to resolve this issue or find a workaround?

Hello @nicolasroger17 I will look into it ASAP :)

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
