What happened:
Installation of the official https://github.com/helm/charts/tree/master/stable/mongodb-replicaset Helm chart fails in kind due to a storage issue. Note that Persistent Volumes are used.
What you expected to happen:
The mongodb-replicaset Helm chart should work out of the box, as it does in production or in minikube (including the docker driver).
How to reproduce it (as minimally and precisely as possible):
Install & Init Helm
1) Download Helm
https://github.com/helm/helm/releases
2) Init Helm
kubectl --namespace kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
helm init --service-account tiller --upgrade
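To confirm Tiller actually came up before installing charts, a quick sanity check (assuming the default deployment name tiller-deploy that helm init creates):
$ kubectl -n kube-system rollout status deployment/tiller-deploy
$ helm version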
Deploy MongoDB-replicaset chart:
$ helm repo add stable https://kubernetes-charts.storage.googleapis.com/
$ helm install --name test stable/mongodb-replicaset
$ kubectl get pods -l release=test
NAME READY STATUS RESTARTS AGE
test-mongodb-replicaset-0 0/1 Init:CrashLoopBackOff 23 16h
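One way to see which init container is failing and why is to describe the pod and check its events (diagnostic commands only, using the pod name from above):
$ kubectl describe pod test-mongodb-replicaset-0
$ kubectl get events --field-selector involvedObject.name=test-mongodb-replicaset-0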
Reproducible everywhere: locally in kind, on a cloud server, in CircleCI and in TravisCI, so it's consistent and not tied to a specific environment.
Anything else we need to know?:
MongoDB fails with the following message: "Attempted to create a lock file on a read-only directory: /data/db", terminating.
$ kubectl logs test-mongodb-replicaset-0 --all-containers
[2019-06-16T14:32:42,611715811+00:00] [on-start.sh] Peers: test-mongodb-replicaset-0.test-mongodb-replicaset.default.svc.cluster.local
[2019-06-16T14:32:42,613603470+00:00] [on-start.sh] Starting a MongoDB replica
[2019-06-16T14:32:42,615620128+00:00] [on-start.sh] Waiting for MongoDB to be ready...
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] MongoDB starting : pid=34 port=27017 dbpath=/data/db 64-bit host=test-mongodb-replicaset-0
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] db version v3.6.13
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] git version: db3c76679b7a3d9b443a0e1b3e45ed02b88c539f
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] allocator: tcmalloc
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] modules: none
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] build environment:
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] distmod: ubuntu1604
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] distarch: x86_64
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] target_arch: x86_64
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] options: { config: "/data/configdb/mongod.conf", net: { bindIp: "0.0.0.0", port: 27017 }, replication: { replSet: "rs0" }, storage: { dbPath: "/data/db" } }
2019-06-16T14:32:42.632+0000 I STORAGE [initandlisten] exception in initAndListen: IllegalOperation: Attempted to create a lock file on a read-only directory: /data/db, terminating
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] now exiting
2019-06-16T14:32:42.632+0000 I CONTROL [initandlisten] shutting down with code:100
exception: connect failed
[2019-06-16T14:32:43,671556218+00:00] [on-start.sh] mongod shutdown unexpectedly
[2019-06-16T14:32:43,673334996+00:00] [on-start.sh] Shutting down MongoDB (force: true)...
MongoDB shell version v3.6.13
connecting to: mongodb://localhost:27017/admin?gssapiServiceName=mongodb
2019-06-16T14:32:43.711+0000 W NETWORK [thread1] Failed to connect to 127.0.0.1:27017, in(checking socket for error after poll), reason: Connection refused
2019-06-16T14:32:43.711+0000 E QUERY [thread1] Error: couldn't connect to server localhost:27017, connection attempt failed :
connect@src/mongo/shell/mongo.js:263:13
@(connect):1:6
exception: connect failed
[2019-06-16T14:32:43,715480656+00:00] [on-start.sh] db.shutdownServer() failed, sending the terminate signal
/init/on-start.sh: line 76: kill: (35) - No such process
, err: exit status 1
Error from server (BadRequest): container "mongodb-replicaset" in pod "test-mongodb-replicaset-0" is waiting to start: PodInitializing
Here is the /data/db (datadir) volumeClaimTemplate from the Helm chart: https://github.com/helm/charts/blob/0dc8468615aa0b23795e07178dfb330fe466e532/stable/mongodb-replicaset/templates/mongodb-statefulset.yaml#L339-L360. I don't see yet why this directory would be read-only specifically in the kind environment.
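To check whether the claim from that template is at least being provisioned and bound (assuming the template name datadir, so the generated claim should be datadir-test-mongodb-replicaset-0):
$ kubectl get pvc
$ kubectl describe pvc datadir-test-mongodb-replicaset-0
$ kubectl get pv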
If the same Helm chart is installed with persistence disabled, it deploys and works fine:
$ helm install --name test stable/mongodb-replicaset --set persistentVolume.enabled=false
Environment:
See also #118.
The out-of-the-box provisioner is the Kubernetes built-in hostpath provisioner, which is not great. We'd love a better one, but we need something multi-node aware and multi-architecture.
I don't use Helm, so I'm having some difficulty figuring out what the default settings are. Does this container run as a non-root user? The Kubernetes hostPath provisioner has some limitations (see: https://unofficial-kubernetes.readthedocs.io/en/latest/concepts/storage/volumes/#hostpath).
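One way to answer the non-root question without reading the chart is to inspect the securityContext that was actually rendered onto the StatefulSet (assuming the release name test used above):
$ kubectl get statefulset test-mongodb-replicaset -o jsonpath='{.spec.template.spec.securityContext}'
$ kubectl get pod test-mongodb-replicaset-0 -o jsonpath='{.spec.securityContext}'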
I've updated the first message to include the Helm installation steps.
Thanks for the pointers about the hostPath limitations; that explains everything now:
the directories created on the underlying hosts are only writable by root. You either need to run your process as root in a privileged container or modify the file permissions on the host to be able to write to a hostPath volume
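As a stop-gap, if the chart exposes the usual securityContext values (an assumption, check the chart's values.yaml), running the pod as root sidesteps the permission problem, at the cost of running the workload as root:
$ helm install --name test stable/mongodb-replicaset \
    --set securityContext.runAsUser=0,securityContext.fsGroup=0,securityContext.runAsNonRoot=false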
I believe rancher/local-path-provisioner doesn't have this problem (it modifies the directory permissions), but it isn't quite ready to ship with kind by default. Deploying it and replacing the default storage class with it might resolve your issue.
We may follow up on that or ship something similar...
I've tried rancher/local-path-provisioner and can confirm it works.
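For anyone else hitting this, a sketch of the swap (assuming the manifest still lives at deploy/local-path-storage.yaml in the rancher repo and that kind's default StorageClass is named standard):
$ kubectl apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
$ kubectl annotate storageclass standard storageclass.kubernetes.io/is-default-class-
$ kubectl annotate storageclass local-path storageclass.kubernetes.io/is-default-class=true --overwrite
$ helm install --name test stable/mongodb-replicaset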
As @yasker mentioned in https://github.com/kubernetes-sigs/kind/issues/118#issuecomment-467222985, Minikube's own hostpath provisioner sets 0777 on the volume directory to work around this.
It seems rancher/local-path-provisioner does a similar thing: https://github.com/rancher/local-path-provisioner/blob/562d008d04df0f08cfd17d9f2c6f3acd517a2e54/provisioner.go#L187-L191
That's another +1 for switching to rancher/local-path-provisioner or another provisioner, per the discussion in https://github.com/kubernetes-sigs/kind/issues/118.
Thanks!
If we were to switch, we'd first need to ensure:
Alternatively, it's not very complicated and doesn't do / support much; I think it would be possible to go even simpler and lighter if we need to, similar to what minikube did (though not literally, as it needs to handle multi-node, and also IIRC the minikube one was/is not conformant w.r.t. permissions).
@armab is it OK to close?
@aojea It's not OK to close. The issue wasn't fixed in core, and kind hasn't yet transitioned to a provisioner without the limitations described.
I think it's in the interest of both the community and the kind maintainers for the tool to work by default as close as possible to a native K8s configuration, so users can run/test their deployments in an automated CI manner and in a local environment.
I'm not sure the statement that native K8s has a local provisioner is correct; conformance testing doesn't need one, and people may want to use other provisioners like NFS or Ceph.
Also, you can always install your preferred local provisioner, metrics server, CNI plugin, etc. once kind finishes the installation; it's just an extra step in your CI.
However, it seems that not having it breaks a lot of use cases, so let's keep it open. Thank you for your feedback, @armab.
We _are_ using a native, built-in (to Kubernetes) storage driver here for the moment. This limitation is not one created by kind, and a custom storage class or an alternative default storage class can be used.
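For example, without touching the cluster default at all, a chart that exposes a storage class value can be pointed at an alternative class per release (assuming persistentVolume.storageClass is such a value in this chart):
$ helm install --name test stable/mongodb-replicaset --set persistentVolume.storageClass=local-path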
This particular issue is essentially a duplicate of https://github.com/kubernetes-sigs/kind/issues/118, with the addition of a desire to change the Kubernetes built-in storage driver.
I do think we want https://github.com/kubernetes-sigs/kind/issues/118, but that has its own issue.
Note that the minikube driver has historically _not_ been conformant and set more permissive storage than specified. I am not sure if that is still the case. IIRC it is also a custom driver, but single node specific.
This part:
the directories created on the underlying hosts are only writable by root. You either need to run your process as root in a privileged container or modify the file permissions on the host to be able to write to a hostPath volume
is a core Kubernetes issue. I don't think that's likely to change, but if we want it to, we need to track it against Kubernetes core. There's no way to change that here.
Please understand that we wholly intend to deliver a workaround for this, but:
From Kubernetes's POV, hostPath should require root, because hostPath is a privileged thing to give a pod access to. I don't think this is likely to change, but if it does, it will need to be tracked upstream.
Given these, I am going to ask that we use #118 to discuss / track what kind can do about storage, and the Kubernetes upstream issue tracker if you want to track issues there.