Rancher: convoy-gluster service won't run on docker 1.11

Created on 14 Apr 2016 · 49 Comments · Source: rancher/rancher

Version - v1.0.1-rc7
Docker version 1.11

Steps to Reproduce:

  1. Go to DigitalOcean and create a droplet
  2. Follow the Custom directions to add the host to the Rancher UI
  3. Go through the UI to add a couple more DO hosts
  4. Set up the GlusterFS catalog entry
  5. Set up the Convoy GlusterFS catalog entry

Results: The convoy-gluster service runs on all hosts except the custom one, which fails with this error:
4/14/2016 1:38:19 PM Waiting for metadata
4/14/2016 1:38:19 PM time="2016-04-14T20:38:19Z" level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory"

Expected: Service to run

area/catalog area/storage kind/bug

All 49 comments

Got the same issue,

I tried with different host OS (Ubuntu and Debian) as well as the DigitalOcean Docker Image, all fail at starting convoy-gluster. All custom DO hosts fail.

@tfiduccia @Alexander912 what version of docker do you have? 1.11?

Thanks @alena1108... I was having the same problem with convoy-nfs, and once I downgraded the hosts to Docker 1.10.3 it started working as expected. Thanks again.

@tfiduccia @kjohns3 yea, Rancher officially supports 1.10.3 for v1.0.0 and v1.0.1.

@deniseschannon ^^

I also ran into this on convoy-nfs with Docker 1.11.0 on Ubuntu 14.04. I can confirm that pinning to 1.10.3 fixed the issue. For future googlers, you can use the following commands with apt:

sudo apt-get install docker-engine=1.10.3-0~trusty
sudo apt-mark hold docker-engine

Problem solved, 1.10.3 works :)

And while investigating my setup yesterday, I guess I've detected the problem:

convoy_gluster containers are looking for /host/var/run/docker/execdriver/native, which is located at /var/run/docker/execdriver/native on the host.

Docker versions older than 1.11 use that directory to store data for running containers. Docker 1.11 also stores such data, but the driver changed, so there is no /var/run/docker/execdriver/native directory anymore; it is now called /var/run/docker/libcontainerd.
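A quick way to check which layout a given host has is a small shell helper like the sketch below. It only tests for the two directories named above. The function name `docker_state_dir` and its optional root-prefix argument are made up for illustration and are not part of Docker, Rancher, or convoy-agent.

```shell
#!/bin/sh
# Hypothetical helper: report which Docker runtime state directory exists.
# Optional first argument is a root prefix, e.g. "/host" when the host
# filesystem is bind-mounted there; with no argument the current
# filesystem is inspected directly.
docker_state_dir() {
    root="${1:-}"
    # Pre-1.11 layout first, then the 1.11 layout described in this thread.
    for dir in /var/run/docker/execdriver/native /var/run/docker/libcontainerd; do
        if [ -d "${root}${dir}" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    echo "no known Docker state directory under ${root:-/}" >&2
    return 1
}
```

On a host you would call `docker_state_dir` with no argument; inside a container that sees the host filesystem under /host (as the path in the error message suggests), call `docker_state_dir /host`.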

thanks for the tip @wazoo :+1:

@Alexander912 was right. I have 1.10.3 and still got the problem. I created this directory (/var/run/docker/execdriver/native), and now it works.

Works fine for me on Docker 1.10.3, fails as described above on 1.11. If I create the (empty) dir as suggested by ramden, the container fails to start with the following error:

level=fatal msg="Failed to find state.json"

What can I do to help fix this issue? I successfully built convoy-agent from source, tag v0.3.0 works fine with Docker 1.10.3 (the latest Rancher build uses that tag).

When using the latest code from the convoy-agent repo on Docker 1.10.3, I'm getting

level=error msg="BUG: glusterfs.GlusterFSVolume doesn't have required field Name"

On Docker 1.11, it fails exactly like the original container.

Looks like share-mnt fetched in this line is causing the error:

root@21cc390e3f78:/# share-mnt
FATA[0000] open /host/var/run/docker/execdriver/native: no such file or directory

Here it is:

driverRun          = "/var/run/docker/execdriver/native"
containerDriverRun = "/host" + driverRun

@ibuildthecloud IIUC share-mnt was a hack around a Docker limitation which is gone since 1.10, any plans (or WIP) to rewrite the parts of convoy-agent which use share-mnt?

I'm aware that Rancher isn't supported on 1.11, but 1.11 has been out there for a month now and downgrading to 1.10 is a PITA on some platforms, most notably on Docker Toolbox on a Mac (May Online Meetup is about "Rancher on a Laptop", you know ;))

Just hit this wall....Had to downgrade, which was kind of a pain.

Same here. Any news on an upgrade to convoy?

Not surprisingly, this applies as well to the convoy-nfs catalog template which basically is the rancher/convoy-agent:v0.7.0 image with just a different command volume-agent-nfs.

Looking forward to 1.1.0 to be released.

@fuhbar I'm quite sure 1.1.0 won't fix this.

yea, Rancher officially supports 1.10.3 for v1.0.0 and v1.0.1.

@alena1108 when I add hosts in Rancher (with the Google driver), the instances are created with Docker 1.11.1 on them! Why does Rancher install this version if it only supports 1.10.3?

Then I get the error level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory" as reported here...

How can I tell Rancher to install version 1.10.3?

@cjellick any update on replacing the share-mnt hack with something that works on newer Docker versions? Do you need help with this, is there any work in progress?

I also tried updating the Convoy Gluster catalog entry to use the 0.7.0 client, but this made no difference.

I see the latest Rancher release has support for Docker 1.11, but not a fix for this issue.

Any ETA on a fix?

We've just hit this issue as well, a real PITA. If 1.11 is supported, then surely this bug should have been fixed by now?

There is some WIP related to 1.11 at https://github.com/rancher/runc/tree/jazzhands from @ibuildthecloud, but it doesn't work for me yet (/var/run/runc/ is missing).

I tested it today with fresh, upgraded Ubuntu 16.04 on OVH Public Cloud with Docker:

Client:
 Version:      1.11.2
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   b9f10c9
 Built:        Wed Jun  1 22:00:43 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.11.2
 API version:  1.23
 Go version:   go1.5.4
 Git commit:   b9f10c9
 Built:        Wed Jun  1 22:00:43 2016
 OS/Arch:      linux/amd64

I added the host in the Rancher interface and it can't activate properly. There are a lot of these errors in the convoy-gluster container; it's probably a matter of network connection.

6/25/2016 7:10:39 PM time="2016-06-25T17:10:39Z" level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory"
6/25/2016 7:10:54 PM Waiting for metadata
6/25/2016 7:10:54 PM time="2016-06-25T17:10:54Z" level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory"
6/25/2016 7:11:10 PM Waiting for metadata
6/25/2016 7:11:10 PM time="2016-06-25T17:11:10Z" level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory"
6/25/2016 7:11:29 PM Waiting for metadata
6/25/2016 7:11:29 PM time="2016-06-25T17:11:29Z" level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory"
6/25/2016 7:11:44 PM Waiting for metadata
6/25/2016 7:11:44 PM time="2016-06-25T17:11:44Z" level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory"
6/25/2016 7:11:54 PM Waiting for metadata
6/25/2016 7:11:54 PM time="2016-06-25T17:11:54Z" level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory"
6/25/2016 7:12:09 PM Waiting for metadata
6/25/2016 7:12:09 PM time="2016-06-25T17:12:09Z" level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory"

Neither Convoy for Gluster nor Convoy for NFS works on Docker 1.9, 1.10, or 1.11. I have tried all of these versions and wasn't able to get either working. Please fix this bug.

Any more details on how they are not working for you, @biolounge? Specifics such as OS flavor and version, and logs of the failures, would be helpful.

Particularly for convoy-nfs, we verified/certified it worked on docker 1.10.x and are currently working on the fix to allow it to work on 1.11+.

@biolounge, convoy-gluster definitely does work on 1.10. Make sure 1.11 is not part of your setup (especially if you're using Docker Toolbox).
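To rule out a stray 1.11 host, a version check along these lines may help. It is only a sketch based on what this thread reports (the execdriver directory exists before 1.11 and is gone from 1.11 on). `uses_execdriver_layout` is a hypothetical helper name, and it assumes a plain MAJOR.MINOR.PATCH version string.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the given Docker version is
# older than 1.11, i.e. still expected to have
# /var/run/docker/execdriver/native according to this thread.
# Assumes a numeric MAJOR.MINOR[.PATCH] string such as "1.10.3".
uses_execdriver_layout() {
    version="$1"
    major="${version%%.*}"   # text before the first dot
    rest="${version#*.}"     # text after the first dot
    minor="${rest%%.*}"      # text between the first and second dot
    [ "$major" -lt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -lt 11 ]; }
}
```

One possible use on each host, assuming the installed Docker client supports `docker version --format`:

uses_execdriver_layout "$(docker version --format '{{.Server.Version}}')" || echo "Docker 1.11 or newer on this host"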

@SystemZ, I explained above which part of code causes the error you saw in your container, so I don't get why you think it's a matter of network connection.

@rsippl Yea, my fault, thank you for pointing that out.

I think there is no 1.10.3 release for xenial (16.04 LTS), would it be possible to add it?
I'm using the binaries as a workaround atm

edit: actually, why am I asking this here? :-1:
but: I think convoy-nfs got an update for docker 1.11, I don't see anything here happening soon, or is there a dev branch I can checkout and test with?

I can bring up the GlusterFS and Convoy GlusterFS stacks, but I'm unable to start a container with volume_driver: convoy.

Does anyone else have the same issue?

how did you name convoy? Try convoy-glusterfs as a volume driver

I use the default name of convoy. Here is my compose file:

`
ghost:
  image: ghost:0.7.1
  ports:
    - 80:2368
  volume_driver: convoy
  volumes:
    - my_vol:/var/lib/ghost
  labels:
    io.rancher.scheduler.affinity:host_label: server.scraping.jobs.ghost=true
`

You should use
volume_driver: convoy-gluster

@clicman Just tried it. The result was the same: the container stays in the starting state all the time.

I followed the instructions here: https://www.youtube.com/watch?v=LESPaJ9_DHE

@tranquochuy Just to be clear: I have convoy-gluster on 1.10 and it works with volume_driver: convoy-gluster
So I think the manual on YouTube is outdated.

@clicman many thanks. I got it up. Looks like I messed up the driver name: I used convoy-glusterfs instead of convoy-gluster.

Any update on this?

I would like an update on this too! Has anyone gotten it to work?

Rancher 1.1.2 on Docker 1.12.1 gives

Waiting for metadata
time="2016-09-05T13:02:20Z" level=fatal msg="open /host/var/run/docker/execdriver/native: no such file or directory"

Is anybody working on this issue?

Crickets.... any update? This is a very important feature

convoy-gluster issue. Can we bump this out of 1.2.0?

FYI, I gave up on Rancher a couple of weeks ago because of how this issue was (not) dealt with.

I think Gluster support was deprecated entirely. :(

FYI, I gave up on Rancher a couple of weeks ago because of how this issue was (not) dealt with.

Me too!
Building features is good, solving show stoppers is better.

So what is a better alternative than Rancher right now?

We ended up just installing GlusterFS directly and exposing the volumes to all containers. It wasn't difficult.

So you run GlusterFS directly on each host, like you would if you were setting up traditional NFS?

We have removed GlusterFS and Convoy Gluster from the catalog as users were expecting a robust tool as an alternative persistent storage for Docker volumes. However, due to lack of active maintenance, we cannot recommend this solution going forward.

Instead, we recommend and certify Convoy NFS, which is actively maintained by Rancher. As a user, you can get GlusterFS support directly from Red Hat and use it in Rancher using Rancher's NFS plugin.

Due to these changes regarding glusterfs and convoy gluster, we will not be addressing this bug for 1.2.0

I too had to give up on Rancher because the environment seemed to decay and get into an unrepairable state. Gave up after rebuilding my cluster 3 times.

Moved to Kubernetes and it has been rock-solid. Cluster health graphs (e.g. disk, memory, CPU) have been pleasantly stable over MONTHS.
Kubernetes was a pain to set up though, compared to Rancher.

I'll probably try Rancher again in 12 months and see how it's coming along.

I hope the Rancher engineers forward this feedback to their upper-management. The company is releasing code that isn't yet stable; they are moving too fast. They need to spend time getting the brittleness out.

Closing issue as Convoy-Gluster is no longer being maintained.
