Machine: docker daemon fails to start on machine created on aws

Created on 17 Dec 2016  ·  20 Comments  ·  Source: docker/machine

Summary: starting the Docker daemon fails with "Error starting daemon: error initializing graphdriver: driver not supported".

Environment:
docker-machine version 0.8.2, build e18a919

Below are the steps I carried out:

I created a machine on aws using the command
docker-machine create --driver amazonec2 --amazonec2-region eu-central-1 docker-host-aws-3
resulting in this output:

Running pre-create checks...
Creating machine...
(docker-host-aws-3) Launching instance...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with ubuntu(systemd)...
Installing Docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Error creating machine: Error checking the host: Error checking and/or regenerating the certs: There was an error validating certificates for host "35.156.199.2:2376": dial tcp 35.156.199.2:2376: getsockopt: connection refused
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which will stop running containers.

Running docker-machine regenerate-certs docker-host-aws-3 results in:

Regenerating TLS certificates
Waiting for SSH to be available...
Detecting the provisioner...
Installing Docker...
Error getting SSH command to check if the daemon is up: Something went wrong running an SSH command!
command : sudo docker version
err     : exit status 1
output  : Client:
 Version:      1.12.5
 API version:  1.24
 Go version:   go1.6.4
 Git commit:   7392c3b
 Built:        Fri Dec 16 02:36:42 2016
 OS/Arch:      linux/amd64
Cannot connect to the Docker daemon. Is the docker daemon running on this host?

I connected to the machine using
docker-machine ssh docker-host-aws-3
and tried to start the Docker daemon, with this result:

ubuntu@docker-host-aws-3:~$ sudo service docker start
ubuntu@docker-host-aws-3:~$ sudo service docker status
● docker.service
   Loaded: loaded (/etc/systemd/system/docker.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sat 2016-12-17 15:46:13 UTC; 6s ago
  Process: 8401 ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=amazonec2 (code=exited, status=1/FAILURE)
 Main PID: 8401 (code=exited, status=1/FAILURE)

Dec 17 15:46:12 docker-host-aws-3 systemd[1]: Started docker.service.
Dec 17 15:46:12 docker-host-aws-3 docker[8401]: time="2016-12-17T15:46:12.697230388Z" level=info msg="libcontainerd: new containerd process, pid: 8407"
Dec 17 15:46:13 docker-host-aws-3 docker[8401]: time="2016-12-17T15:46:13.730095368Z" level=fatal msg="Error starting daemon: error initializing graphdriver: driver not supported"
Dec 17 15:46:13 docker-host-aws-3 systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Dec 17 15:46:13 docker-host-aws-3 systemd[1]: docker.service: Unit entered failed state.
Dec 17 15:46:13 docker-host-aws-3 systemd[1]: docker.service: Failed with result 'exit-code'.

When I last created a new machine on AWS (around 1 December 2016), everything worked.
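
The unit file above starts the daemon with --storage-driver aufs, and the fatal message is "driver not supported", so one thing worth checking on the instance is whether the running kernel offers aufs at all. The following is only a sketch of such a check: `has_fs` is a hypothetical helper name, and it assumes a Linux host where /proc/filesystems lists the filesystems currently registered with the kernel. A filesystem built as a module that has not been loaded yet will not show up there, so a negative result is a hint rather than proof.

```shell
# has_fs NAME [FILE]: succeed if filesystem NAME is listed in FILE
# (FILE defaults to /proc/filesystems; the parameter exists only to ease testing).
has_fs() {
    name="$1"
    file="${2:-/proc/filesystems}"
    [ -r "$file" ] || return 2
    # Each line is either "nodev<TAB>name" or "<TAB>name"; the name is the last field.
    awk -v n="$name" '$NF == n { found = 1 } END { exit !found }' "$file"
}

if has_fs aufs; then
    echo "aufs is registered with this kernel"
else
    echo "aufs is not listed; a daemon started with --storage-driver aufs may fail"
fi
```

Run it inside the instance, for example after docker-machine ssh docker-host-aws-3.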

All 20 comments

I had the same issue, which was partially fixed by upgrading to the rc version. I'm now having an issue with connections being very slow, see #3931.

I am having the same issue, as well as a teammate. Things used to work nicely until a few weeks ago, but now we are both unable to create and start machines in AWS with docker-machine.

Here is the information I have so far.

On my local workstation, I updated Docker today in an attempt to fix the issue:

docker version
Client:
Version: 1.12.5
API version: 1.24
Go version: go1.6.4
Git commit: 7392c3b
Built: Fri Dec 16 02:36:42 2016
OS/Arch: linux/amd64

Server:
Version: 1.12.5
API version: 1.24
Go version: go1.6.4
Git commit: 7392c3b
Built: Fri Dec 16 02:36:42 2016
OS/Arch: linux/amd64

My docker-machine version is:
docker-machine version 0.7.0, build a650a40

Whenever I try to create a machine, I get:

Creating machine in EC2: devtest
Running pre-create checks...
Creating machine...
(devtest) Launching instance...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with ubuntu(systemd)...
Installing Docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Error creating machine: Error running provisioning: Unable to verify the Docker daemon is listening: Maximum number of retries (10) exceeded

The output of docker-machine ls is:

devtest - amazonec2 Running tcp://:2376 Unknown Unable to query docker version: Cannot connect to the docker engine endpoint

I can ssh into the machine with:
docker-machine ssh devtest
Welcome to Ubuntu 15.10 (GNU/Linux 4.2.0-18-generic x86_64)

Your Ubuntu release is not supported anymore.
For upgrade information, please visit:
http://www.ubuntu.com/releaseendoflife

New release '16.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

ubuntu@devtest:~$


If I run sudo service docker restart and then check the status, I see these logs:

sudo service docker status -l
● docker.service
Loaded: loaded (/etc/systemd/system/docker.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Mon 2016-12-19 14:51:16 UTC; 7s ago
Process: 4788 ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/server.pem --tlskey /etc/docker/server-key.pem --label provider=amazonec2 (code=exited, status=1/FAILURE)
Main PID: 4788 (code=exited, status=1/FAILURE)

Dec 19 14:51:15 devtest systemd[1]: Stopped docker.service.
Dec 19 14:51:15 devtest systemd[1]: Started docker.service.
Dec 19 14:51:15 devtest docker[4788]: time="2016-12-19T14:51:15.760133024Z" level=info msg="libcontainerd: new containerd process, pid: 4794"
Dec 19 14:51:16 devtest docker[4788]: time="2016-12-19T14:51:16.793374140Z" level=fatal msg="Error starting daemon: error initializi...pported"
Dec 19 14:51:16 devtest systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Dec 19 14:51:16 devtest systemd[1]: docker.service: Unit entered failed state.
Dec 19 14:51:16 devtest systemd[1]: docker.service: Failed with result 'exit-code'.
Hint: Some lines were ellipsized, use -l to show in full.

Any clues?

Just to add some more: the full process works well with a virtualbox machine. It only fails with AWS.

Another data point: the issue seems fixed for me after updating the docker-machine to https://github.com/docker/machine/releases/tag/v0.9.0-rc2

I will double-check with my teammate.

All set for my teammate after upgrading both docker and docker-machine. Thanks.

Hi fredbt, thanks for the info about your experience with upgrading. But what about #3931? Do you run into the same issue when using v0.9.0-rc2?

Answered in the thread.

I had the same issue and was able to fix it with v0.8.2 by changing the AMI. While the docker-machine documentation says that docker-machine defaults to the latest 16.04 LTS AMI, it seems to be using 15.10. I explicitly set the AMI to the latest 16.04 LTS version for my region and it worked. I wonder if v0.8.2 is misconfiguring the startup for Ubuntu 15.10.

Doesn't work:
docker-machine create --driver amazonec2 --amazonec2-region us-west-2

Works:
docker-machine create --driver amazonec2 --amazonec2-region us-west-2 --amazonec2-ami ami-b4a015d4
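
To confirm which Ubuntu release an instance actually booted, the VERSION_ID field of /etc/os-release can be read. This is a minimal sketch, not part of docker-machine: `ubuntu_release` is a hypothetical helper, and it assumes the image ships a standard os-release file.

```shell
# ubuntu_release [FILE]: print VERSION_ID from os-release data.
# Reads FILE when given, otherwise standard input.
ubuntu_release() {
    # Matches both VERSION_ID="15.10" and VERSION_ID=15.10.
    sed -n 's/^VERSION_ID="\{0,1\}\([^"]*\)"\{0,1\}$/\1/p' "$@"
}

if [ -r /etc/os-release ]; then
    ubuntu_release /etc/os-release
fi
```

From the workstation, the same helper can read the remote file over SSH, for example: docker-machine ssh devtest cat /etc/os-release | ubuntu_release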

What works for me:

Use docker-machine v0.9.0 RC2
(in region eu-central-1, v0.9.0 RC2 uses AMI ID ami-597c8236),

or use v0.8.2 and pass that AMI ID explicitly:
docker-machine create --driver amazonec2 --amazonec2-region eu-central-1 --amazonec2-ami ami-597c8236 name-of-machine
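
The two AMI IDs reported in this thread can be collected into a small lookup so the create command always carries an explicit --amazonec2-ami. This is only a sketch: `ami_for_region` and `create_cmd` are hypothetical helper names, the IDs are the ones quoted above and may no longer exist, and the function prints the command instead of running it.

```shell
# ami_for_region REGION: print the 16.04 AMI ID reported in this thread for REGION.
ami_for_region() {
    case "$1" in
        us-west-2)    echo "ami-b4a015d4" ;;
        eu-central-1) echo "ami-597c8236" ;;
        *)            return 1 ;;
    esac
}

# create_cmd REGION NAME: print (do not run) a create command with an explicit AMI.
create_cmd() {
    ami=$(ami_for_region "$1") || { echo "no AMI recorded for region $1" >&2; return 1; }
    echo "docker-machine create --driver amazonec2 --amazonec2-region $1 --amazonec2-ami $ami $2"
}

create_cmd eu-central-1 name-of-machine
```

For any other region, the matching 16.04 LTS AMI ID has to be looked up first; the helper deliberately fails rather than guessing.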

I think that documentation got published prematurely. The change for using 16.04 by default will land in 0.9.0 (next release, currently in release candidate mode).

It looks like the exact same issue came back (Unable to query docker version: Cannot connect to the docker engine endpoint), where the only change is an upgrade of the host, which likely changed the default AMI again.
This time, though, using the default AMI from the documentation results in a timeout.

Debugging attempts:
$ docker-machine create --driver amazonec2 --amazonec2-region eu-west-1 test
Running pre-create checks...
Creating machine...
(test) Launching instance...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with ubuntu(systemd)...
Installing Docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Error creating machine: Error running provisioning: ssh command error:
command : sudo systemctl -f start docker
err     : exit status 1
output  : Job for docker.service failed because the control process exited with error code.
See "systemctl status docker.service" and "journalctl -xe" for details.

```
$ systemctl status docker.service
● docker.service - Docker Application Container Engine
Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/docker.service.d
└─10-machine.conf
Active: inactive (dead) (Result: exit-code) since Sat 2017-07-01 14:29:25 UTC; 3min 57s ago
Docs: https://docs.docker.com
Process: 5203 ExecStart=/usr/bin/docker daemon -H tcp://0.0.0.0:2376 -H unix:///var/run/docker.sock --storage-driver aufs --tlsverify --tlscacert /etc/docker/ca.pem --tlscert /etc/docker/
Main PID: 5203 (code=exited, status=1/FAILURE)

Jul 01 14:29:25 test systemd[1]: docker.service: Main process exited, code=exited, status=1/FAILURE
Jul 01 14:29:25 test systemd[1]: Failed to start Docker Application Container Engine.
Jul 01 14:29:25 test systemd[1]: docker.service: Unit entered failed state.
Jul 01 14:29:25 test systemd[1]: docker.service: Failed with result 'exit-code'.
Jul 01 14:29:25 test systemd[1]: docker.service: Service hold-off time over, scheduling restart.
Jul 01 14:29:25 test systemd[1]: Stopped Docker Application Container Engine.
Jul 01 14:29:25 test systemd[1]: docker.service: Start request repeated too quickly.
Jul 01 14:29:25 test systemd[1]: Failed to start Docker Application Container Engine.
```

```
$ journalctl -xe

-- Unit UNIT has finished starting up.

-- The start-up result is done.
Jul 01 14:32:28 test systemd[5213]: Reached target Basic System.
-- Subject: Unit UNIT has finished start-up
-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit UNIT has finished starting up.

-- The start-up result is done.
Jul 01 14:32:28 test systemd[5213]: Reached target Default.
-- Subject: Unit UNIT has finished start-up
-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit UNIT has finished starting up.

-- The start-up result is done.
Jul 01 14:32:28 test systemd[5213]: Startup finished in 21ms.
-- Subject: System start-up is now complete
-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- All system services necessary queued for starting at boot have been
-- successfully started. Note that this does not mean that the machine is

-- now idle as services might still be busy with completing start-up.

-- Kernel start-up required KERNEL_USEC microseconds.

-- Initial RAM disk start-up required INITRD_USEC microseconds.

-- Userspace start-up required 21232 microseconds.
Jul 01 14:32:28 test systemd[1]: Started User Manager for UID 1000.
-- Subject: Unit user@1000.service has finished start-up
-- Defined-By: systemd

-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel

-- Unit user@1000.service has finished starting up.

-- The start-up result is done.
```
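
The journalctl -xe output above only shows the tail of the whole journal and contains nothing from the Docker unit. Restricting the journal to that unit and filtering for fatal entries is usually quicker. The sketch below assumes systemd with a unit named docker.service, as in the status output above; `fatal_msgs` is a hypothetical helper, and the two sample lines are copied from the first log in this thread.

```shell
# fatal_msgs: read daemon log lines on stdin and print the msg="..." text
# of every level=fatal entry.
fatal_msgs() {
    sed -n 's/.*level=fatal msg="\(.*\)"$/\1/p'
}

# Demonstration with two log lines taken from the report at the top of the thread.
printf '%s\n' \
    'Dec 17 15:46:12 docker-host-aws-3 docker[8401]: time="2016-12-17T15:46:12.697230388Z" level=info msg="libcontainerd: new containerd process, pid: 8407"' \
    'Dec 17 15:46:13 docker-host-aws-3 docker[8401]: time="2016-12-17T15:46:13.730095368Z" level=fatal msg="Error starting daemon: error initializing graphdriver: driver not supported"' \
    | fatal_msgs
# prints: Error starting daemon: error initializing graphdriver: driver not supported
```

On the instance itself, the input would come from the journal: sudo journalctl -u docker.service --no-pager | fatal_msgs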

The problem here is the new Docker version. Have a look at this post:
https://forums.docker.com/t/docker-machine-provisioned-aws-instance-can-not-start-docker-engine/34200

It solved my problems with docker-machine.

As pointed out in the previous link, a temporary workaround is:

docker-machine create --driver amazonec2 --engine-install-url=https://web.archive.org/web/20170623081500/https://get.docker.com
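
Since the replies in this thread report the same workaround with other drivers, it can be wrapped so that only the driver and machine name vary. This is a sketch under assumptions: `create_with_pinned_engine` and the DRY_RUN switch are hypothetical and not docker-machine features, and driver-specific options such as region or credentials are left out.

```shell
# Archived install script URL quoted in the comment above.
ENGINE_INSTALL_URL="https://web.archive.org/web/20170623081500/https://get.docker.com"

# create_with_pinned_engine DRIVER NAME
# Prints the command when DRY_RUN is non-empty; otherwise runs docker-machine.
create_with_pinned_engine() {
    set -- docker-machine create --driver "$1" --engine-install-url="$ENGINE_INSTALL_URL" "$2"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Dry run only; clear DRY_RUN to actually create the machine.
DRY_RUN=1
create_with_pinned_engine amazonec2 my-machine
```

Keeping the URL in one variable makes it easy to drop the pin again once a docker-machine release works with the current install script.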

Thanks @juliaaano, this also works with Google Cloud Platform (same issue with docker-machine).

+1 same issue, workaround works, thanks

This also affects the Azure driver. The above comment works as well.

+1 same issue, workaround works with DigitalOcean, thanks a lot!

+1, this helped me get started on AWS, but only with t2 instances. It still fails on m3 instances with the same configuration. Any help would be appreciated.
