Machine: Creating new machine ends with `tls: bad certificate` error

Created on 24 Oct 2016 · 14 comments · Source: docker/machine

Problem / Error log

docker-machine create --driver digitalocean --digitalocean-access-token XXX --digitalocean-image=coreos-beta --digitalocean-ssh-user=core docker-sandbox
Running pre-create checks...
Creating machine...
(docker-sandbox) Creating SSH key...
(docker-sandbox) Creating Digital Ocean droplet...
(docker-sandbox) Waiting for IP address to be assigned to the Droplet...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with coreOS...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Error creating machine: Error checking the host: Error checking and/or regenerating the certs: There was an error validating certificates for host "45.55.128.252:2376": remote error: tls: bad certificate
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which will stop running containers.

docker-machine -v

docker-machine version 0.8.2, build e18a919

docker -v

Docker version 1.12.2, build bb80604

Most helpful comment

For anyone getting this error, it turned out to be a very obscure problem for me:

When docker-machine initially creates its ca.pem, it uses the USER (or USERNAME on Windows) environment variable as the issuer. In my case, docker-machine was running via my GitLab CI Multi Runner, which starts up automatically via an init script. Because init scripts don't have the USER env var defined, the cert's Issuer was "unknown" and any machines created with this cert would fail validation. Removing the cert and manually running the commands in a shell worked just fine, but because this is my CI environment that was not a sustainable solution. Setting the env var in the init script solved the problem for me, though I suppose you could also bake a valid CA cert into your CI image instead.

You can verify that this is your problem with openssl x509 -in ~/.docker/machine/certs/ca.pem -text -noout and checking the Issuer. It should be <username>.<hostname>.

All 14 comments

I am also facing this issue.
I have installed the same docker-machine version you mentioned on both my laptop (Mac) and a DigitalOcean droplet (running Ubuntu 16.04 with kernel 4.4.0-57-generic). I am able to create a(nother) DigitalOcean machine from the Mac, but I get the same bad certificate error when running the command from the droplet. Same command, different behaviour.

However, I can ssh into the machine using docker-machine ssh <machine_name> without any issue, so the certificates do not seem to be wrong.

Not sure if this is useful:
Droplet OpenSSH version: OpenSSH_7.2p2 Ubuntu-4ubuntu2.1, OpenSSL 1.0.2g 1 Mar 2016
Mac OpenSSH version: OpenSSH_7.2p2, LibreSSL 2.4.1
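
One quick check for this "SSH works but TLS fails" split is to hit the daemon's TLS port directly; a minimal sketch, reusing the OP's machine name and IP purely as placeholders:

# SSH uses its own per-machine key pair and works independently of the Docker TLS certs
docker-machine ssh docker-sandbox true

# Inspect the TLS handshake against the daemon itself
openssl s_client -connect 45.55.128.252:2376 \
  -CAfile ~/.docker/machine/certs/ca.pem \
  -cert ~/.docker/machine/certs/cert.pem \
  -key ~/.docker/machine/certs/key.pem < /dev/null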

Same error for me too. My full stack trace when creating the machine is:

(default) Creating VirtualBox VM...
(default) Creating SSH key...
(default) Starting the VM...
(default) Check network to re-create if needed...
(default) Waiting for an IP...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with boot2docker...
Copying certs to the local machine directory...
Error creating machine: Error running provisioning: error generating server cert: tls: failed to find any PEM data in certificate input
Looks like something went wrong in step ´Checking if machine default exists´... Press any key to continue...

Trying to regenerate the certificates also fails, with the following error:

$ docker-machine regenerate-certs default
Regenerate TLS machine certs?  Warning: this is irreversible. (y/n): y
Regenerating TLS certificates
Waiting for SSH to be available...
Detecting the provisioner...
Copying certs to the local machine directory...
error generating server cert: tls: failed to find any PEM data in certificate input

ssh to the machine is working, though.

$ docker-machine -v
docker-machine.exe version 0.8.2, build e18a919

ram@Ram MINGW64 ~
$ docker -v
Docker version 1.12.5, build 7392c3b

If you install docker-machine for the first time, that host does not yet have the self-signed CA that is used to generate your client certificate and one server certificate per machine you create later on. That CA is generated during machine creation if it does not exist yet. So if you try to create several machines in parallel (by means of a script), you'll generate as many self-signed (root) CAs as docker-machine create commands, all written to the same location, which seems to mess up the environment, e.g. by spreading different ca.pem files to the remote machines that do not match the final version, causing the cert.pem (host identity) to be signed by a former ca.pem which no longer exists, or whatever other abnormal situation.

To fix it, first of all you'll need to delete your existing self-signed CA. This can be done by removing the folder ~/.docker/machine/certs (NOTE: this will force the creation of a new self-signed CA for docker-machine to use, and your existing machines will then fail to connect to the daemon). This makes docker-machine generate valid certificates again. Then, for my use case, I create the first machine in the foreground and all the rest in parallel. That creates the single root self-signed CA in isolation, and it is then reused by the subsequent docker-machine create commands. It worked like a charm!

The reason I was still able to ssh into the hosts is that a separate pair of SSH keys is generated per host, and those were not affected by this.
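
For reference, a minimal sketch of that workaround (machine names and the DO_TOKEN variable are placeholders, assuming the digitalocean driver):

# Create the first machine in the foreground so the CA in
# ~/.docker/machine/certs is generated exactly once
docker-machine create --driver digitalocean \
  --digitalocean-access-token "$DO_TOKEN" node-1

# Only then create the remaining machines in parallel; they reuse
# the CA created above instead of racing to generate their own
for i in 2 3 4; do
  docker-machine create --driver digitalocean \
    --digitalocean-access-token "$DO_TOKEN" "node-$i" &
done
wait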

I have this issue as well. The fix proposed by @oscar-martin doesn't work for me. The ssh command does work.

My versions

# docker-machine -v
docker-machine version 0.9.0-rc2, build 7b19591
# docker -v
Docker version 1.12.5, build 7392c3b






# docker-machine create --driver digitalocean --digitalocean-access-token xxx --digitalocean-image=coreos-stable --digitalocean-ssh-user=core --digitalocean-region ams2 docker-test1
Running pre-create checks...
Creating machine...
(docker-test1) Creating SSH key...
(docker-test1) Creating Digital Ocean droplet...
(docker-test1) Waiting for IP address to be assigned to the Droplet...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with coreOS...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Error creating machine: Error checking the host: Error checking and/or regenerating the certs: There was an error validating certificates for host "146.185.x.x:2376": remote error: tls: bad certificate
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which might stop running containers.






# docker-machine regenerate-certs docker-test1
Regenerate TLS machine certs?  Warning: this is irreversible. (y/n): y
Regenerating TLS certificates
Waiting for SSH to be available...
Detecting the provisioner...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...






# docker-machine ls
NAME           ACTIVE   DRIVER         STATE     URL                         SWARM   DOCKER    ERRORS
docker-test1   -        digitalocean   Running   tcp://146.185.x.x:2376           Unknown   Unable to query docker version: Get https://146.185.x.x:2376/v1.15/version: remote error: tls: bad certificate

This is running from a Digital Ocean droplet with 'Ubuntu Docker 1.12.5 on 16.04'.

Hi @lekkerduidelijk

I think your self-signed CA could be messed up for some reason. Can you please mv the certs folder (~/.docker/machine/certs) to something else and try creating a machine again? Note that this will force the creation of a new self-signed CA for docker-machine to use, and your existing machines will then fail to connect to the daemon.

Thanks for your reply.

So I did:
~/.docker/machine# mv certs certsold

and then:

# docker-machine create --driver digitalocean --digitalocean-access-token xxx --digitalocean-image=coreos-stable --digitalocean-ssh-user=core docker-test4
Creating CA: /root/.docker/machine/certs/ca.pem
Creating client certificate: /root/.docker/machine/certs/cert.pem
Running pre-create checks...
Creating machine...
(docker-test4) Creating SSH key...
(docker-test4) Creating Digital Ocean droplet...
(docker-test4) Waiting for IP address to be assigned to the Droplet...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with coreOS...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Docker is up and running!
To see how to connect your Docker Client to the Docker Engine running on this virtual machine, run: docker-machine env docker-test4

And it seems up!

# docker-machine ls
NAME           ACTIVE   DRIVER         STATE     URL                        SWARM   DOCKER    ERRORS
docker-test4   -        digitalocean   Running   tcp://138.197.x.x:2376           v1.12.3

Thanks a lot @oscar-martin!

You're welcome. I edited my response to state the fix more clearly.

For anyone getting this error, it turned out to be a very obscure problem for me:

When docker-machine initially creates its ca.pem, it uses the USER (or USERNAME on Windows) environment variable as the issuer. In my case, docker-machine was running via my GitLab CI Multi Runner, which starts up automatically via an init script. Because init scripts don't have the USER env var defined, the cert's Issuer was "unknown" and any machines created with this cert would fail validation. Removing the cert and manually running the commands in a shell worked just fine, but because this is my CI environment that was not a sustainable solution. Setting the env var in the init script solved the problem for me, though I suppose you could also bake a valid CA cert into your CI image instead.

You can verify that this is your problem with openssl x509 -in ~/.docker/machine/certs/ca.pem -text -noout and checking the Issuer. It should be <username>.<hostname>.
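
As a rough sketch of both checks (the runner user name and the export line are just examples; adjust to whatever actually starts docker-machine on your system):

# Inspect the issuer of the CA that docker-machine generated
openssl x509 -in ~/.docker/machine/certs/ca.pem -noout -issuer

# If the issuer looks wrong (e.g. "unknown"), make sure USER is set
# before docker-machine runs, for instance near the top of the init script:
export USER=gitlab-runner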

For anyone stumbling on this issue (like I did): there's a good chance it's related to the following issue: https://github.com/docker/machine/issues/3634,
a race condition when generating the certificates while creating multiple machines simultaneously (only on the first docker-machine create run).

@rifelpet: I ran openssl x509 -in ~/.docker/machine/certs/ca.pem -text -noout as you mentioned, and it showed my certs were issued by <username>, not <username>.<hostname>. Does that also cause an issue?

@oscar-martin, I moved my certs folder to certs.bak but I'm still getting the cert error when creating a new machine:

$ docker-machine create --driver openstack --openstack-ssh-user ubuntu --openstack-keypair-name "Key" --openstack-private-key-file ~/.ssh/id_rsa --openstack-flavor-id 50 --openstack-image-name "Ubuntu-16.04" staging-manager3
Creating CA: C:\Users\tanner\.docker\machine\certs\ca.pem
Creating client certificate: C:\Users\tanner\.docker\machine\certs\cert.pem
Running pre-create checks...
Creating machine...
(staging-manager3) Creating machine...
Waiting for machine to be running, this may take a few minutes...
Detecting operating system of created instance...
Waiting for SSH to be available...
Detecting the provisioner...
Provisioning with ubuntu(systemd)...
Installing Docker...
Copying certs to the local machine directory...
Copying certs to the remote machine...
Setting Docker configuration on the remote daemon...
Checking connection to Docker...
Error creating machine: Error checking the host: Error checking and/or regenerating the certs: There was an error validating certificates for host "ip:2376": dial tcp ip:2376: i/o timeout
You can attempt to regenerate them using 'docker-machine regenerate-certs [name]'.
Be advised that this will trigger a Docker daemon restart which might stop running containers.

I documented it more in detail here: https://github.com/docker/machine/issues/3829

I figured out why the docker-machine TLS problems were happening to me: my hosting service's OpenStack security group rules required that I open port 2376 for TCP ingress.

Rookie mistake, sorry for the extra noise!
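
For anyone hitting the same thing, a rough sketch of that rule with the openstack CLI (assuming the machines live in the default security group; adjust the group name to yours):

# Allow inbound TCP on the Docker daemon TLS port
openstack security group rule create default \
  --protocol tcp --dst-port 2376 --ingress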

Just as an update: on Windows 10 Home, I needed to delete everything in ~/.docker/machine/certs/* and recreate the machine using
docker-machine rm default
docker-machine create default

I've found the reason the TLS error happened to me:
The clocks of the VM and my Windows box were not synchronized; there was about a 10-hour difference.
After docker-machine generates a cert on my Windows box, it copies it to the VM. But from the VM's point of view, that cert only becomes valid 10 hours later, at the time the Windows box generated it.
That's why the TLS error happens here: the Docker engine thinks the cert is not valid yet.
The solution is simple: just install ntp to synchronize the time of the VM and the Windows box.

You can check this with:
journalctl -xeu docker
and you will see a line like this:
dockerd[362]: http: TLS handshake error from 192.168.1.101:64563: tls: failed to verify client's certificate: x509: certificate has expired or is not yet valid

Hope this helps someone facing the same situation.
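
A quick way to spot that skew (assuming the machine is named default):

# Compare host and VM clocks; a large difference means a freshly
# generated cert is "not valid yet" from the VM's point of view
date -u
docker-machine ssh default date -u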

Based on @zhulp's answer, after running:
docker-machine ssh default "sudo date -u $(date -u +%m%d%H%M%Y)"
docker-machine regenerate-certs default
it works.
