While creating a Hyper-V VM on a Windows 10 machine, docker-machine gets stuck creating an SSH session to that VM.
I have enclosed a file (createCommand.txt, https://github.com/docker/machine/files/908025/createCommand.txt) with the messages from executing the following command:
docker-machine -debug create -d hyperv --hyperv-virtual-switch ExternalWireless worker1
Everything looks fine until the script tries to open an SSH session to the newly created VM. Each attempt looks like this:
Waiting for SSH to be available...
Getting to WaitForSSH function...
(worker1) Calling .GetSSHHostname
(worker1) DBG | [executing ==>] : C:\WINDOWS\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive ( Get-VM worker1 ).state
(worker1) DBG | [stdout =====>] : Running
(worker1) DBG |
(worker1) DBG | [stderr =====>] :
(worker1) DBG | [executing ==>] : C:\WINDOWS\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive (( Get-VM worker1 ).networkadapters[0]).ipaddresses[0]
(worker1) DBG | [stdout =====>] : 192.168.0.108
(worker1) DBG |
(worker1) DBG | [stderr =====>] :
(worker1) Calling .GetSSHPort
(worker1) Calling .GetSSHKeyPath
(worker1) Calling .GetSSHKeyPath
(worker1) Calling .GetSSHUsername
Using SSH client type: external
&{[-F /dev/null -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none [email protected] -o IdentitiesOnly=yes -i C:\Users\wojtek\.docker\machine\machines\worker1\id_rsa -p 22] C:\Program Files\OpenSSH\ssh.exe <nil>}
About to run SSH command:
exit 0
SSH cmd err, output: exit status 255:
Error getting ssh command 'exit 0' : ssh command error:
command : exit 0
err : exit status 255
output :
and after 60 such attempts the script quits with the message
Error creating machine: Error detecting OS: Too many retries waiting for SSH to be available. Last error: Maximum number of retries (60) exceeded
notifying bugsnag: [Error creating machine: Error detecting OS: Too many retries waiting for SSH to be available. Last error: Maximum number of retries (60) exceeded]
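The messages above suggest a bounded retry loop: the same `exit 0` probe is run over SSH until it succeeds or 60 attempts have been used up. Here is a minimal sketch of that pattern, not docker-machine's actual code. The function name `wait_for` and the 3 second delay are my own assumptions; only the limit of 60 and the wording of the final error come from the log.

```python
import time
from typing import Callable


def wait_for(check: Callable[[], bool], max_retries: int = 60, delay: float = 3.0) -> int:
    """Call check() until it returns True and return the attempt number that succeeded.

    max_retries mirrors the limit of 60 seen in the log; delay is an assumed value.
    """
    for attempt in range(1, max_retries + 1):
        if check():
            return attempt
        time.sleep(delay)  # pause before the next probe
    # Same wording as the "Last error" part of the docker-machine message.
    raise TimeoutError(f"Maximum number of retries ({max_retries}) exceeded")
```

The point is that the final error only says the probe never succeeded. The real cause is whatever made each individual `ssh ... exit 0` call return 255.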
The VM is created, it is given an IP address, and it starts. The interesting part is that I can create an SSH session from the command line using the same SSH options as the script (I have enclosed a file, sshConnection.txt, with the debug messages from executing that command):
ssh -v [email protected] -o PasswordAuthentication=no -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=DEBUG -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none -o IdentitiesOnly=yes -i C:\Users\wojtek\.docker\machine\machines\worker1\id_rsa
This means that the SSH server works fine on the VM and the key checks out. Just for completeness, I tested creating an SSH session from the VM to the Windows 10 host machine, and that also works fine. Also, after reading a few posts, I turned the firewall off, but the problem persists.
Below is what I get when I ask docker-machine to (a) list the VM docker hosts and (b) create an SSH session to the new VM:
C:\WINDOWS\system32>docker-machine ls
NAME      ACTIVE   DRIVER   STATE     URL                        SWARM   DOCKER    ERRORS
worker1   -        hyperv   Running   tcp://192.168.0.108:2376           Unknown   Unable to query docker version: Get https://192.168.0.108:2376/v1.15/version: x509: certificate signed by unknown authority
C:\WINDOWS\system32>docker-machine ssh worker1
exit status 255
Notice that the 255 exit status is the same as in the create script.
Why does the same SSH command fail when executed by docker-machine and succeed when executed from the command line? How do I debug the issue further?
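One way to debug this further is to replay exactly the argument list that docker-machine logs, changing only the verbosity. Note that the logged list contains `-F /dev/null` and `-p 22`, which the manual command shown above does not, so the two runs are not strictly identical. The helper below is a hypothetical sketch (the name `make_verbose` is mine): it raises `LogLevel` to `DEBUG`, adds `-v`, and leaves every other argument untouched.

```python
from typing import List


def make_verbose(ssh_args: List[str]) -> List[str]:
    """Return a copy of ssh_args with LogLevel set to DEBUG and -v prepended."""
    out = []
    for arg in ssh_args:
        if arg.lower().startswith("loglevel="):
            out.append("LogLevel=DEBUG")  # replace LogLevel=quiet
        else:
            out.append(arg)  # keep everything else exactly as logged
    if "-v" not in out:
        out.insert(0, "-v")
    return out


# Shortened example list; paste the full list from the debug output instead.
logged = ["-F", "/dev/null", "-o", "PasswordAuthentication=no",
          "-o", "LogLevel=quiet", "-p", "22"]
print(" ".join(["ssh"] + make_verbose(logged)))
```

Running the printed command by hand with the same binary (here C:\Program Files\OpenSSH\ssh.exe) should show why ssh exits with status 255 instead of hiding it behind `LogLevel=quiet`.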
Cheers, Wojtek
It looks like the issue is between OpenSSH and docker-machine on Windows 10.
I had installed OpenSSH to enable remote PowerShell connections with Linux boxes (see installation guidelines @ https://github.com/PowerShell/Win32-OpenSSH/wiki/Install-Win32-OpenSSH).
I have now uninstalled OpenSSH and removed the reference to its binary from the Path system variable, and the "docker-machine create ..." command completes ( ?! ).
The sequence of debug messages for creating SSH connection to the new VM now looks like this:
Getting to WaitForSSH function...
(worker1) Calling .GetSSHHostname
(worker1) DBG | [executing ==>] : C:\WINDOWS\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive ( Get-VM worker1 ).state
(worker1) DBG | [stdout =====>] : Running
(worker1) DBG |
(worker1) DBG | [stderr =====>] :
(worker1) DBG | [executing ==>] : C:\WINDOWS\System32\WindowsPowerShell\v1.0\powershell.exe -NoProfile -NonInteractive (( Get-VM worker1 ).networkadapters[0]).ipaddresses[0]
(worker1) DBG | [stdout =====>] : 192.168.0.106
(worker1) Calling .GetSSHPort
(worker1) DBG |
(worker1) DBG | [stderr =====>] :
(worker1) Calling .GetSSHKeyPath
(worker1) Calling .GetSSHKeyPath
(worker1) Calling .GetSSHUsername
SSH binary not found, using native Go implementation
&{{{<nil> 0 [] [] []} docker [0xc8a0d0] <nil> []} 192.168.0.106 22 <nil> <nil>}
About to run SSH command:
exit 0
SSH cmd err, output: <nil>:
It appears that when the script cannot locate an SSH binary it falls back on a native Go implementation (whatever that is), which works.
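The two log lines "Using SSH client type: external" and "SSH binary not found, using native Go implementation" suggest a simple selection rule based on whether an `ssh` executable is on the PATH. The sketch below only models that observed behaviour; it is not taken from the docker-machine source, and the name `choose_ssh_client` is made up for illustration.

```python
import shutil
from typing import Callable, Optional


def choose_ssh_client(find: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return "external" if an ssh binary is found on PATH, otherwise "native".

    find is injectable so the rule can be exercised without touching the real PATH.
    """
    return "external" if find("ssh") else "native"


print(choose_ssh_client())
```

This would explain why removing OpenSSH from the Path changes the outcome: the lookup fails, so the built-in client is used instead of ssh.exe.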
I will open a separate issue with the PowerShell/Win32-OpenSSH folks, and I probably should also open a separate issue like "docker-machine does not work with OpenSSH on Windows 10" in this repo.
Cheers, Wojtek
As a workaround, the --native-ssh option can be specified when running docker-machine commands.
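A small sketch of how the workaround fits into a command line. `--native-ssh` is a global option, so it goes before the subcommand. The helper name `with_native_ssh` is hypothetical; the example arguments are the ones from the original report.

```python
from typing import List


def with_native_ssh(subcommand: List[str]) -> List[str]:
    """Build a docker-machine argv with the global --native-ssh option first."""
    return ["docker-machine", "--native-ssh"] + list(subcommand)


print(" ".join(with_native_ssh(
    ["create", "-d", "hyperv", "--hyperv-virtual-switch", "ExternalWireless", "worker1"]
)))
```

This keeps OpenSSH installed for other uses while docker-machine skips the external ssh binary.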
I am experiencing the same issue on Windows 10 Home with the native Windows OpenSSH:
c:\Users\feryardiant>docker version
Client:
Version: 17.10.0-ce
API version: 1.33
Go version: go1.8.3
Git commit: f4ffd25
Built: Tue Oct 17 19:00:02 2017
OS/Arch: windows/amd64
c:\Users\feryardiant>docker-machine version
docker-machine.exe version 0.14.0, build 89b8332
Based on the output of docker-machine -D restart <machine-name>, I tried to connect over SSH manually using the same configuration, and it is clear (to me) that the main problem is the SSH key file permissions.
ssh [email protected] -o IdentitiesOnly=yes -o PasswordAuthentication=no -i C:\Users\feryardiant\.docker\machine\machines\<machine-name>\id_rsa -p 50496
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: UNPROTECTED PRIVATE KEY FILE! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions for 'C:\\Users\\feryardiant\\.docker\\machine\\machines\\<machine-name>\\id_rsa' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Load key "C:\\Users\\feryardiant\\.docker\\machine\\machines\\<machine-name>\\id_rsa": bad permissions
[email protected]: Permission denied (publickey,password,keyboard-interactive).
So I changed the id_rsa file permissions as described in the Stack Overflow link above and tried to connect again:
ssh [email protected] -o IdentitiesOnly=yes -o PasswordAuthentication=no -i C:\Users\feryardiant\.docker\machine\machines\local\id_rsa -p 50496
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
@ WARNING: UNPROTECTED PRIVATE KEY FILE! @
@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@
Permissions for 'C:\\Users\\feryardiant\\.docker\\machine\\machines\\local\\id_rsa' are too open.
It is required that your private key files are NOT accessible by others.
This private key will be ignored.
Load key "C:\\Users\\feryardiant\\.docker\\machine\\machines\\local\\id_rsa": bad permissions
[email protected]: Permission denied (publickey,password,keyboard-interactive).
c:\Users\feryardiant>ssh [email protected] -o IdentitiesOnly=yes -o PasswordAuthentication=no -i C:\Users\feryardiant\.docker\machine\machines\local\id_rsa -p 50496
                        ##         .
                  ## ## ##        ==
               ## ## ## ## ##    ===
           /"""""""""""""""""\___/ ===
      ~~~ {~~ ~~~~ ~~~ ~~~~ ~~~ ~ /  ===- ~~~
           \______ o           __/
             \    \         __/
              \____\_______/
 _                 _   ____     _            _
| |__   ___   ___ | |_|___ \ __| | ___   ___| | _____ _ __
| '_ \ / _ \ / _ \| __| __) / _` |/ _ \ / __| |/ / _ \ '__|
| |_) | (_) | (_) | |_ / __/ (_| | (_) | (__|   <  __/ |
|_.__/ \___/ \___/ \__|_____\__,_|\___/ \___|_|\_\___|_|
Boot2Docker version 18.05.0-ce, build HEAD : b5d6989 - Thu May 10 16:35:28 UTC 2018
Docker version 18.05.0-ce, build f150324
docker@boot2docker:~$
Now, whenever I create a new docker-machine, I have to update the id_rsa file permissions to make it work.
Is there any plan to make the generated SSH key have the correct file permissions by default?
Apologies for my poor English; hopefully this helps.
Cheers
Initial error
Error creating machine: Error in driver during machine creation: Too many retries waiting for SSH to be available. Last error: Maximum number of retries (60) exceeded
With docker-machine --debug:
(sparkles) DBG | Checking vm logs: /Users/David/.docker/machine/machines/sparkles/sparkles/Logs/VBox.log
(sparkles) DBG | Getting to WaitForSSH function...
(sparkles) DBG | Using SSH client type: external
(sparkles) DBG | Using SSH private key: /Users/David/.docker/machine/machines/sparkles/id_rsa (-rw-------)
(sparkles) DBG | &{[-F /dev/null -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none -o LogLevel=quiet -o PasswordAuthentication=no -o ServerAliveInterval=60 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null [email protected] -o IdentitiesOnly=yes -i /Users/David/.docker/machine/machines/sparkles/id_rsa -p 55553] /usr/bin/ssh <nil>}
(sparkles) DBG | About to run SSH command:
(sparkles) DBG | exit 0
(sparkles) DBG | SSH cmd err, output: exit status 255:
(sparkles) DBG | Error getting ssh command 'exit 0' : ssh command error:
(sparkles) DBG | command : exit 0
(sparkles) DBG | err : exit status 255
(sparkles) DBG | output :
In the VBox.log:
00:00:29.314084 Decompressing Linux... Parsing ELF... Performing relocations... done.
00:00:29.314238 Booting the kernel.
00:00:29.314391
00:00:29.314544 --------------------------------------------------------------------------------
00:00:29.314732 !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
00:00:29.314766 emR3Debug: rc=VERR_PGM_INVALID_CR3_ADDR
00:00:30.316637 Changing the VM state from 'RUNNING' to 'GURU_MEDITATION'
00:00:30.316892 Console: Machine state changed to 'GuruMeditation'
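In this case the VM itself stopped (Guru Meditation), so the SSH retries could not succeed no matter how the client was configured. To rule this out quickly, the VBox.log that docker-machine points to can be scanned for such lines. This is a rough sketch; the pattern list and the name `find_vm_crash_lines` are my own choice, based only on the strings visible in the excerpt above.

```python
import re
from typing import List

# Strings seen in the excerpt: the state names and VERR_* status codes.
CRASH_PATTERN = re.compile(r"GURU_MEDITATION|GuruMeditation|VERR_[A-Z0-9_]+")


def find_vm_crash_lines(log_text: str) -> List[str]:
    """Return the log lines that mention a Guru Meditation or a VERR_* code."""
    return [line for line in log_text.splitlines() if CRASH_PATTERN.search(line)]


sample = (
    "00:00:29.314238 Booting the kernel.\n"
    "00:00:29.314766 emR3Debug: rc=VERR_PGM_INVALID_CR3_ADDR\n"
    "00:00:30.316637 Changing the VM state from 'RUNNING' to 'GURU_MEDITATION'\n"
)
for line in find_vm_crash_lines(sample):
    print(line)
```

If this finds nothing, the VM is probably still running and the SSH client side (binary, key permissions) is the next thing to check.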
http://www.fixedbyvonnie.com/2014/09/heck-virtualbox-guru-meditation-error/
No solution so far.
Hi @dherges... I had the same initial problem:
"_Error creating machine: Error in driver during machine creation: Too many retries waiting for SSH to be available. Last error: Maximum number of retries (60) exceeded_"
In the end, the machine was created and I could see it running in VirtualBox. However, when I tried to SSH into it, I couldn't.
### What solved my problem: As "WojtekKozaczynski" described, it looks like docker-machine doesn't play well with Windows' "OpenSSH", _so I removed it from the PATH variable_. Then I tried to create the machine again with "docker-machine" and it worked like a charm.
This is not the first time I have had trouble with Windows' "OpenSSH": at work it caused problems when I tried to SSH into a Linux machine, so I had to do the same there, after which SSH worked.
BTW, I'm using Windows 10 Home with VirtualBox.
@dherges, if this solves your problem, please comment so that others can see it works; otherwise you may be facing a different problem. I wasted a lot of hours trying other approaches.
Regards!
Hello,
I am also facing a similar issue.
I am trying to create a docker machine with the following command:
docker-machine -D create nfsbox --virtualbox-no-vtx-check
This is where docker machine creation gets stuck
(nfsbox) DBG | Getting to WaitForSSH function...
(nfsbox) DBG | Using SSH client type: external
(nfsbox) DBG | Using SSH private key: /root/.docker/machine/machines/nfsbox/id_rsa (-rw-------)
(nfsbox) DBG | &{[-F /dev/null -o ConnectionAttempts=3 -o ConnectTimeout=10 -o ControlMaster=no -o ControlPath=none -o LogLevel=quiet -o PasswordAuthentication=no -o ServerAliveInterval=60 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null [email protected] -o IdentitiesOnly=yes -i /root/.docker/machine/machines/nfsbox/id_rsa -p 40819] /usr/bin/ssh <nil>}
(nfsbox) DBG | About to run SSH command:
(nfsbox) DBG | exit 0
Any suggestions as to what I could be missing?
My Setup
Ubuntu 18.04 LTS
Linode VM
This may happen if the permissions on your private key are too open. I noticed this in the output of docker-machine --debug create.
(nfsbox) DBG | Using SSH private key: /root/.docker/machine/machines/nfsbox/id_rsa (-rw-------)
In my case the permission on the private key is (-rw-------). What should it be then?
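For what it's worth, -rw------- is mode 0600, i.e. readable and writable by the owner only, which is what OpenSSH expects on Linux and macOS. So on this host the key permissions do not look like the cause. A quick way to check any key is to test that no group or other bits are set. This is a sketch for POSIX systems only (Windows ACLs are a different mechanism), and the function names are hypothetical.

```python
import os
import stat


def key_is_private(path: str) -> bool:
    """True when neither group nor others have any permission bit on the file."""
    return (os.stat(path).st_mode & 0o077) == 0


def describe_mode(path: str) -> str:
    """Return the ls-style mode string, e.g. the form shown in the debug log."""
    return stat.filemode(os.stat(path).st_mode)
```

Example use: `key_is_private("/root/.docker/machine/machines/nfsbox/id_rsa")` should return True for a key shown as (-rw-------).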
Thanks for sharing this, but is using the native Go implementation a correct deployment procedure?
I confirm that the issue is due to incorrect permissions on the private key.
docker-machine should set the correct permissions on the key when creating a machine.
Until this is fixed, set the permissions manually:
icacls %USERPROFILE%\.docker\machine\machines\node1\id_rsa /inheritance:r
icacls %USERPROFILE%\.docker\machine\machines\node1\id_rsa /grant %USERNAME%:F
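Since this has to be repeated for every new machine, the two icacls calls can be generated per machine name. The sketch below only builds the argument lists, it does not run anything. The helper name and the example profile path and user name are placeholders; the icacls switches are the ones from the commands above.

```python
import ntpath
from typing import List


def icacls_commands(user_profile: str, username: str, machine: str) -> List[List[str]]:
    """Build the two icacls invocations for one machine's id_rsa file."""
    # ntpath gives Windows-style joins regardless of the OS running this script.
    key = ntpath.join(user_profile, ".docker", "machine", "machines", machine, "id_rsa")
    return [
        ["icacls", key, "/inheritance:r"],           # drop inherited permissions
        ["icacls", key, "/grant", f"{username}:F"],  # full control for the owner only
    ]


# Placeholder values; on a real system take them from %USERPROFILE% and %USERNAME%.
for cmd in icacls_commands(r"C:\Users\me", "me", "node1"):
    print(" ".join(cmd))
```

On Windows the printed lists could be passed to `subprocess.run(cmd, check=True)`; I left that out so the sketch stays side-effect free.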
This solution really works for me. The key is to change the permissions of the id_rsa file as described. I first found the answer in a Chinese blog, and now I have found it here too, so I am adding my feedback and appreciation. Once you fix the permissions of the id_rsa file in the right VM directory, you will immediately see "Waiting for SSH to be available..." start to move ahead, LOL.