Vagrant: Vagrant up fails with Warning: Remote connection disconnect. Retrying...

Created on 15 May 2018  ·  26 Comments  ·  Source: hashicorp/vagrant

Vagrant version

Vagrant 2.1.1

Vagrant Plugin list:
vagrant-hostmanager (1.8.8)
vagrant-hosts (2.8.1)
vagrant-proxyconf (1.5.2)
vagrant-reload (0.0.1)
vagrant-share (1.1.9)
vagrant-vbguest (0.15.1)

VirtualBox 5.2.8
Also tried VirtualBox 5.2.12

Host operating system

Windows 7 Professional 64bit

Guest operating system

centos/7 (virtualbox, 1803.01)

Vagrantfile

```ruby
# -*- mode: ruby -*-
# vi: set ft=ruby :

# abbreviated brand name
BRAND_NAME = "brand"
TOP_LEVEL_DOMAIN = ".some.dev"
BOX_NAME = "centos/7"

Vagrant.configure(2) do |config|

  config.vm.synced_folder ".", "/vagrant", type: "virtualbox"
  config.vm.boot_timeout = 1200

  if Vagrant.has_plugin?("vagrant-proxyconf")
    config.proxy.http = "http://192.168.100.3:3128/"
    config.proxy.https = "https://192.168.100.3:3128/"
    config.proxy.no_proxy = "localhost,127.0.0.1,.some.lan"
  end

  config.vm.provider :virtualbox do |vb|
    vb.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
    vb.memory = 2048
    vb.cpus = 2
    vb.customize ["modifyvm", :id, "--cpuexecutioncap", "50"]
  end

  config.vm.define "nat" do |nat|
    nat.vm.box = BOX_NAME
    nat.vm.network "private_network", ip: "10.0.0.2", virtualbox__intnet: true
    nat.vm.hostname = "nat.staging." + BRAND_NAME + TOP_LEVEL_DOMAIN
    nat.vm.provision :hosts, :sync_hosts => true
    nat.vm.provision "shell", path: "./scripts/aws_replication.sh"
    nat.vm.provision "shell", path: "./scripts/nat_ssh.sh"
    nat.vm.provision "shell", path: "./scripts/nat_staging.sh"
    nat.vm.provision "shell", path: "./scripts/nat_git_clone.sh"
    nat.vm.provision "shell", path: "./scripts/nat_set_ansible.sh"
  end

  config.vm.define "fei" do |fei|
    fei.vm.box = BOX_NAME
    fei.vm.network "private_network", ip: "10.0.0.4", virtualbox__intnet: true
    fei.vm.network "forwarded_port", guest: 80, host: 80, auto_correct: true
    fei.vm.network "forwarded_port", guest: 443, host: 443, auto_correct: true
    fei.vm.network "forwarded_port", guest: 9000, host: 9000, auto_correct: true
    fei.vm.provision :hosts, :sync_hosts => true
    fei.vm.hostname = BRAND_NAME + ".staging" + TOP_LEVEL_DOMAIN
    fei.vm.provision "shell", path: "./scripts/aws_replication.sh"
    fei.vm.provision "shell", path: "./scripts/non-nat_key_staging.sh"
  end
end
```

Debug output

Note: I removed some of the repeated blocks to reduce size, as the upload was failing.
(https://gist.github.com/melambers/04b452ebc07c1b77b2e2040dffd41e60)

Expected behavior

`vagrant up` should build two centos/7 VMs.

Actual behavior

The nat box is created and started, but Vagrant cannot reconnect after the new key is inserted. I am able to connect to the running box using `vagrant ssh nat`, or even via a direct ssh command with the key path, etc. Output:

```
$ vagrant up
Bringing machine 'nat' up with 'virtualbox' provider...
Bringing machine 'fei' up with 'virtualbox' provider...
==> nat: Importing base box 'centos/7'...
==> nat: Matching MAC address for NAT networking...
==> nat: Checking if box 'centos/7' is up to date...
==> nat: Setting the name of the VM: ao_nat_1526342206425_95671
==> nat: Fixed port collision for 22 => 2222. Now on port 2200.
==> nat: Clearing any previously set network interfaces...
==> nat: Preparing network interfaces based on configuration...
nat: Adapter 1: nat
nat: Adapter 2: intnet
==> nat: Forwarding ports...
nat: 22 (guest) => 2200 (host) (adapter 1)
==> nat: Running 'pre-boot' VM customizations...
==> nat: Booting VM...
==> nat: Waiting for machine to boot. This may take a few minutes...
nat: SSH address: 127.0.0.1:2200
nat: SSH username: vagrant
nat: SSH auth method: private key
nat: Warning: Connection aborted. Retrying...
nat: Warning: Connection reset. Retrying...
nat: Warning: Remote connection disconnect. Retrying...
nat: Warning: Connection aborted. Retrying...
nat:
nat: Vagrant insecure key detected. Vagrant will automatically replace
nat: this with a newly generated keypair for better security.
nat:
nat: Inserting generated public key within guest...
nat: Removing insecure key from the guest if it's present...
nat: Key inserted! Disconnecting and reconnecting using new SSH key...
nat: Warning: Remote connection disconnect. Retrying...
nat: Warning: Remote connection disconnect. Retrying...
nat: Warning: Remote connection disconnect. Retrying...
nat: Warning: Remote connection disconnect. Retrying...
nat: Warning: Remote connection disconnect. Retrying...
nat: Warning: Remote connection disconnect. Retrying...
nat: Warning: Remote connection disconnect. Retrying...
nat: Warning: Remote connection disconnect. Retrying...
nat: Warning: Remote connection disconnect. Retrying...
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.

If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.

If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.

If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
```

Steps to reproduce

  1. vagrant up

References


Most helpful comment

I had the same problem. It was caused by an incorrect permission on the file `.vagrant\machines\default\virtualbox\private_key`. When I enabled read permission on the file for my user, I was able to connect.

All 26 comments

I have this exact same problem with the debian stretch image running vagrant 2.1.1 on Windows 10.

I had the same problem. It was caused by an incorrect permission on the file `.vagrant\machines\default\virtualbox\private_key`. When I enabled read permission on the file for my user, I was able to connect.

This is the exact same problem I'm running into after an upgrade of Vagrant. Very frustrating...

In my case it was the SSH agent. Disable it for the session:

```
export SSH_AUTH_SOCK=""
```

Found a workaround by inserting `config.ssh.insert_key = false`, which resolved my issue.

Got it working with @PTATH81's suggestion.

Don't forget to `vagrant destroy` and explicitly delete the private_key file (you need admin access to do it, so Vagrant won't do it for you).

@danielmurguia why did it help? When the ssh keys permissions are set incorrectly, then you see the Permission denied (key blah-blah) error.

@JuPlutonic I'm no windows expert, but I think vagrant creates the private_key file with an incorrect owner

Hi there. @danielmurguia is correct that the underlying issue is the user permissions being set on the file. A fix for this will be included in the next release. Closing this as it is being tracked here: #9900

Cheers!

@danielmurguia Where is this file, `.vagrant\machines\default\virtualbox\private_key`, located?

I cannot find it anywhere.

You should find the .vagrant folder in the same place as your Vagrantfile. Inside, there should be the folders machines\default\virtualbox. This structure is created the first time you run `vagrant up`. If the file private_key does not exist, you may have a different bug.
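For readers hunting for the file, the layout described above can be sketched like this (stand-in files only, assuming the default machine name "default" and the virtualbox provider; named machines get their own folder under machines\):

```shell
# Recreate the layout Vagrant builds next to the Vagrantfile on the
# first `vagrant up`, to show where private_key ends up. These are
# stand-in files; the real ones are generated by Vagrant.
mkdir -p .vagrant/machines/default/virtualbox
touch .vagrant/machines/default/virtualbox/private_key
ls .vagrant/machines/default/virtualbox
```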

I have 2 VMs, managed with 2.1.1, which are set up similarly: Vagrant runs in Windows 10 WSL and works with a codebase that's on NTFS (so Windows editors can be used). The Windows permissions on private_key looked OK (I don't use a domain, so MACHINE/Name looks sensible), but the permissions as seen from within bash are 777 (not sure if this causes problems).

Oddly, one environment is exhibiting this behavior ("Remote connection disconnect. Retrying...") on a halt/up, but the other isn't. I can't see any differences in the permission setup for private_key between the two, so I'm not sure if #9900 will fix it.

Any thoughts?

I had the same issue. But after hundreds of lines

    default: Warning: Remote connection disconnect. Retrying...
    default: Warning: Connection aborted. Retrying...
    default: Warning: Connection reset. Retrying...

It booted successfully! Took about 5 minutes to boot.

This is exactly what I'm experiencing. It can take anywhere from 10 to 100 lines of that message, but it does eventually boot.


I don't know if you have already found a solution for this problem. I found one.

I wrote this into the Vagrantfile; it tells Vagrant where our private and public SSH keys are located:

```ruby
config.ssh.insert_key = false
config.ssh.private_key_path = ["~/.ssh/id_rsa", "~/.vagrant.d/insecure_private_key"]
config.vm.provision "file", source: "~/.ssh/id_rsa.pub", destination: "~/.ssh/authorized_keys"
config.ssh.username = "vagrant"
config.ssh.password = "vagrant"
```

It worked for me!

Running cmd as admin on Win 10, this issue doesn't happen!

Hi developers, I'm totally new to Vagrant and I have the timeout issue on Vagrant 2.2.3. The OS is Windows 10, the VBox version is 5.2.22 r126460 (Qt5.6.2), and the virtual machine is Ubuntu 16.04. Also, the .ssh folder on my C drive is empty, if that is relevant by any chance. Any fix for this issue, please?

I had the same problem. It was caused by an incorrect permission on the file `.vagrant\machines\default\virtualbox\private_key`. When I enabled read permission on the file for my user, I was able to connect.

Set the user permissions with `sudo chmod 644 private_key`.

To amend my earlier post: setting permissions in the terminal with `sudo chmod 644 private_key` only worked once. I also tried 777. The problem is that each Vagrant action on the private_key changes the file permissions back to `-rw-------`. My synced files are no longer mounting, and they had been for years. A recent box upgrade seems to have caused the problem, because nothing I control changed. Creating a new issue.
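On a POSIX host, the permission fix discussed in this thread can be sketched as follows (on Windows, grant your user Read via the file's Security tab instead; the path assumes the default machine name and provider, and a stand-in file is created here for demonstration):

```shell
# Stand-in key file to demonstrate the permission fix; the real file
# is generated by `vagrant up`.
key=.vagrant/machines/default/virtualbox/private_key
mkdir -p "$(dirname "$key")"
touch "$key"
chmod 600 "$key"      # owner read/write only, which ssh expects
stat -c '%a' "$key"   # prints 600
```

Note that `600` (`-rw-------`) is what ssh itself expects for a private key; the Windows-side issue in this thread is about the file's owner/ACL, not the POSIX mode bits.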


I solved the issue by enabling virtualization in the BIOS settings, which I thought I had already done, but I probably lost the setting during the many reboots I made because of various other issues.

I ran into the same problem. I had executed `vagrant up` before with an older Vagrant version, which generated the .vagrant directory; I then had to upgrade Vagrant to the latest version (because of a VirtualBox constraint, not related to this problem). Right after that, when I executed `vagrant up` with the old generated directory, I got the same error described here. I destroyed the current VM (`vagrant destroy -f`), deleted the old .vagrant directory, and then `vagrant up` worked just fine.
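That reset can be sketched as a small script. It is destructive (it removes the VM and all Vagrant-managed state for the project); the `command -v` guard simply skips the vagrant step on machines where vagrant isn't on the PATH:

```shell
# Full reset of stale Vagrant state, as described above.
if command -v vagrant >/dev/null 2>&1; then
  vagrant destroy -f          # remove the existing VM
fi
rm -rf .vagrant               # delete stale machine state (old private_key, etc.)
echo "stale .vagrant removed; run 'vagrant up' to recreate"
```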

I keep having to destroy vagrant images to temp fix the problem, and then it comes back. I'm using ubuntu/xenial64. I usually run 2 at the same time (different mac address) on my network with different ip addresses. I run one for my dev environment and one for production, and they both can behave this way.


Thanks @gragonmau, your solution works for me 👍

@gragonmau Thanks. I downgraded VB from 6.x to 5.1.38.x, and after that Vagrant started giving me problems. Now it looks solved. Thanks again.

It looks like in my case I had multiple causes for this issue. In one instance, specifying a static IP was causing it, so I removed that; now I assign a static IP based on the MAC address in the router.

In another, more recent case, the cause was bind mounts from an NFS path. Adding `_netdev` to the bind in /etc/fstab fixed the problem. I also list these dependencies now as mount options:

```
x-systemd.requires=/ip:/path,x-systemd.automount,bind,_netdev
```

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
