Vagrant: Support parallel multi-machine provisioning using Ansible

Created on 2 Jun 2013 · 29 comments · Source: hashicorp/vagrant

Ansible has the ability to provision multiple machines in parallel. Taking advantage of this is desirable, since it is much faster than provisioning machines sequentially.

Currently (v1.2.2) Vagrant executes the provisioner for each machine separately during "vagrant up" (for multi-machine configurations). I've found no way to make it start all machines up, and then run the provisioner once. Naturally, I tried a Vagrantfile like this:

```ruby
Vagrant.configure("2") do |config|

    config.vm.box = "precise64"
    config.vm.box_url = "http://files.vagrantup.com/precise64.box"

    config.vm.define :test_vm1 do |cfg|
        cfg.vm.network :private_network, ip: "172.20.30.10"
        cfg.vm.provider :virtualbox do |v|
            v.name = "test_vm1"
        end
    end

    config.vm.define :test_vm2 do |cfg|
        cfg.vm.network :private_network, ip: "172.20.30.11"
        cfg.vm.provider :virtualbox do |v|
            v.name = "test_vm2"
        end
    end

    config.vm.provision :ansible do |ansible|
        ansible.playbook = "ansible/testservers.yml"
        ansible.inventory_file = "ansible/stage"
        ansible.sudo = true
    end

end
```

... However, the above does not work as intended due to the way configs are merged: the global provisioning configuration is merged into the machine-specific ones, which results in running the provisioner for each machine separately as it starts. One could of course use machine-specific provisioning configurations instead of a global one (and use ansible.limit to specify the target hosts), but this loses the benefit of parallel provisioning.
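
For reference, the per-machine alternative mentioned above could look roughly like this (a minimal sketch; it reuses the playbook and inventory from the Vagrantfile above, and assumes the inventory names the host test_vm1):

```ruby
# Sketch of the per-machine alternative: each VM carries its own provisioner
# block, and ansible.limit restricts the run to that machine. This works, but
# machines are provisioned one after another, losing Ansible's parallelism.
config.vm.define :test_vm1 do |cfg|
    cfg.vm.network :private_network, ip: "172.20.30.10"
    cfg.vm.provision :ansible do |ansible|
        ansible.playbook = "ansible/testservers.yml"
        ansible.inventory_file = "ansible/stage"
        ansible.limit = "test_vm1"  # target only this host
        ansible.sudo = true
    end
end
```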

It would be nice if Vagrant allowed us to run the provisioner only _after_ all machines were brought up.

All 29 comments

+1

+1

+1

I've talked with @mitchellh about this particular issue, and his feedback was that this wasn't possible without a core Vagrant architecture change. As well, not every machine is guaranteed to be brought up. For example, you could bring up test_vm1 and test_vm2, shut down test_vm2, then run vagrant provision, and you'd of course run into the missing machine.

It sure would be nice to have, though, so you could develop complex multi-machine playbooks like, say, an iptables playbook that allows specific traffic between machines.

How about something like this:

```ruby
Vagrant.configure("2") do |config|

    config.vm.common :vm_common do |cfg|
        cfg.vm.box = "precise64"
        cfg.vm.box_url = "http://files.vagrantup.com/precise64.box"
    end

    config.vm.define :test_vm1 do |cfg|
        cfg.vm.network :private_network, ip: "172.20.30.10"
        cfg.vm.provider :virtualbox do |v|
            v.name = "test_vm1"
        end
    end

    config.vm.define :test_vm2 do |cfg|
        cfg.vm.network :private_network, ip: "172.20.30.11"
        cfg.vm.provider :virtualbox do |v|
            v.name = "test_vm2"
        end
    end

    config.vm.provision :ansible do |ansible|
        ansible.playbook = "ansible/testservers.yml"
        ansible.inventory_file = "ansible/stage"
        ansible.sudo = true
    end

end
```

I.e. add support for a special _'common'_ block (first one above). If it is present, then don't perform the standard merge of the global config object into the machine-specific ones (merge the 'common' one instead). This would not be a core architecture change, but would still allow us to use parallel provisioning if we wanted to.

Actually, Vagrant 1.2 comes with support for provider parallelization. While I can't support Ansible parallelization without a big architecture change, I can support just invoking Ansible in parallel. The issue is that VirtualBox currently hates this sort of parallelization. I plan on addressing that soon.

So I'd say this will be solved when the VirtualBox provider is parallelizable.

@mitchellh, this issue was only about parallelizing the provisioner. This is a separate problem, not related to parallelizing the machines' startup, and thus not related to VirtualBox. It's acceptable to bring the machines up one after another, as long as I can run the provisioner only once after all of them are up.

Regarding the Ansible parallelization (provisioner only), did you see the proposed syntax in my previous comment? Why do you think it is a big architecture change?

Right, I'm saying that parallelizing the provisioner across machines like that would require a huge architecture change to Vagrant, and that the far simpler option is to just parallelize the VirtualBox provider, which would achieve the same goal.

The syntax above is already valid, so making that behave in a different way would be unexpected, and require quite a huge change. Currently, Vagrant will apply that provisioner to all VMs.

@pesho I do this by adding the provisioning block to only the last VM. They are processed sequentially, so all going well, by the time the provisioner runs, all machines will be available:

```ruby
config.vm.define :varnish do |varnish_config|
  varnish_config.vm.network :private_network, ip: "192.168.50.2"
end

config.vm.define :drupal do |drupal_config|
  drupal_config.vm.network :private_network, ip: "192.168.50.3"
end

config.vm.define :queue do |queue_config|
  queue_config.vm.network :private_network, ip: "192.168.50.4"
end

config.vm.define :elasticsearch do |elasticsearch_config|
  elasticsearch_config.vm.network :private_network, ip: "192.168.50.5"

  # Provisioning only on last machine since ansible deals with multiple hosts
  elasticsearch_config.vm.provision :ansible do |ansible|
    ansible.playbook = "ansible/site.yml"
    ansible.inventory_file = "ansible/vagrant_hosts"
  end
end
```

The vagrant_hosts file contains the list of groups and associated private IPs, so you can let Ansible deploy to all hosts, even though Vagrant thinks it is only running against the final machine. It's not perfect, but it works.
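
As an aside, later versions of the Vagrant Ansible provisioner can generate inventory groups themselves via ansible.groups, instead of relying on a hand-maintained vagrant_hosts file. A hedged sketch (the group name "cluster" and its membership are illustrative assumptions, not taken from the comment above):

```ruby
# Sketch: let Vagrant auto-generate the inventory and define the groups in
# the provisioner config, rather than keeping a static hosts file in sync.
# The group name and membership here are illustrative.
elasticsearch_config.vm.provision :ansible do |ansible|
  ansible.playbook = "ansible/site.yml"
  ansible.groups = {
    "cluster" => ["varnish", "drupal", "queue", "elasticsearch"]
  }
end
```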

I use the same workaround as @georgewhewell ... found this bug hoping for a better solution.

I'm also using the same workaround described above until a better solution can be worked out...

... and I'll start looking at using a provider other than VirtualBox that supports parallelized VM provisioning.

This workaround seems to run the playbook against all hosts, but with a slight difference.

I have some cluster roles which do things like create an /etc/hosts with all VMs in a cluster, from a template (templates/etc_hosts):

```jinja
{% for host in groups['cluster'] %}
{{ hostvars[host]['mapped_interface_1'] + ' ' + host + dns.fqdn_suffix + ' ' + host }}
{% endfor %}
```

This works great when I run ansible-playbook from the command line. But when run from within a Vagrantfile (with or without the workaround), it only puts the current VM in /etc/hosts.

Not sure why this would be, but until I can figure it out, my workaround is to use a shell provisioner on the last VM to call ansible-playbook from the command line. hack^2
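
A minimal sketch of that shell-provisioner hack, under some assumptions: Ansible is installed inside the last guest, the playbook and inventory sit in the default /vagrant shared folder, and the inventory lists the other VMs' private IPs. The machine and file names are illustrative:

```ruby
# Hypothetical sketch: the last VM runs ansible-playbook itself via a shell
# provisioner, so a single run covers every host listed in the inventory.
config.vm.define :lastvm do |cfg|
  cfg.vm.provision :shell, inline: <<-SCRIPT
    cd /vagrant/ansible
    ansible-playbook -i vagrant_hosts site.yml
  SCRIPT
end
```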

+1 just lost many hours to this problem. :(

:+1: Thanks for the workaround also! I have lost the same bunch of hours to this.

I've also got this working if you're using a loop to set up machines. It basically runs the provisioner for everything on the final VM:

```ruby
VAGRANTFILE_API_VERSION = "2"

base_dir = File.expand_path(File.dirname(__FILE__))
cluster = {
  "mesos-master1" => { :ip => "100.0.10.11",  :cpus => 1, :mem => 1024 },
  #"mesos-master2" => { :ip => "100.0.10.12",  :cpus => 1, :mem => 1024 },
  #"mesos-master3" => { :ip => "100.0.10.13",  :cpus => 1, :mem => 1024 },
  "mesos-slave1"  => { :ip => "100.0.10.101", :cpus => 1, :mem => 256 },
  "mesos-slave2"  => { :ip => "100.0.10.102", :cpus => 1, :mem => 256 },
  "mesos-slave3"  => { :ip => "100.0.10.103", :cpus => 1, :mem => 256 },
  "kafka-node1"   => { :ip => "100.0.20.101", :cpus => 1, :mem => 1536 },
}

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|

  if Vagrant.has_plugin?("vagrant-cachier")
    config.cache.scope = :machine
    config.cache.enable :apt
  end

  cluster.each_with_index do |(hostname, info), index|
    config.vm.define hostname do |cfg|

      cfg.vm.provider :virtualbox do |vb, override|
        override.vm.box = "trusty64"
        override.vm.box_url = "https://cloud-images.ubuntu.com/vagrant/trusty/current/trusty-server-cloudimg-amd64-vagrant-disk1.box"
        override.vm.network :private_network, ip: "#{info[:ip]}"
        override.vm.hostname = hostname

        vb.name = 'vagrant-mesos-' + hostname
        vb.customize ["modifyvm", :id, "--memory", info[:mem], "--cpus", info[:cpus], "--hwvirtex", "on" ]
      end

      # provision nodes with ansible
      if index == cluster.size - 1
        cfg.vm.provision :ansible do |ansible|
          ansible.verbose = "vvvv"

          ansible.inventory_path = "inventory/vagrant"
          ansible.playbook = "cluster.yml"
          ansible.limit = 'all' # "#{info[:ip]}" # Ansible hosts are identified by ip
        end # end provision
      end #end if

    end # end config

  end #end cluster

end #end vagrant
```

+1 a lot!

You can always set somevm_cfg.ssh.insert_key = false for all the VMs. There also might be a way to generate the key only once.

And it's not a good idea to put the provision section in a specific VM, as you won't be able to provision individually. There must be an easier way to take the first element, but I've simply added an iteration counter:

```ruby
  i = 0;
  config.vm.provision :ansible do |ansible|
    if i <= 0
      ansible.playbook = "playbook.yml"
      ansible.sudo = true
      ansible.limit = 'all'
      i+=1;
    end
  end
```

@hryamzik's solution doesn't fit the problem here, as Vagrant will indeed call the provisioner once, but after setting up the first machine, which means all machines started after that will not be provisioned.

I thought for a minute about a similar solution that would run the provisioner only after the last machine, but couldn't find anything. So I'll keep the current hack of having the provision section within the last VM until we can do better :-(

Interested to know if there has been any progress on this, or anything on the Vagrant roadmap?

@ksingh7, @astorije, as there's one more issue with the Vagrant integration, I've finally moved to a 2-step solution:

  1. Ansible called from Vagrant just installs this role.
  2. The folder contains the following executable script, called vagrant.py:

```python
#!/usr/bin/python
import json
import string
import os
import argparse
import glob

parser = argparse.ArgumentParser(description='Process ansible inventory options')
parser.add_argument("-l", "--list", action='store_true', help="list of groups")
parser.add_argument("-H", "--host", help="dictionary of variables for host")

args = parser.parse_args()

def prettyprint(string):
    print json.dumps(string, indent=4, sort_keys=True)

def getClients():
    clientListString = os.popen("grep ssh .vagrant/provisioners/ansible/inventory/vagrant_ansible_inventory|tr '=' ' '").read()
    clients = {}
    for clientString in clientListString.split('\n'):
        clientVars = clientString.split()
        if len(clientVars) == 5:
            c = {}
            name, _, c['ansible_ssh_host'], _, c['ansible_ssh_port'] = clientVars
            clients[name] = c
    return clients

clients = getClients()

if args.list:
    hostlist = {
        "_meta": {
            "hostvars": clients
        },
        "all": clients.keys(),
    }
    prettyprint(hostlist)
elif args.host:
    try:
        prettyprint(clients[args.host])
    except:
        pass
else:
    prettyprint(clients)
```

  3. A local ansible.cfg:

```ini
[defaults]
retry_files_save_path = /tmp/retry/
ansible_managed = Ansible managed: file modified on %Y-%m-%d %H:%M:%S by {uid} on {host}
hostfile = ./vagrant.py
roles_path = ../../

[ssh_connection]
ssh_args = -o StrictHostKeyChecking=no
```

So you can then just run ansible-playbook with all the command-line options.

I ran into this and solved it for my own needs. My case seems to be a little different than most, as my multi-machine environment is typically only partially up, and I wanted to be able to run Ansible on a single machine ($ vagrant provision some.host) or on all of the machines that are up, in parallel ($ vagrant provision). I did this by adding this before the Vagrant.configure block:

```ruby
$ansible_already_ran = false
class Vagrant::Plugin::V2::Provisioner

  alias_method :original_initialize, :initialize
  def initialize(machine, config)
    original_initialize(machine, config)

    VagrantPlugins::Ansible::Provisioner.class_eval do
      alias_method :original_provision, :provision
      def provision
        if config.limit == 'all'
          if !$ansible_already_ran
            original_provision
          else
            @machine.env.ui.warn 'Skipping provisioning after running with -l all'
          end
        else
          original_provision
        end
        $ansible_already_ran = true
      end
    end

  end
end
```

and this bit to the ansible provisioner section inside the config.vm.provision 'ansible' block:

```ruby
if ARGV.index 'provision' and ARGV.count == 1
  ansible.limit = 'all'
end
```

I don't really know Ruby, so there may be a better way to do it, but it works well enough for what I wanted. Maybe others will also find it useful if they see this issue via Google.

Putting the provision block in the last machine definition and setting limit to all worked for me:

```ruby
# -*- mode: ruby -*-
# vi: set ft=ruby :
Vagrant.configure(2) do |config|
  config.vm.box = "ubuntu/trusty64"

  config.vm.provider "virtualbox" do |provider|
    provider.memory = "256"
  end

  (0..1).each do |n|
    config.vm.define "ansible-demo-app#{n}" do |define|
      define.ssh.insert_key = false
      define.vm.hostname = "ansible-demo-app#{n}"
      define.vm.network :private_network, ip: "10.0.15.2#{n}"
      define.vm.synced_folder ".", "/home/vagrant/work/src/app", group: "vagrant", owner: "vagrant"
    end
  end

  config.vm.define "ansible-demo-lb0" do |define|
    define.ssh.insert_key = false
    define.vm.hostname = "ansible-demo-lb0"
    define.vm.network :private_network, ip: "10.0.15.10"
    define.vm.synced_folder ".", "/vagrant", disabled: true

    define.vm.provision "ansible" do |provision|
      provision.limit = 'all'
      provision.playbook = "inf/site.yml"

      provision.groups = {
        "app" => [
          "ansible-demo-app0",
          "ansible-demo-app1"
        ],
        "lb" => ["ansible-demo-lb0"]
      }
    end
  end
end
```

The idea is that all machines will be set up before provisioning, and Ansible will then gather facts from all of them during setup.

I see this has been added to the documentation, but this is not a real fix to the problem: suppose you want Ansible to provision several containers, but only want to bring up one at a time.

This needs to be revisited.

After reading this issue and a few tries, I ended up with this solution, which runs Ansible on all or specified hosts. This works with "vagrant up [optional list of hosts]" as well as with "vagrant provision [optional list of hosts]".

First, ensure that Ansible is only executed once (I modified the version above to work with newer Vagrant):

```ruby
$ansible_already_ran = false
class Vagrant::Plugin::V2::Provisioner

  alias_method :original_initialize, :initialize
  def initialize(machine, config)
    original_initialize(machine, config)

    VagrantPlugins::Ansible::Provisioner::Host.class_eval do
      alias_method :original_provision, :provision
      def provision
        if $ansible_already_ran
          @machine.env.ui.warn "Ansible already ran"
        else
          original_provision
          $ansible_already_ran = true
        end
      end
    end

  end
end
```

Second, collect a list of all hosts affected by this Vagrant call and pass it as the limit to Ansible:

```ruby
  ansible_hosts = []
  config.vm.define "machine1" do |machine|
    ansible_hosts.push("machine1")
  end
  config.vm.define "machine2" do |machine|
    ansible_hosts.push("machine2")
  end
  config.vm.define "machine3" do |machine|
    ansible_hosts.push("machine3")
  end

  config.vm.provision "ansible" do |ansible|
    ansible.playbook = "myplaybook.yml"
    ansible.limit = ansible_hosts
  end
```

The only thing not working 100% as expected: vagrant up --provision will provision ALL machines if at least one is not running, and will provision NO machines if all are already running.

The solution we have been using for this is to run Ansible completely outside of Vagrant. VirtualBox is still not parallelized, and the workarounds in here are far more hacky than just having a bin/provision.sh script that runs the playbook once Vagrant has provisioned. Having a serial Ansible provisioner is completely useless except for very basic provisioning steps. Vagrant simulates environments, not single machines, so orchestration is a big deal.

@Faheetah I have a dynamic inventory script for this case.

I've used @theclaymethod's solution and threw it into my Vagrantfile. It doesn't look that hacky to me; it stays within the scope of Ruby programming in Vagrantfiles. Just to show another example of how this can be done:

```ruby
...

N = 3
(1..N).each do |machine_id|

    config.vm.define "#{HOSTNAME}#{machine_id}" do |host|
      host.vm.box = 'centos/7'
      host.vm.network 'private_network', type: 'dhcp'
      host.vm.hostname = "#{HOSTNAME}#{machine_id}.vagrant"
      if machine_id == N
        host.vm.provision 'bootstrap', type: 'ansible', run: 'once' do |ansible|
          ansible.compatibility_mode = '2.0'
          ansible.limit = 'all'
          ansible.playbook = 'ansible/bootstrap.yml'
        end
        host.vm.provision 'common', type: 'ansible', run: 'once' do |ansible|
          ansible.compatibility_mode = '2.0'
          ansible.limit = 'all'
          ansible.playbook = 'ansible/common.yml'
        end
        host.vm.provision 'site', type: 'ansible', run: 'always' do |ansible|
          ansible.compatibility_mode = '2.0'
          ansible.limit = 'all'
          ansible.playbook = 'ansible/site.yml'
        end
      end
    end

  ...

  end
```

I'll leave my solution here, hoping it helps someone else too. :)

Based on @moenka's example, I looked for a way to make the build process more dynamic: you just define your boxes in an array, which is iterated over during the build. The final box (index size - 1) is the trigger for the Ansible provisioning.

Additional props go out to @geerlingguy for his great book "Ansible for DevOps"!

```ruby
# Vagrantfile

boxes = [
    {
        :name     => "vm1",
        :hostname => "vm1.example.local",
        :ports    => [80, 443],
        :memory   => 1024,
        :cpus     => 1
    },
    {
        :name     => "vm2",
        :hostname => "vm2.example.local",
        :ports    => [80, 443],
        :memory   => 2048,
        :cpus     => 1
    },
]

VM_BASE_IP_NET = "192.168.10."
vm_base_ip  = 10
vm_base_port = 8000

Vagrant.configure("2") do |config|

  # Base Vagrant VM configuration
  config.vm.box = "debian/stretch64"
  config.ssh.insert_key = false
  config.vm.synced_folder ".", "/vagrant", disabled: true
  config.vm.provider :virtualbox do |v|
    v.linked_clone = true
  end

  # Configure all VMs
  boxes.each_with_index do |box, index|
    config.vm.define box[:name] do |box_config|
      box_config.vm.hostname = box[:hostname]
      box_config.vm.network "private_network",
                            ip: VM_BASE_IP_NET + (vm_base_ip += 1).to_s
      box[:ports].each do |port|
        box_config.vm.network "forwarded_port",
                              guest: port,
                              host: vm_base_port += 1,
                              autocorrect: true
      end
      box_config.vm.provider "virtualbox" do |v|
        v.memory = box[:memory]
        v.cpus = box[:cpus]
      end

      # only start ansible provision after the last box
      if index == boxes.size - 1
        # PROVISIONING WITH ANSIBLE
        # ------------------------------------------------------------------------
        box_config.vm.provision "ansible" do |ansible|
          ansible.inventory_path = "provision/hosts.yml"
          ansible.limit = "development"
          ansible.playbook = "provision/site.yml"
          ansible.raw_arguments = ["--private-key=~/.vagrant.d/insecure_private_key"]
        end
      end
    end
  end
end
```

```yaml
# provision/hosts.yml

all:
  hosts:
  children:
    development:
      hosts:
        vm1.example.local:
        vm2.example.local:
    staging:
      hosts:
        staging1.example.com:
        staging2.example.com:
    production:
      hosts:
        production1.example.com:
        production2.example.com:
```

```yaml
# provision/site.yml

- hosts: all
  roles:
    - common
    - ntp
  become: yes

- hosts: vm1.example.local
  roles:
    - nginx
  become: yes

- hosts: vm2.example.local
  roles:
    - apache2
  become: yes
```

I'm going to lock this issue because it has been closed for _30 days_ ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
