wait_for
v2.2
Hi,
I have the below as part of my playbook to upgrade all system packages, reboot the machine, and wait for it to come back. The Ansible playbook exits when the machine reboots instead of waiting for the host to come back online and running the remaining tasks. Can you please suggest a fix?
- name: reboot the system when package is upgraded
  command: /sbin/shutdown -r now "Ansible system package upgraded"
  when: latest_state.changed
  tags: upgrade_packages_all
- name: waiting for server to come back
  local_action: wait_for host={{ ansible_default_ipv4.address }} port=22 state=started delay=30 timeout=60
  sudo: false
  tags: upgrade_packages_all
TASK [vmsetup : reboot the system when package is upgraded] ********************
fatal: [96.119.246.13]: FAILED! => {"changed": false, "failed": true, "module_stderr": "", "module_stdout": "PolicyKit daemon disconnected from the bus.\r\nWe are no longer a registered authentication agent.\r\n", "msg": "MODULE FAILURE", "parsed": false}
The reboot works, but the playbook is unusable because it loses its connection, as shown in the error above.
#tail -f /var/log/messages
Feb 10 16:25:30 nrpe[872]: Daemon shutdown
Connection to xx.xxx.xxx.xx closed by remote host.
Connection to xx.xxx.xxx.xx closed.
Let me know if any further details are required.
Thanks,
Govind
add && sleep 1
shell: /sbin/shutdown -r now "Ansible system package upgraded" && sleep 1
as a workaround to avoid the connection shutting before Ansible can 'reap' the temp files and close the connection.
Just as info, this is documented at https://support.ansible.com/hc/en-us/articles/201958037-Reboot-a-server-and-wait-for-it-to-come-back although not maintained in this repos docs, and does not show up at docs.ansible.com
@bcoca Added as you said but still ran into same error. I have to use ignore_errors: true to skip that error.
Error:
fatal: [96.119.246.13]: FAILED! => {"changed": false, "failed": true, "module_stderr": "", "module_stdout": "PolicyKit daemon disconnected from the bus.\r\nWe are no longer a registered authentication agent.\r\n", "msg": "MODULE FAILURE", "parsed": false}
I had the same problem with 2.0.0.2; this workaround helped me:
- name: Wait for server come back
  wait_for: >
    host="{{ inventory_hostname }}"
    port=22
    delay=15
    timeout=60
  delegate_to: localhost
@gvenka008c:
You may want to try:
shell: sleep 2 && /sbin/shutdown -r now
@andyhky it worked!! :) thanks! Final solution in Ansible 2.1 that works is as follows
- name: Restart server
  become: yes
  shell: sleep 2 && /sbin/shutdown -r now "Ansible system package upgraded"
- name: waiting 30 secs for server to come back
  local_action: wait_for host={{ ansible_default_ipv4.address }} port=22 state=started delay=30 timeout=60
  become: false
@sayantandas does the solution still work for you? I am using ansible 2.1.1.0 and get the following:
UNREACHABLE! => {"changed": false, "msg": "SSH Error: data could not be sent to the remote host. Make sure this host can be reached over ssh", "unreachable": true}
I found this answer that solved the problem for me: http://stackoverflow.com/a/39174307
- name: Restart server
  become: yes
  shell: sleep 2 && /sbin/shutdown -r now "Ansible system package upgraded"
  async: 1
  poll: 0
The local_action following the shell reboot always skips for me. A peek at the -vvv output only indicates that it was skipped because of a conditional. Anyone else experiencing this? I can open a new ticket if it's seemingly unrelated.
I can confirm this broke completely on 2.1 in our install. We had it working on 1.9 in the "1.9" way, upgraded Ansible to 2.1, modified the task to the "2.1" way, and it breaks every time.
This solution kinda works with my install. However, the local_action waits the whole timeout every time: my host can restart in ~30 seconds, but if I set the wait_for timeout to 3600, Ansible will wait one hour before proceeding with the playbook... As some reboots may take longer than others (updates), I really need a high timeout, but I can't afford to waste 15 minutes waiting for my hosts to come back (this happens 5 times in my main playbook :( )
After a bit of trial and error with various solutions posted for various versions, the following is working for me on 2.1.2 with an Ubuntu 16.04 guest VM and OS X host using Vagrant (1.8.6) and VirtualBox (5.1.8).
- name: "Reboot if required"
  shell: sleep 2 && shutdown -r now 'Reboot required' removes=/var/run/reboot-required
  become: true
  async: 1
  poll: 0
  ignore_errors: true
- name: "Wait for reboot"
  local_action: wait_for host={{ ansible_default_ipv4.address }} port=22 delay=10 state=started
  become: false
@Furiml: Not sure if this applies to what you're trying to do, but this second task waits 10 seconds (delay=10) and then polls port 22 on the guest machine until it is open, so it continues as soon as the port responds, i.e. it won't take the full allocated timeout value.
An update of the docs and/or the support article to use the preferred full YAML format for tasks would also be nice. This works for me:
- name: reboot nodes
  shell: sleep 2 && shutdown -r now "Ansible reboot"
  async: 1
  poll: 0
  ignore_errors: true
- name: wait for server to come back
  local_action: wait_for
  args:
    host: "{{ inventory_hostname }}"
    port: 22
    state: started
    delay: 30
    timeout: 300
I wrote something else to test this. Instead of waiting for a host to be up, I want to wait for it to be down.
- name: "Wait for the machine to be down"
  local_action: wait_for
  args:
    host={{target}}
    port=22
    state=stopped
    delay=1
    timeout=3600
  become: false
If I understood correctly, this will poll port 22 of my target every second and will only continue once it is closed. I shut down the machine myself, but Ansible has been stuck for 5 minutes now :(
@martineg that works great! It's now included in the Galaxy role jmcvetta.debian-upgrade-reboot.
On ansible 2.2 this does not reboot my computer. It simply says that job is started, and then waits for 22 port. But node does not reboot!
I have the same issue as @sashgorokhov on Ubuntu 16.04/ansible 2.2.1.0.
Just says "OK" and doesn't reboot.
ok: [IP] => {
    "ansible_job_id": "575686775528.32762",
    "changed": false,
    "finished": 0,
    "results_file": "/root/.ansible_async/575686775528.32762",
    "started": 1
}
@NoahO maybe this tiny code snippet could help you:
tasks:
  - shell: shutdown -r now
This simply reboots the node without waiting for it (in my case I really dont need to wait for it to reboot)
@sashgorokhov unfortunately I need it to reboot, Worked around it with at command, but wastes 1 minute before taking any action, so I'd prefer to have this working.
Trying to get this working on Centos 7.3 servers from F25 workstation with Ansible 2.2.1, but doesn't seems to be working. Any workaround?
At this point I seriously consider to create a separate role for the task I need to execute after reboot and call the 2 roles from a shell script with a long enough sleep in between or if I want to be fancy I can add an ssh-keyscan as well to make sure the server is up.. But would rather rely on Ansible, you know, as it is a real automation tool ;)
EDIT:
OK, I was an utter idiot, and it's close to midnight here, I'm not afraid to admit it... WRONG INVENTORY FFS. Works! Sorry...
Having the same problem. Any updates on this matter? Looks like many folks are facing this issue
This is still a problem (Mac version 2.3.0.0); the target is a Fedora instance in AWS. None of the above workarounds worked for me (they wouldn't error, but also didn't reboot it), so I did the following (where delayed_reboot is just a shell script that sleeps and then reboots):
- copy:
    src: files/delayed_reboot
    dest: /tmp/delayed_reboot
    owner: root
    group: root
    mode: 0700
- name: Restart machine
  shell: nohup /tmp/delayed_reboot &
  async: 1
  poll: 0
  ignore_errors: true
  become: true
  become_method: sudo
  when: new_kernel.changed or new_kernel_headers.changed
- name: Wait for machine to restart
  local_action:
    module: wait_for
      host={{ inventory_hostname }}
      port=22
      delay=20
      timeout=300
      state=started
  become: false
  when: new_kernel.changed or new_kernel_headers.changed
Shared connection
ansible 2.3.0.0
config file = /Users/ebalduf/PD-git/LabOnDemand/ansible.cfg
configured module search path = Default w/o overrides
python version = 2.7.13 (default, Dec 18 2016, 07:03:39) [GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.42.1)]
grep '^[^#]' ansible.cfg
[defaults]
host_key_checking = False
timeout = 15
[privilege_escalation]
[paramiko_connection]
[ssh_connection]
control_path = %(directory)s/%%h-%%r
[persistent_connection]
[accelerate]
[selinux]
[colors]
[diff]
Ansible host: macOS Sierra 10.12.4
target: Fedora 25 instance in AWS.
- name: install python and deps for ansible modules
  raw: dnf install -y python2 python2-dnf libselinux-python
- name: gather facts
  setup:
- name: Install new Kernel
  dnf:
    name: https://kojipkgs.fedoraproject.org//packages/kernel/4.9.13/201.fc25/x86_64/kernel-core-4.9.13-201.fc25.x86_64.rpm
  register: new_kernel
- name: Install new Kernel headers
  dnf:
    name: https://kojipkgs.fedoraproject.org//packages/kernel/4.9.13/201.fc25/x86_64/kernel-headers-4.9.13-201.fc25.x86_64.rpm
  register: new_kernel_headers
- name: Restart machine
  command: reboot
  async: 1
  poll: 0
  ignore_errors: true
  become: true
  become_method: sudo
  when: new_kernel.changed or new_kernel_headers.changed
- name: Wait for machine to restart
  local_action:
    module: wait_for
      host={{ inventory_hostname }}
      port=22
      delay=20
      timeout=300
      state=started
  become: false
  when: new_kernel.changed or new_kernel_headers.changed
The target should reboot properly and ansible continue the playbook.
See output below with -vvv
Using module file /usr/local/lib/python2.7/site-packages/ansible/modules/commands/command.py
<34.209.10.206> ESTABLISH SSH CONNECTION FOR USER: fedora
<34.209.10.206> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=fedora -o ConnectTimeout=15 -o ControlPath=/Users/ebalduf/.ansible/cp/%h-%r 34.209.10.206 '/bin/sh -c '"'"'echo ~ && sleep 0'"'"''
<34.209.10.206> (0, '/home/fedora\n', '')
<34.209.10.206> ESTABLISH SSH CONNECTION FOR USER: fedora
<34.209.10.206> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=fedora -o ConnectTimeout=15 -o ControlPath=/Users/ebalduf/.ansible/cp/%h-%r 34.209.10.206 '/bin/sh -c '"'"'( umask 77 && mkdir -p "` echo /home/fedora/.ansible/tmp/ansible-tmp-1493487050.48-176600574616672 `" && echo ansible-tmp-1493487050.48-176600574616672="` echo /home/fedora/.ansible/tmp/ansible-tmp-1493487050.48-176600574616672 `" ) && sleep 0'"'"''
<34.209.10.206> (0, 'ansible-tmp-1493487050.48-176600574616672=/home/fedora/.ansible/tmp/ansible-tmp-1493487050.48-176600574616672\n', '')
<34.209.10.206> PUT /var/folders/sd/5jlrqcms5qg3bjc0g5mp5r1r0000gn/T/tmpeV4QiT TO /home/fedora/.ansible/tmp/ansible-tmp-1493487050.48-176600574616672/command.py
<34.209.10.206> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=fedora -o ConnectTimeout=15 -o ControlPath=/Users/ebalduf/.ansible/cp/%h-%r '[34.209.10.206]'
<34.209.10.206> (0, 'sftp> put /var/folders/sd/5jlrqcms5qg3bjc0g5mp5r1r0000gn/T/tmpeV4QiT /home/fedora/.ansible/tmp/ansible-tmp-1493487050.48-176600574616672/command.py\n', '')
<34.209.10.206> ESTABLISH SSH CONNECTION FOR USER: fedora
<34.209.10.206> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=fedora -o ConnectTimeout=15 -o ControlPath=/Users/ebalduf/.ansible/cp/%h-%r 34.209.10.206 '/bin/sh -c '"'"'chmod u+x /home/fedora/.ansible/tmp/ansible-tmp-1493487050.48-176600574616672/ /home/fedora/.ansible/tmp/ansible-tmp-1493487050.48-176600574616672/command.py && sleep 0'"'"''
<34.209.10.206> (0, '', '')
<34.209.10.206> ESTABLISH SSH CONNECTION FOR USER: fedora
<34.209.10.206> SSH: EXEC ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=fedora -o ConnectTimeout=15 -o ControlPath=/Users/ebalduf/.ansible/cp/%h-%r -tt 34.209.10.206 '/bin/sh -c '"'"'sudo -H -S -n -u root /bin/sh -c '"'"'"'"'"'"'"'"'echo BECOME-SUCCESS-jplodcrkimvnywjebybiuhwijxipglmt; /usr/bin/python /home/fedora/.ansible/tmp/ansible-tmp-1493487050.48-176600574616672/command.py; rm -rf "/home/fedora/.ansible/tmp/ansible-tmp-1493487050.48-176600574616672/" > /dev/null 2>&1'"'"'"'"'"'"'"'"' && sleep 0'"'"''
<34.209.10.206> (255, '', 'Shared connection to 34.209.10.206 closed.\r\n')
fatal: [34.209.10.206]: UNREACHABLE! => {
"changed": false,
"unreachable": true
}
MSG:
Failed to connect to the host via ssh: Shared connection to 34.209.10.206 closed.
Thanks @ebalduf for putting all this together! I can also confirm that I am also facing identical issue with ansible 2.2.1 with MacOS/CentOS as ansible hosts and CentOS7 as target host. It would be nice if this bug can be prioritized!
The below code works for me
Ansible version - 2.3
Server - Ubuntu 16.04.2 LTS
Target system - RHEL 7.3
- name: restart server
  become: yes
  shell: sleep 2 && /sbin/shutdown -r now "RedHat system package upgraded"
  async: 1
  poll: 0
- name: waiting 60 secs for server to come back
  become: false
  local_action: wait_for host={{ ansible_default_ipv4.address }} port=22 state=started delay=60 timeout=120
The solution provided by @sayantandas also works for us.
Ansible version: 2.3.0.0
Server version: CentOS Linux release 7.3.1611
Target system: CentOS Linux release 7.3.1611
The solution provided by @sayantandas works for me too
Ansible version: 2.3.0.0
Server version: RHEL 7.3
Target system: RHEL 7.3
Thank you
Ansible 2.3, CentOS 7. Below is what I went with after doing something that updates the kernel; it skips the 'wait for host to boot' step if the host isn't rebooting.
- name: Check for reboot hint.
  shell: LAST_KERNEL=$(rpm -q --last kernel | perl -pe 's/^kernel-(\S+).*/$1/' | head -1);CURRENT_KERNEL=$(uname -r); if [ $LAST_KERNEL != $CURRENT_KERNEL ]; then echo 'reboot'; else echo 'no'; fi
  ignore_errors: true
  register: reboot_hint
- name: Rebooting ...
  shell: sleep 2 && /usr/sbin/reboot
  async: 1
  poll: 0
  ignore_errors: true
  when: reboot_hint.stdout.find("reboot") != -1
- name: Wait for host to boot
  become: false
  local_action: wait_for
  args:
    host: "{{ inventory_hostname }}"
    port: 22
    state: started
    delay: 30
    timeout: 180
  when: reboot_hint.stdout.find("reboot") != -1
Unable to reboot properly, even with sanity checks.
Ansible 2.2.2.0
Example playbook for Ubuntu 16.04 LTS
---
- name: Refresh apt cache
  apt:
    update_cache: yes
- name: Update all packages
  apt:
    upgrade: dist
- name: Rebooting server
  shell: >
    sleep 2 &&
    /sbin/shutdown -r now "Ansible system package upgraded"
  async: 1
  poll: 0
  ignore_errors: true
- name: Wait for host to boot
  become: false
  local_action: wait_for
  args:
    host: "{{ inventory_hostname }}"
    port: 22
    state: started
    delay: 30
    timeout: 200
- name: Sanity check
  shell: ps -ef | grep sshd | grep `whoami` | awk '{print \"kill -9\", $2}' | sh
  async: 1
  poll: 0
  ignore_errors: true
- name: Remove useless packages from the cache
  apt:
    autoclean: yes
- name: Remove dependencies that are no longer required
  apt:
    autoremove: yes
Result
TASK [apt-refresh : Remove useless packages from the cache] ********************
fatal: [xxxx]: FAILED! => {"changed": false, "failed": true, "module_stderr": "OpenSSH_7.2p2 Ubuntu-4ubuntu2.2, OpenSSL 1.0.2g 1 Mar 2016\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug1: /etc/ssh/ssh_config line 19: Applying options for *\r\ndebug1: auto-mux: Trying existing master\r\ndebug1: mux_client_request_session: master session id: 2\r\nShared connection to xxxx closed.\r\n", "module_stdout": "Traceback (most recent call last):\r\n File \"/tmp/ansible_MJ_gDg/ansible_module_apt.py\", line 903, in <module>\r\n main()\r\n File \"/tmp/ansible_MJ_gDg/ansible_module_apt.py\", line 855, in main\r\n for package in packages:\r\nTypeError: 'NoneType' object is not iterable\r\n", "msg": "MODULE FAILURE"}
ansible.cfg
[defaults]
inventory = hosts
host_key_checking = False
remote_user = ubuntu
private_key_file = id_rsa
retry_files_enabled = False
[ssh_connection]
ssh_args = -C -o ControlMaster=auto -o ControlPersist=60s
control_path = /tmp/%%h-%%p-%%r
I was having similar problems. I added a 'pause' task of 30 seconds in between the shell: shutdown now -r task and the wait_for task. Now things work consistently. I also have these as handlers with listen, so they only run when needed.
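A minimal sketch of the approach described above, assuming Ansible 2.2 or later (for listen) and a direct SSH connection on port 22. The handler names, the file name, and the timeout values are illustrative assumptions, not the commenter's actual code.

```yaml
# handlers/main.yml (hypothetical file name)
# All three handlers listen on the same topic, so a single
# "notify: reboot machine" runs them in the order defined here.
- name: Reboot the host in the background
  shell: sleep 2 && shutdown -r now "Ansible reboot"
  async: 1
  poll: 0
  become: true
  listen: reboot machine

# Fixed pause so the SSH port has time to close before polling starts.
- name: Give the host time to go down
  pause:
    seconds: 30
  listen: reboot machine

# Runs on the control machine and polls the target's SSH port.
- name: Wait for SSH to come back
  wait_for:
    host: "{{ ansible_host | default(inventory_hostname) }}"
    port: 22
    state: started
    timeout: 300
  delegate_to: localhost
  become: false
  listen: reboot machine
```

A task that changes something requiring a reboot would then carry notify: reboot machine, so the handlers only fire when that task reports a change.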
I had similar issues when trying to reboot our Ubuntu 16.04 hosts with ansible configured to use python3. As soon as I installed python 2.7 on the Ubuntu 16.04 hosts (apt-get install python-minimal) and configured ansible to use it on the remote system, the reboot worked fine.
Whenever I used "async" in my task, exactly nothing happened when ansible used python3, not even very basic things like "echo test > /tmp/testfile".
Addition: I'm using ansible 2.3.1.0 installed via deb package from http://ppa.launchpad.net/ansible/ansible/ubuntu
So there is a wait_for_connection action plugin that will wait until the system becomes available again and validates this by doing an end-to-end test. This is more reliable than testing if a port is responding again.
We are working on a new reboot action plugin that will perform a reboot, will wait for the connection to start working again and finally checks if the system was actually rebooted.
Target system - CentOS 7.4
- name: restart server
  become: yes
  shell: sleep 2 && /sbin/shutdown -r now "System reboot"
  #command: /usr/bin/systemd-run --on-active=10 /usr/bin/systemctl reboot
  async: 1
  poll: 0
- name: waiting 10 secs for server to come back
  become: false
  local_action: wait_for host={{ ansible_default_ipv4.address }} port={{ ansible_port }} state=started delay=10 timeout=120
@peterwillcn IMO You better use wait_for_connection instead of wait_for, see: http://docs.ansible.com/ansible/latest/wait_for_connection_module.html
It's not just easier, it also works over jumphosts or proxies, using the exact same transport Ansible uses for the target node.
@dagwieers how is the reboot action plugin coming along ?
Found a couple roles out there to take care of this:
@afeld that role looks great.
So my issue with all the examples I have seen is that it all relies on some random wait time before starting the poll to see if ssh port is available. Now given different hosts require differing amounts of time to shutdown processes depending on what it’s doing - it either means you have to set a long delay to catch the worst offender or you risk false positives.
The new wait_for_connection just uses ping and another random delay factor (see above). So again huge risk of false positives (confirmed by redhat support).
The way I have made this slightly more robust is using 2 tasks - first one waits for ssh port to be absent - this starts immediately and has maximum wait of 15 mins, polls every second - this should be plenty of time for server processes to shutdown and means that you should only have to wait for regular os services to stop.
Once ssh is no longer running, task 2 starts: it waits for the ssh port to reach state started, after a 1 minute delay.
Note the wait_for port doesn’t rely on ssh it uses a python socket to determine if port is up
Andy
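The two-task approach described above could be sketched roughly as follows. This is an illustration, not the commenter's actual playbook: the task names, the host expression, and the timeout for the second wait are assumptions, and the sleep option of wait_for assumes Ansible 2.3 or later. As noted later in the thread, it only suits hosts reachable directly from the control machine.

```yaml
- name: Reboot in the background so the task can return
  shell: sleep 2 && /sbin/shutdown -r now "Ansible reboot"
  async: 1
  poll: 0
  become: true

# Phase 1: start polling immediately, once per second, until the
# SSH port is closed. Up to 15 minutes for slow shutdowns.
- name: Wait for the SSH port to close
  wait_for:
    host: "{{ ansible_host | default(inventory_hostname) }}"
    port: 22
    state: stopped
    sleep: 1
    timeout: 900
  delegate_to: localhost
  become: false

# Phase 2: only reached after the port was seen closed, so an open
# port now means the host really went down and came back.
- name: Wait for the SSH port to open again
  wait_for:
    host: "{{ ansible_host | default(inventory_hostname) }}"
    port: 22
    state: started
    delay: 60
    timeout: 600
  delegate_to: localhost
  become: false
```

Because the first wait has to observe the port closing, a host that is slow to stop its services cannot be mistaken for one that has already rebooted.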
@akcrisp And the problem with your implementation is that it fails for anything but the simple direct-connection use-case. The wait_for_connection module used to do this as well, but it fails for ssh_proxy, or other proxied transport connections, so we had to remove it.
You can make the delay-time configurable per system/group or other characteristics, but that's not ideal.
Agreed, but it's not clear how you fix this otherwise. A non-deterministic, finger-in-the-air random delay?
Isn't it possible to add an ssh check to wait_for_connection?
Plus, add a grace_timeout kind of option, like in AWS Auto Scaling groups, to wait some more time after the "connection" is established.
this worked well for me:
ansible: 2.4.1.0
Ubuntu: 16.04.3 LTS
Linux 4.4.0-98-generic
- name: reboot server
  become: yes
  shell: sleep 2 && /sbin/shutdown -r now "System reboot"
  async: 1
  poll: 0
- name: Wait for restart
  local_action: wait_for port=22 host="{{ ansible_ssh_host | default(inventory_hostname) }}" search_regex=OpenSSH delay=60
- name: continue running script after reboot
  shell: 'sh /home/ubuntu/my_script.sh'
This worked for me on Ansible 2.4.2.0 and Ubuntu 16.04 LTS on Azure
- hosts: all
  become: yes
  become_user: root
  pre_tasks:
    - name: Patching for Spectre and Meltdown followed by a reboot
      become: yes
      shell: nohup bash -c 'sleep 2 && apt -y update && apt -y upgrade && apt -y autoremove && reboot "System reboot"' &
      async: 1
      poll: 0
    - name: Wait for 3 minutes for server to come online
      become: false
      local_action: wait_for port=22 host={{ ansible_ssh_host | default(inventory_hostname) }} search_regex='OpenSSH' delay=180 timeout=300
I guess my use case was much more complex. Here's mine written as a handler:
- name: Inform of reboot required
  listen: reboot machine
  debug:
    msg: "System {{ inventory_hostname }} needs to be rebooted for changes to take effect"
- name: Update GRUB to pick up changes to default config, if any
  command: update-grub2
  listen: reboot machine
# Send the reboot command and let it run in the background
# so we can disconnect...
- name: Send reboot command
  listen: reboot machine
  shell: '(sleep 5; shutdown -r now) &'
- name: Clear host errors
  listen: reboot machine
  meta: clear_host_errors
  failed_when: false
- name: Reset connection
  listen: reboot machine
  meta: reset_connection
  failed_when: false
- name: Wait for SSH to be available
  listen: reboot machine
  local_action: wait_for
  args:
    host: "{{ ansible_host }}"
    port: "{{ ansible_port | default('22') }}"
    delay: 60
    state: started
- name: Ansible ping
  listen: reboot machine
  local_action: ping
  register: result
  until: result.ping is defined and result.ping == 'pong'
  retries: 30
  delay: 10
- name: Run uptime
  listen: reboot machine
  command: uptime
# LACP and spanning-tree take a bit of time to start working
- name: Ping default gateway
  listen: reboot machine
  command: "ping -c 1 {{ ansible_default_ipv4.gateway }}"
  register: result
  until: result.rc == 0
  retries: 30
  delay: 10
Here's my solution (Ansible 2.4.2):
- name: restart machine
  shell: nohup sh -c '(sleep 5; shutdown -r now "Ansible restart") &' &>/dev/null
  become: yes
- name: wait for machine to restart
  wait_for_connection:
    delay: 60
    sleep: 5
    timeout: 300
this worked for me:
- name: restart the system
  shell: "sleep 5 & reboot"
  async: 1
  poll: 0
- name: wait for the system to reboot
  wait_for_connection:
    connect_timeout: 20
    sleep: 5
    delay: 5
    timeout: 60
All these workarounds are interesting, but the real fix will be this, right?
"We are working on a new reboot action plugin that will perform a reboot, will wait for the connection to start working again and finally checks if the system was actually rebooted."
(from https://github.com/ansible/ansible/issues/14413#issuecomment-330523110)
Confirmed.
looking forward to it
I am interested to know whether any reboot module will support Unix flavours beyond Linux, i.e. AIX, Solaris, etc. I assume it works with Windows?
The point I made with my example, which most seem to have missed, is that simply waiting with a timeout for port 22 can give a false positive: if a host takes longer than the delay to shut down its processes (think of a large database), it may not have actually rebooted yet when the check runs. I have tested and proved this can happen, hence my test to ensure ssh is absent first.
Andy
@akcrisp That is the intention. The discussion was linked here before: https://github.com/ansible/ansible/issues/16186
---
- hosts: all
- name: restart the system
shell: "sleep 5 & reboot"
async: 1
poll: 0
- name: wait for the system to reboot
wait_for_connection:
connect_timeout: 20
sleep: 5
delay: 5
timeout: 60
ansible-playbook test.yaml
ERROR! Syntax Error while loading YAML.
mapping values are not allowed in this context
The error appears to have been in '/etc/ansible/test.yaml': line 4, column 10, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
help please ;-)
---
- hosts: all
- name: restart the system
shell: "sleep 5 & reboot"
async: 1
poll: 0
- name: wait for the system to reboot
wait_for_connection:
connect_timeout: 20
sleep: 5
delay: 5
timeout: 60
Try this.
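For readers following along, a correctly indented play of this shape might look like the sketch below. The indentation of the snippet above did not survive in this thread, so the tasks: key and the nesting shown here are assumptions about what a valid layout looks like, not a copy of the corrected file.

```yaml
---
- hosts: all
  tasks:
    - name: restart the system
      # Note: "sleep 5 & reboot" puts the sleep in the background and
      # reboots at once; "sleep 5 && reboot" would wait 5 seconds first.
      shell: "sleep 5 & reboot"
      async: 1
      poll: 0

    - name: wait for the system to reboot
      wait_for_connection:
        connect_timeout: 20
        sleep: 5
        delay: 5
        timeout: 60
```

Tasks have to sit under a tasks: (or pre_tasks: / handlers:) key of the play, and module arguments are indented one level below the module name. Placing them at the wrong level is a common cause of the "mapping values are not allowed in this context" error.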
Grief - I wish people would read what I have done. Everyone who just waits for a timeout risks a false positive. It only takes an app a long time to shut down, and the check will think ssh is up after the reboot. I have tested it.
You are far better off checking that ssh is absent first. That check doesn't rely on ssh; it uses a Python socket connection.
You're right - my comment wasn't an endorsement of the design, I just wanted to demonstrate the correct formatting.
This related bug seems to be a cause of the async reboot task failing to run as @pyroxde noted earlier
So we now have a reboot and win_reboot action plugin to reboot Unix and Windows servers. If you have any issues with the existing implementation, feel free to open a new issue with any specifics.
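As a rough sketch, the original poster's two tasks could collapse into a single task with the reboot module, assuming Ansible 2.7 or later. The timeout value is an illustrative assumption; latest_state is the registered variable from the first post.

```yaml
- name: Reboot the system when packages were upgraded
  reboot:
    msg: "Ansible system package upgraded"
    # Maximum time to wait for the host to reboot and respond again.
    reboot_timeout: 600
  become: true
  when: latest_state.changed
  tags: upgrade_packages_all
```

No async/poll settings or separate wait_for task are needed here, since the module handles the disconnect and waits for the host to respond again.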