When I run a job a second time against a set of hosts I've just rebuilt with Terraform, it fails because the host keys have changed, and SSH reports a possible spoofing attack.
As stated in issue #387, host keys are ignored, so the job execution should not fail for this reason.
At the second execution, it fails with this output:
fatal: [node1.test]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\r\n@ WARNING: POSSIBLE DNS SPOOFING DETECTED! @\r\n@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\r\nThe ECDSA host key for node1.test.somedomain.com has changed,\r\nand the key for the corresponding IP address xxx.xxx.xxx.xxx \r\nis unknown. This could either mean that\r\nDNS SPOOFING is happening or the IP address for the host\r\nand its host key have changed at the same time.\r\n@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\r\n@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @\r\n@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\r\nIT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!\r\nSomeone could be eavesdropping on you right now (man-in-the-middle attack)!\r\nIt is also possible that a host key has just been changed.\r\nThe fingerprint for the ECDSA key sent by the …
I'm also facing this. My workaround:
docker exec -i -t awx_task bash
sed '/^node1.test/d' -i /root/.ssh/known_hosts
We hit the same thing. We fixed it with something like this in a bash script:
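echo "remove fqdn ($fqdn) and ip ($ip) from known hosts"
# drop the stale entries (variables quoted so they are not word-split)
sed -i "/^${fqdn}/d" ~/.ssh/known_hosts
sed -i "/^${ip}/d" ~/.ssh/known_hosts
# re-learn the current host keys
ssh-keyscan "$fqdn" >> ~/.ssh/known_hosts
ssh-keyscan "$ip" >> ~/.ssh/known_hosts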
It'd be nice if there was a way to invoke this locally against an inventory.
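One caveat with the sed patterns above: `/^node1.test/` is a prefix regex, so it also deletes entries for names such as node1.test2, and each dot matches any character. Below is a minimal sketch of an exact-match alternative. `forget_host` is a hypothetical name, and the sketch assumes unhashed known_hosts entries without `[host]:port` forms. Where OpenSSH is available, `ssh-keygen -R <host>` does the same job and also handles hashed entries.

```shell
# Hypothetical helper: remove every known_hosts line that lists exactly this host.
# $1 = path to known_hosts, $2 = hostname or IP to drop
forget_host() {
    fh_file=$1
    fh_host=$2
    fh_tmp=$(mktemp) || return 1
    # The first field is a comma-separated list of names; compare each one literally.
    awk -v h="$fh_host" '
        {
            n = split($1, names, ",")
            for (i = 1; i <= n; i++) if (names[i] == h) next
            print
        }' "$fh_file" > "$fh_tmp" && cat "$fh_tmp" > "$fh_file"
    fh_status=$?
    rm -f "$fh_tmp"
    return $fh_status
}
```

Usage would look like `forget_host ~/.ssh/known_hosts "$fqdn"`. The result is written back with `cat` rather than `mv` so the original file keeps its inode and permissions.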
Giant pain in Ansible Tower. Anytime an IP is recycled, we have to manually clear it from known_hosts.
This is my +1 to hopefully help get the pull request merged here.
Maybe I'm missing something... I don't see a PR?
Is there one from ryanpetrello, 452? I could be totally misreading GitHub here: https://github.com/matburt/awx/commit/4510cd11dbd65eb86cd3f5235168e76c42506360
That is an unrelated PR, tagged into this issue by GitHub's naive matching of the number "452".
Dang, alright, well, my moral support is provided! A little more background: I (we) use OpenStack, and when we terminate a server and then recycle the IP, this issue hits. How often it hits depends on what we're doing. We try to create Ansible playbooks for all new server components (expand disks, install certs, etc.) that run nightly to make sure things are up to date.
Also experiencing this in a VMware private cloud. Does Tower not take into account the project's ansible.cfg? I ask because I have the following in it:
[ssh_connection]
ssh_args = -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no
I was under the impression that StrictHostKeyChecking=no means it doesn't matter what the host key is, so IPs can be recycled?
I think it should be combined with another option:
-o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no
As a workaround, I mount /dev/null like this in my docker-compose for the awx_task container:
volumes:
- /dev/null:/root/.ssh/known_hosts
ok will try adding -o UserKnownHostsFile=/dev/null as well @notuscloud
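Some context on why the second option matters: with StrictHostKeyChecking=no alone, ssh still compares the server's key against the stored one. On a mismatch it lets the connection proceed but disables password and keyboard-interactive authentication. Pointing UserKnownHostsFile at /dev/null means there is no stored key to mismatch. Here is a minimal sketch for a command-line run, assuming the ANSIBLE_SSH_ARGS environment variable is honoured in your setup. The helper name is hypothetical.

```shell
# Hypothetical helper: append the two host-key options to a base ssh argument string.
with_host_key_checks_off() {
    printf '%s -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no' "$1"
}

# Same base arguments as the ansible.cfg quoted in this thread.
ANSIBLE_SSH_ARGS=$(with_host_key_checks_off '-C -o ControlMaster=auto -o ControlPersist=60s')
export ANSIBLE_SSH_ARGS
```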
I face the same issue. I have the following in ansible.cfg on the awx_task container:
host_key_checking = False
It correctly translates into the SSH connection parameters that Ansible uses to connect to the target host, and yet I still get the host key changed error:
ansible-playbook 2.5.4
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/var/lib/awx/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/site-packages/ansible
executable location = /usr/bin/ansible-playbook
python version = 2.7.5 (default, Apr 11 2018, 07:36:10) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]
Using /etc/ansible/ansible.cfg as config file
SSH password:
PLAYBOOK: vncplaybook.yml ******************************************************
1 plays in vncplaybook.yml
PLAY [vnc playbook] ************************************************************
TASK [Gathering Facts] *********************************************************
task path: /var/lib/awx/projects/_308__foobar_project_3454f7c5_c305_444e_9a4e_c87483ce00461eaf2218_95ca_4438_9748_da07b125ab65/vncplaybook.yml:2
<10.31.127.84> ssh_retry: attempt: 0, ssh return code is 255. cmd (['sshpass', '-d14', 'ssh', '-C', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'StrictHostKeyChecking=no', '-o', 'User=foobar', '-o', 'ConnectTimeout=10', '-o', 'ControlPath=/tmp/awx_462__Pvzgp/cp/af797df922', '10.31.127.84', "/bin/sh -c 'echo ~foobar && sleep 0'"]...), pausing for 0 seconds
<10.31.127.84> ssh_retry: attempt: 1, ssh return code is 255. cmd (['sshpass', '-d14', 'ssh', '-C', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'StrictHostKeyChecking=no', '-o', 'User=foobar', '-o', 'ConnectTimeout=10', '-o', 'ControlPath=/tmp/awx_462__Pvzgp/cp/af797df922', '10.31.127.84', "/bin/sh -c 'echo ~foobar && sleep 0'"]...), pausing for 1 seconds
<10.31.127.84> ssh_retry: attempt: 2, ssh return code is 255. cmd (['sshpass', '-d14', 'ssh', '-C', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'StrictHostKeyChecking=no', '-o', 'User=foobar', '-o', 'ConnectTimeout=10', '-o', 'ControlPath=/tmp/awx_462__Pvzgp/cp/af797df922', '10.31.127.84', "/bin/sh -c 'echo ~foobar && sleep 0'"]...), pausing for 3 seconds
fatal: [10.31.127.84]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\r\n@ WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED! @\r\n@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@\r\nIT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!\r\nSomeone could be eavesdropping on you right now (man-in-the-middle attack)!\r\nIt is also possible that a host key has just been changed.\r\nThe fingerprint for the ECDSA key sent by the remote host is\nSHA256:DuE8GJ41siGMVl+jRHXRXPVYd6PsLkBrxKeno432t/w.\r\nPlease contact your system administrator.\r\nAdd correct host key in /root/.ssh/known_hosts to get rid of this message.\r\nOffending ECDSA key in /root/.ssh/known_hosts:6\r\nPassword authentication is disabled to avoid man-in-the-middle attacks.\r\nKeyboard-interactive authentication is disabled to avoid man-in-the-middle attacks.\r\nPermission denied (publickey,password).\r\n", "unreachable": true}
PLAY RECAP *********************************************************************
10.31.127.84 : ok=0 changed=0 unreachable=1 failed=0
+1 on this issue. AWX doesn't seem to be respecting the ansible.cfg or the environment variable ANSIBLE_HOST_KEY_CHECKING set in the UI.
I worked around this by setting "ANSIBLE_SSH_ARGS": "-C -o ControlMaster=auto -o ControlPersist=60s -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no" in the environment variables in the AWX GUI.
Thanks to @sudomateo, your workaround did the job!
It should at least remove the SSH fingerprint from the known_hosts file when removing the host from the GUI.
@sudomateo's hint worked, thanks.
You probably don't want to disable the host key checking "tower wide".
One approach is to override the ansible.cfg variable in each inventory that contains nodes whose SSH keys may change over time.
---
defaults:
  vars:
    host_key_checking: false
None of the suggestions above worked for me on AWX 9.2 with Ansible 2.9. The ansible.cfg config is not being ignored: running the job in -vvv verbose mode, I could see that StrictHostKeyChecking=no was set, but I still got the SSH host key changed error.
So I had to add the following to the inventory file, and it worked.
[all:vars]
ansible_ssh_common_args='-o StrictHostKeyChecking=no -o userknownhostsfile=/dev/null'
I got this solution from
https://route1.ph/2020/01/14/disable-strict-host-key-checking-in-ansible/
Not the most efficient, but I've created a playbook that I can execute from AWX. It reads through my inventory group and uses the shell module to remove the matching lines from /root/.ssh/known_hosts.
Again, it's not efficient: it runs one command per host, so with a ton of hosts in your inventory group the run time grows linearly.
- hosts: localhost
  become: yes
  become_method: sudo
  tasks:
    - name: Remove DHCP addresses that are present in the known_hosts from the docker container awx_task
      shell: |
        docker exec -it awx_task /bin/bash -c "sed '/^{{ item }}/d' -i /root/.ssh/known_hosts"
      with_inventory_hostnames:
        - awx_host_group
      ignore_errors: yes
      delegate_to: awx_server.your_domain.com
Suggestions and feedback on how this can be improved are welcome!
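One possible improvement, as a rough sketch: instead of one `docker exec` plus `sed` per host, a single pass over known_hosts can drop all hosts at once. `forget_hosts` is a hypothetical helper, not something AWX ships. It assumes unhashed entries and hostnames without spaces, and it would need to be available inside the awx_task container.

```shell
# Hypothetical helper: drop every known_hosts line that lists any of the given hosts.
# $1 = path to known_hosts, remaining arguments = hostnames or IPs to drop
forget_hosts() {
    fh_file=$1
    shift
    fh_tmp=$(mktemp) || return 1
    # Build a lookup set once, then filter the file in a single pass.
    awk -v hosts="$*" '
        BEGIN {
            n = split(hosts, list, " ")
            for (i = 1; i <= n; i++) drop[list[i]] = 1
        }
        {
            m = split($1, names, ",")
            for (j = 1; j <= m; j++) if (names[j] in drop) next
            print
        }' "$fh_file" > "$fh_tmp" && cat "$fh_tmp" > "$fh_file"
    fh_status=$?
    rm -f "$fh_tmp"
    return $fh_status
}
```

Called as `forget_hosts /root/.ssh/known_hosts node1.test node2.test`, this rewrites the file once regardless of how many hosts are passed, and it matches names exactly rather than by prefix.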