Can't connect to a minion via salt-ssh, although plain ssh works just fine. It looks like salt does a reverse IP lookup and then uses the resulting hostname to connect via ssh. The machine is located behind a NAT and the connection is established via port forwarding.
% salt-ssh --user pi --sudo -i minion.example.com test.ping
11-22-33-44.provider.com:
ssh: Could not resolve hostname 11-22-33-44.provider.com: nodename nor servname provided, or not known
The same with IP address:
% salt-ssh --user pi --sudo -i 11.22.33.44 test.ping
11-22-33-44.provider.com:
ssh: Could not resolve hostname 11-22-33-44.provider.com: nodename nor servname provided, or not known
DNS info:
dig +short minion.example.com
11.22.33.44
dig +short -x 11.22.33.44
11.22.33.44.provider.com.
dig +short 11.22.33.44.provider.com.
Note the last command: it doesn't return anything, and that is why the connection fails.
Also, the second command, dig +short -x 11.22.33.44, could return any garbage (it is set up by the hosting provider, not by the owner of example.com). Salt shouldn't trust that data to connect to hosts.
The log below shows that salt-ssh connects to the reverse hostname and not to the one I specified on the command line. I feel this could be bad security-wise.
[TRACE ] Terminal Command: /bin/sh -c 11-22-33-44.provider.com -o KbdInteractiveAuthentication=no -o PasswordAuthentication=no -o GSSAPIAuthentication=no -o ConnectTimeout=65 -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o Port=22 -o IdentityFile=agent-forwarding -o User=pi /bin/sh << 'EOF'
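The reverse-then-forward mismatch described above can be illustrated outside Salt with only the standard library. This is a minimal sketch, not Salt's code: the helper names `reverse_then_forward` and `is_safe_to_substitute` are hypothetical, and the lookups are injectable so the behaviour can be shown without real DNS.

```python
import socket


def reverse_then_forward(ip, reverse=socket.gethostbyaddr,
                         forward=socket.gethostbyname):
    """Return (ptr_name, forward_ip); either may be None if a lookup fails.

    Hypothetical helper for illustration, not part of Salt.
    """
    try:
        # gethostbyaddr returns (hostname, aliaslist, ipaddrlist).
        ptr_name = reverse(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        return None, None
    try:
        forward_ip = forward(ptr_name)
    except OSError:
        forward_ip = None
    return ptr_name, forward_ip


def is_safe_to_substitute(ip, **lookups):
    """A PTR name is only a usable replacement if it resolves back to ip."""
    ptr_name, forward_ip = reverse_then_forward(ip, **lookups)
    return ptr_name is not None and forward_ip == ip
```

With the DNS data from this report, the PTR name exists but has no forward record, so substituting it for the address given on the command line cannot work.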
salt --versions-report
Salt Version:
Salt: 2018.3.2
Dependency Versions:
cffi: 1.11.5
cherrypy: Not Installed
dateutil: Not Installed
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
ioflo: Not Installed
Jinja2: 2.10
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: Not Installed
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.5.6
mysql-python: Not Installed
pycparser: 2.18
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.6.6 (default, Jun 28 2018, 05:43:53)
python-gnupg: Not Installed
PyYAML: 3.13
PyZMQ: 17.1.2
RAET: Not Installed
smmap: Not Installed
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.2.5
System Versions:
dist:
locale: UTF-8
machine: x86_64
release: 17.7.0
system: Darwin
version: 10.13.6 x86_64
If the reverse DNS doesn't exist at all, then it will fail even worse:
[DEBUG ] salt.utils.network.ip_to_host('192.168.129.223') failed: [Errno 1] Unknown host
[DEBUG ] LazyLoaded local_cache.prep_jid
[DEBUG ] Adding minions for job 20180930153629106417: [None]
[ERROR ] An un-handled exception from the multiprocessing process 'MultiprocessingProcess-1' was caught:
Traceback (most recent call last):
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/utils/process.py", line 747, in _run
return self._original_run()
File "/Users/xistence/.pyenv/versions/3.6.6/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/__init__.py", line 518, in handle_routine
**target)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/__init__.py", line 920, in __init__
self.shell = salt.client.ssh.shell.gen_shell(opts, **args)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/shell.py", line 63, in gen_shell
shell = Shell(opts, **kwargs)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/shell.py", line 90, in __init__
self.host = host.strip('[]')
AttributeError: 'NoneType' object has no attribute 'strip'
Process MultiprocessingProcess-1:
Traceback (most recent call last):
File "/Users/xistence/.pyenv/versions/3.6.6/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/utils/process.py", line 747, in _run
return self._original_run()
File "/Users/xistence/.pyenv/versions/3.6.6/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/__init__.py", line 518, in handle_routine
**target)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/__init__.py", line 920, in __init__
self.shell = salt.client.ssh.shell.gen_shell(opts, **args)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/shell.py", line 63, in gen_shell
shell = Shell(opts, **kwargs)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/shell.py", line 90, in __init__
self.host = host.strip('[]')
AttributeError: 'NoneType' object has no attribute 'strip'
[ERROR ] Target 'None' did not return any data, probably due to an error.
[ERROR ] An un-handled exception was caught by salt's global exception handler:
TypeError: join() argument must be str or bytes, not 'NoneType'
Traceback (most recent call last):
File "/Users/xistence/.ve/salt/bin/salt-ssh", line 11, in <module>
load_entry_point('salt==2018.3.2', 'console_scripts', 'salt-ssh')()
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/scripts.py", line 425, in salt_ssh
client.run()
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/cli/ssh.py", line 24, in run
ssh.run()
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/__init__.py", line 771, in run
self.cache_job(jid, host, ret[host], fun)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/__init__.py", line 697, in cache_job
'fun': fun})
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/returners/local_cache.py", line 144, in returner
hn_dir = os.path.join(jid_dir, load['id'])
File "/Users/xistence/.pyenv/versions/3.6.6/lib/python3.6/posixpath.py", line 94, in join
genericpath._check_arg_types('join', a, *p)
File "/Users/xistence/.pyenv/versions/3.6.6/lib/python3.6/genericpath.py", line 149, in _check_arg_types
(funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'NoneType'
Traceback (most recent call last):
File "/Users/xistence/.ve/salt/bin/salt-ssh", line 11, in <module>
load_entry_point('salt==2018.3.2', 'console_scripts', 'salt-ssh')()
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/scripts.py", line 425, in salt_ssh
client.run()
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/cli/ssh.py", line 24, in run
ssh.run()
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/__init__.py", line 771, in run
self.cache_job(jid, host, ret[host], fun)
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/client/ssh/__init__.py", line 697, in cache_job
'fun': fun})
File "/Users/xistence/.ve/salt/lib/python3.6/site-packages/salt/returners/local_cache.py", line 144, in returner
hn_dir = os.path.join(jid_dir, load['id'])
File "/Users/xistence/.pyenv/versions/3.6.6/lib/python3.6/posixpath.py", line 94, in join
genericpath._check_arg_types('join', a, *p)
File "/Users/xistence/.pyenv/versions/3.6.6/lib/python3.6/genericpath.py", line 149, in _check_arg_types
(funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'NoneType'
Salt Version:
Salt: 2018.3.2
Dependency Versions:
cffi: Not Installed
cherrypy: Not Installed
dateutil: Not Installed
docker-py: Not Installed
gitdb: Not Installed
gitpython: Not Installed
ioflo: Not Installed
Jinja2: 2.10
libgit2: Not Installed
libnacl: Not Installed
M2Crypto: Not Installed
Mako: Not Installed
msgpack-pure: Not Installed
msgpack-python: 0.5.6
mysql-python: Not Installed
pycparser: Not Installed
pycrypto: 2.6.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.6.6 (default, Sep 24 2018, 19:43:35)
python-gnupg: Not Installed
PyYAML: 3.13
PyZMQ: 17.1.2
RAET: Not Installed
smmap: Not Installed
timelib: Not Installed
Tornado: 4.5.3
ZMQ: 4.2.5
System Versions:
dist:
locale: UTF-8
machine: x86_64
release: 18.0.0
system: Darwin
version: 10.14 x86_64
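The tracebacks above all stem from a host value of None reaching code that expects a string. A minimal sketch of a guard, assuming a failed reverse lookup should fall back to the address the user typed; `host_for_ip` and `target_id` are hypothetical names, not Salt functions.

```python
import socket


def host_for_ip(ip, reverse=socket.gethostbyaddr):
    """Return the reverse-DNS name for ip, or None if there is no PTR record."""
    try:
        return reverse(ip)[0]
    except OSError:  # covers socket.herror ("Unknown host") and socket.gaierror
        return None


def target_id(ip, reverse=socket.gethostbyaddr):
    """Never return None: fall back to the address given by the user."""
    return host_for_ip(ip, reverse) or ip
```

Because `target_id` always returns a string, later calls such as `host.strip('[]')` or `os.path.join(jid_dir, load['id'])` would not see None.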
This is fixed in 2018.3.3 https://github.com/saltstack/salt/pull/48771
@saltstack/team-ssh can yall take a look at this?
Thanks,
Daniel
Are we really requiring that reverse DNS be set up for IPs?
Slack notes:
marnold
`11.22.33.44` is the gateway host, it could have anything in its reverse DNS. For example (if I remember correctly), on Leaseweb it contains an invalid hostname
msmith
i'd hope it's the latter. accepting the need for rdns will completely prevent it from being used in cases where dns isn't available
i'd go so far as to say if it's an ip address, actually use that for the minion id and don't try _any_ lookup
marnold
Yep!
msmith
for salt-ssh, there's no open connection anyway, it doesn't need to be addressable after the call
marnold
The only sane policy (imho) is to always use what was specified on the command line. Otherwise it's a security risk (salt shouldn't trust random DNS servers)
msmith
the whole reason to not use the roster is that the (ssh) minions can then be dynamic
marnold
Agree. Roster is a pain to set up. Salt should just connect to the host it was told to on the command line
salt-ssh is often used to bootstrap new machines, after which they change the IP address
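The policy proposed in these notes (an IP address given on the command line is used as the minion id, with no lookup at all) could look roughly like this. It is a sketch under assumptions: `is_ip_literal` and `minion_id_for` are hypothetical names, and the optional `lookup` callable stands in for whatever name normalisation a caller might still want for non-IP targets.

```python
import ipaddress


def is_ip_literal(target):
    """True if target is an IPv4/IPv6 literal, optionally in [brackets]."""
    try:
        ipaddress.ip_address(target.strip("[]"))
    except ValueError:
        return False
    return True


def minion_id_for(target, lookup=None):
    """Pick the id to use for a command-line target.

    IP literals are returned untouched and `lookup` is never called for them.
    For other targets, `lookup` (if given) may return a replacement name;
    otherwise the target is used verbatim.
    """
    if lookup is None or is_ip_literal(target):
        return target
    return lookup(target) or target
```

Calling `minion_id_for("11.22.33.44", lookup=some_resolver)` returns the address unchanged and never touches DNS, which is the behaviour asked for above.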
@gtmanfred @rossengeorgiev Status here? I am fighting with this as of today.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
If this issue is closed prematurely, please leave a comment and we will gladly reopen the issue.
🤷♂️
Thank you for updating this issue. It is no longer marked as stale.
The salt-ssh behavior is definitely broken. Could someone from SaltStack make a design decision about that and finally fix it?
We brought this up in the salt-ssh working group meeting. It was decided that we should not include this behavior. I'm pretty sure this is the line causing it: https://github.com/saltstack/salt/blob/v2019.2.2/salt/client/ssh/__init__.py#L409, but I will need to dive into it more and write some tests. I have added this to my backlog.
To add on (why, why, why): I now discover that with the latest version of salt-ssh on the Mac, whereas in the past I just needed to add a host to /etc/salt/roster, now I need to have a reverse entry for it in /etc/hosts as well??
Without an entry in /etc/hosts:
# salt-ssh vm-aa test.ping
[ERROR ] Target 'vm-aa' did not return any data, probably due to an error.
vm-aa:
Target 'vm-aa' did not return any data, probably due to an error.
Salt Version:
Salt: 3000.1
Dependency Versions:
cffi: 1.12.2
cherrypy: unknown
dateutil: 2.8.0
docker-py: Not Installed
gitdb: 2.0.6
gitpython: 2.1.15
Jinja2: 2.10.1
libgit2: Not Installed
M2Crypto: Not Installed
Mako: 1.0.7
msgpack-pure: Not Installed
msgpack-python: 0.5.6
mysql-python: Not Installed
pycparser: 2.19
pycrypto: 3.8.1
pycryptodome: Not Installed
pygit2: Not Installed
Python: 3.5.4 (default, Mar 27 2020, 15:24:03)
python-gnupg: 0.4.4
PyYAML: 5.1.2
PyZMQ: 18.0.1
smmap: 3.0.1
timelib: 0.2.4
Tornado: 4.5.3
ZMQ: 4.3.1
System Versions:
dist:
locale: UTF-8
machine: x86_64
release: 19.3.0
system: Darwin
version: 10.15.3 x86_64
@Ch3LL I think the problem might be somewhere else. At least on my system.
If I insert a log.error(running) in the handle_ssh method, after the for host in running: line (this would be https://github.com/saltstack/salt/blob/v2019.2.2/salt/client/ssh/__init__.py#L607 if you're still looking at that version; my version is as detailed above, and I am editing /opt/salt/lib/python3.5/site-packages/salt-3000.1-py3.5.egg/salt/client/ssh/__init__.py), I can see that I only loop through things twice:
[ERROR ] {'vm-aa': {'thread': <Process(Process-1, started)>}}
[ERROR ] {'vm-aa': {'thread': <Process(Process-1, stopped[SIGSEGV])>}}
It looks like a problem in Single (?), where the thread results in a SIGSEGV. If I have an entry in /etc/hosts, I do not get the SIGSEGV and salt-ssh runs successfully
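For anyone repeating this kind of debugging: the stopped[SIGSEGV] in the process repr corresponds to a negative `exitcode` on the `multiprocessing.Process` object (a value of -N means the child was terminated by signal N). A small sketch for turning that into readable log text; `describe_exitcode` is a hypothetical helper, not something Salt provides.

```python
import signal


def describe_exitcode(exitcode):
    """Translate a multiprocessing.Process.exitcode into readable text."""
    if exitcode is None:
        # The process has not terminated yet.
        return "still running"
    if exitcode < 0:
        # Negative N means the child was terminated by signal N.
        try:
            return "killed by " + signal.Signals(-exitcode).name
        except ValueError:
            return "killed by signal %d" % -exitcode
    return "exited with status %d" % exitcode
```

Logging `describe_exitcode(proc.exitcode)` for each entry in the loop would make a crash like the one above visible without inspecting the repr.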
hmm interesting. And if you remove the line I referenced that doesn't do anything as well?
That if (if salt.utils.network.is_reachable_host(hostname):) doesn't even get triggered, so that's moot. This holds both when I have an entry for the hostname in /etc/hosts and when I don't: that if just doesn't get triggered.
k thanks for trying that. Will be good to know when I dive into the issue. This is still on my backlog, so if you have any luck please feel free to push a PR.
ping @jf can you try https://github.com/saltstack/salt/pull/58163
There were actually two places that called salt.utils.network.ip_to_host, so I'm guessing you were hitting the other reference I didn't mention above.