Paramiko: Executing a command sometimes hangs forever

Created on 24 Apr 2015  路  31Comments  路  Source: paramiko/paramiko

I use paramiko to create SSH sessions and send commands to Linux machines, which works great 95% of the time. But sometimes, the Python thread which is used to execute the command hangs forever in wait() of threading.py:

(gdb) py-list
 239            waiter.acquire()
 240            self.__waiters.append(waiter)
 241            saved_state = self._release_save()
 242            try:    # restore state no matter what (e.g., KeyboardInterrupt) 
 243                if timeout is None:
>244                    waiter.acquire()
 245                    if __debug__:
 246                        self._note("%s.wait(): got it", self)
 247                else:
 248                    # Balancing act:  We can't afford a pure busy loop, so we
 249                    # have to sleep; but if we sleep the whole timeout time,


(gdb) py-bt
#5 Frame 0xb459a5c, for file /usr/lib/python2.7/threading.py, line 244, in wait (self=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e3a0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e3a0>, _Condition__waiters=[<thread.lock at remote 0xaf7e100>], release=<built-in method release of thread.lock object at remote 0xaf7e3a0>) at remote 0xb73708c>, timeout=None, waiter=<thread.lock at remote 0xaf7e100>, saved_state=None)
    waiter.acquire()
#9 Frame 0xb66118c, for file /usr/lib/python2.7/dist-packages/paramiko/buffered_pipe.py, line 137, in read (self=<BufferedPipe(_closed=False, _lock=<thread.lock at remote 0xaf7e3a0>, _buffer=<array.array at remote 0xb737ea0>, _event=None, _cv=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e3a0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e3a0>, _Condition__waiters=[<thread.lock at remote 0xaf7e100>], release=<built-in method release of thread.lock object at remote 0xaf7e3a0>) at remote 0xb73708c>) at remote 0xb737a4c>, nbytes=8192, timeout=None, out='', then=<float at remote 0xb57efe4>)
    self._cv.wait(timeout)
#13 Frame 0xb612b4c, for file /usr/lib/python2.7/dist-packages/paramiko/channel.py, line 584, in recv (self=<Channel(origin_addr=None, in_window_size=65536, in_window_threshold=6553, lock=<thread.lock at remote 0xaf7e5a0>, _pipe=None, eof_received=0, in_max_packet_size=34816, out_buffer_cv=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e5a0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e5a0>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0xaf7e5a0>) at remote 0xb73790c>, event=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e4e0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e4e0>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0xaf7e4e0>) at remote 0xb7370ac>) at remote 0xb73732c>, transport=<Transport(_Thread__ident=-1258251408, host_key_type='ss...(truncated)
    out = self.in_buffer.read(nbytes, self.timeout)
#16 Frame 0xa1875f4, for file /usr/lib/python2.7/dist-packages/paramiko/channel.py, line 1215, in _read (self=<ChannelFile(_wbuffer=<cStringIO.StringO at remote 0xb737860>, _closed=False, _bufsize=8192, _pos=105, _at_trailing_cr=False, _size=0, _rbuffer='', _flags=17, _realpos=105, channel=<Channel(origin_addr=None, in_window_size=65536, in_window_threshold=6553, lock=<thread.lock at remote 0xaf7e5a0>, _pipe=None, eof_received=0, in_max_packet_size=34816, out_buffer_cv=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e5a0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e5a0>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0xaf7e5a0>) at remote 0xb73790c>, event=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e4e0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e4e0>, _Condition__waiters=[], rele...(truncated)
    return self.channel.recv(size)
#19 Frame 0xb8eb9dc, for file /usr/lib/python2.7/dist-packages/paramiko/file.py, line 214, in readline (self=<ChannelFile(_wbuffer=<cStringIO.StringO at remote 0xb737860>, _closed=False, _bufsize=8192, _pos=105, _at_trailing_cr=False, _size=0, _rbuffer='', _flags=17, _realpos=105, channel=<Channel(origin_addr=None, in_window_size=65536, in_window_threshold=6553, lock=<thread.lock at remote 0xaf7e5a0>, _pipe=None, eof_received=0, in_max_packet_size=34816, out_buffer_cv=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e5a0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e5a0>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0xaf7e5a0>) at remote 0xb73790c>, event=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e4e0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e4e0>, _Condition__waiters=[], relea...(truncated)
    new_data = self._read(n)
#23 Frame 0xb4597f4, for file /usr/lib/python2.7/dist-packages/paramiko/file.py, line 106, in next (self=<ChannelFile(_wbuffer=<cStringIO.StringO at remote 0xb737860>, _closed=False, _bufsize=8192, _pos=105, _at_trailing_cr=False, _size=0, _rbuffer='', _flags=17, _realpos=105, channel=<Channel(origin_addr=None, in_window_size=65536, in_window_threshold=6553, lock=<thread.lock at remote 0xaf7e5a0>, _pipe=None, eof_received=0, in_max_packet_size=34816, out_buffer_cv=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e5a0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e5a0>, _Condition__waiters=[], release=<built-in method release of thread.lock object at remote 0xaf7e5a0>) at remote 0xb73790c>, event=<_Event(_Verbose__verbose=False, _Event__flag=True, _Event__cond=<_Condition(_Verbose__verbose=False, _Condition__lock=<thread.lock at remote 0xaf7e4e0>, acquire=<built-in method acquire of thread.lock object at remote 0xaf7e4e0>, _Condition__waiters=[], release=<...(truncated)
    line = self.readline()

I have no idea whether this is an issue in paramiko or Python threading, so I also reported it there: http://bugs.python.org/issue24026

Bug Hangs Needs investigation

Most helpful comment

We encountered the exact same issue as @DrPyser, with the same stacktrace hanging on waiter.acquire() when reading write responses.

We nailed it down to the default implementation of paramiko.buffered_pipe.BufferedPipe.read() (see http://docs.paramiko.org/en/stable/api/buffered_pipe.html#paramiko.buffered_pipe.BufferedPipe.read) - if you look closely at the documentation of the timeout parameter, it says "maximum seconds to wait (or None, the default, to wait forever)"

We explicitly set the socket channel timeout on the channel in a way similar to this:

host = '127.0.0.1' # Your host here
sftp_port = 22 # Your port here
user = 'user' # Your username here
password = 'password' # You password here
connection_timeout = 10 # Your timeout in seconds here

connection_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connection_socket.settimeout(connection_timeout) # <-- We were doing this, but it's overwritten when creating Transport
connection_socket.connect((host, sftp_port))
transport = paramiko.Transport(sock=connection_socket)
transport.connect(username=user, password=password)
session = paramiko.SFTPClient.from_transport(transport)

# RELEVANT CHANGE - Set the timeout again, otherwise it is not used in file tranfers
session.get_channel().settimeout(connection_timeout)

By doing this, any subsequent call to session.put() will throw a PipeTimeout exception instead of waiting forever.

All 31 comments

Also, I'm using the following versions:

Python 2.7.5
Paramiko 1.15.2

i get the same problom銆俢an you sove it?

This seems to be happening more frequently. I've seen it happen five times in the past three days. We use paramiko to monitor our servers and when it hangs, we can't monitor our servers.

it would be really helpfull if you could try an earlier version of paramiko, preferably a version before the python3 port and see if you get the same result.

My python version:
[root@wu2 ~]# python
Python 2.6.6 (r266:84292, Feb 22 2013, 00:00:18)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

import Paramiko
Traceback (most recent call last):
File "", line 1, in
ImportError: No module named Paramiko
import paramiko
paramiko.version
'1.7.5 (Ernest)'

This affects me through duplicity, which uses paramiko for sftp by default.

I am using Python 2.7.9 (Debian Jessie), and it hangs in the similar place:

> /usr/lib/python2.7/threading.py(340)wait()
    339             if timeout is None:
--> 340                 waiter.acquire()
    341                 if __debug__:

This happens for writing data in my case, and the last line from paramiko is:

> /usr/lib/python2.7/dist-packages/paramiko/channel.py(1177)_wait_for_send_window()
   1176                 then = time.time()
-> 1177                 self.out_buffer_cv.wait(timeout)
   1178                 if timeout is not None:

I've added some print to Channel._send, and it appears to happen for the
last packet, if there are several:

('wait:', 32809)
('wait:', 28777)
('wait:', 24745)
('wait:', 20713)
('wait:', 16681)
('wait:', 12649)
('wait:', 8617)
('wait:', 4585)
('wait:', 1479)
> /usr/lib/python2.7/dist-packages/paramiko/channel.py(1062)_send()
   1061                 import ipdb; ipdb.set_trace()
-> 1062             size = self._wait_for_send_window(size)
   1063             if size == 0:

@stephencpope are you sure #577/#578 addresses the described problem as your pull request #578 only seems to touch sftp code and the initial issue described here is exec_command related?

no I am not. I made no effort to recreate the problem discussed here. And looking at the o.p. now I can see it is different code.

I believe that I've hit this bug as well, Python 3.5.2, paramiko 2.0.1. I'm new at Python, so I'm still wrapping my head around everything. ;)

The last line of my script that is executing is this:

output = ssh_shell.recv(500)

My script hangs consistently at this spot when talking to this particular model device, 100% reproducible in my case. This particular device has a poor SSH implementation (Bug 787), and the following workaround code is being used to establish the connection:

class patched_SSHClient(paramiko.SSHClient):
    def _auth(self, username, password, *args):
        if not password:
            try:
                self._transport.auth_none(username)
                return
            except paramiko.BadAuthenticationType:
                pass
        paramiko.SSHClient._auth(self, username, password, *args)

At the time of the hang, my script is checking to see if I'm at this unit's command prompt, verifying that the last sent command completed, before the next command is sent.

pdb output from the hang follows:

> c:\users\wrightj\python\new_ssh.py(27)checkPrompt()
-> output = ssh_shell.recv(500)
(Pdb)
--Call--
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\channel.py(599)recv()
-> def recv(self, nbytes):
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\channel.py(612)recv()
-> try:
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\channel.py(613)recv()
-> out = self.in_buffer.read(nbytes, self.timeout)
(Pdb)
--Call--
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(122)read()
-> def read(self, nbytes, timeout=None):
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(143)read()
-> out = bytes()
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(144)read()
-> self._lock.acquire()
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(145)read()
-> try:
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(146)read()
-> if len(self._buffer) == 0:
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(147)read()
-> if self._closed:
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(150)read()
-> if timeout == 0.0:
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(154)read()
-> while (len(self._buffer) == 0) and not self._closed:
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(155)read()
-> then = time.time()
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\site-packages\paramiko\buffered_pipe.py(156)read()
-> self._cv.wait(timeout)
(Pdb)
--Call--
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(261)wait()
-> def wait(self, timeout=None):
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(284)wait()
-> if not self._is_owned():
(Pdb)
--Call--
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(252)_is_owned()
-> def _is_owned(self):
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(255)_is_owned()
-> if self._lock.acquire(0):
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(259)_is_owned()
-> return True
(Pdb)
--Return--
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(259)_is_owned()->True
-> return True
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(286)wait()
-> waiter = _allocate_lock()
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(287)wait()
-> waiter.acquire()
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(288)wait()
-> self._waiters.append(waiter)
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(289)wait()
-> saved_state = self._release_save()
(Pdb)
--Call--
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(246)_release_save()
-> def _release_save(self):
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(247)_release_save()
-> self._lock.release()           # No state to save
(Pdb)
--Return--
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(247)_release_save()->None
-> self._lock.release()           # No state to save
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(290)wait()
-> gotit = False
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(291)wait()
-> try:    # restore state no matter what (e.g., KeyboardInterrupt)
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(292)wait()
-> if timeout is None:
(Pdb)
> c:\users\wrightj\appdata\local\programs\python\python35-32\lib\threading.py(293)wait()
-> waiter.acquire()
(Pdb)

Some additional troubleshooting. My current code where the hang happens is this:

def checkPrompt():
    if (ssh_shell.recv_ready):
        print("we have data")
    else:
        print("we have NO DATA")
    print("Timeout is %s" % ssh_shell.gettimeout())
    output = ssh_shell.recv(50)
    output = output.decode("utf-8")
    return output.endswith('#')

I'm able to verify that my ssh_shell has data ready to be read. I can verify that the ssh_shell timeout is indeed None, as indicated by my earlier debug above. As soon as the line for ssh_shell.recv is called, it goes into the hang as indicated above.

My recv value was originally 500, dropping it to 50 didn't have an impact. I tried setting a timeout prior to calling that statement with ssh_shell.settimeout(3), but it only threw an exception at the end.

we have data
Traceback (most recent call last):
  File "C:\Users\wrightj\AppData\Local\Programs\Python\Python35-32\lib\site-packages\paramiko\channel.py", line 613, in recv
    out = self.in_buffer.read(nbytes, self.timeout)
  File "C:\Users\wrightj\AppData\Local\Programs\Python\Python35-32\lib\site-packages\paramiko\buffered_pipe.py", line 160, in read
    raise PipeTimeout()
paramiko.buffered_pipe.PipeTimeout

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "new_ssh.py", line 176, in <module>
    while (not checkPrompt()):
  File "new_ssh.py", line 32, in checkPrompt
    output = ssh_shell.recv(50)
  File "C:\Users\wrightj\AppData\Local\Programs\Python\Python35-32\lib\site-packages\paramiko\channel.py", line 615, in recv
    raise socket.timeout()
socket.timeout

I suppose I could trap for this error and just keep trying, but if the data never gets read, there's really no point. Something is keeping the data from being read causing the timeout in the first place.

while not ch.exit_status_ready():
if ch.recv_ready():
stdout.write(ch.recv(block_size))
if ch.recv_stderr_ready():
stderr.write(ch.recv_stderr(block_size))
while ch.recv_ready():
stdout.write(ch.recv(block_size))
while ch.recv_stderr_ready():
stderr.write(ch.recv_stderr(block_size))

I ran into this issue today (Python 3.6.3 + paramiko 2.3.1) as well, is there any update?

Same issue here

File "/usr/local/lib/python2.7/dist-packages/paramiko/file.py", line 103, in next
line = self.readline()
File "/usr/local/lib/python2.7/dist-packages/paramiko/file.py", line 288, in readline
new_data = self._read(n)
File "/usr/local/lib/python2.7/dist-packages/paramiko/channel.py", line 1330, in _read
return self.channel.recv(size)
File "/usr/local/lib/python2.7/dist-packages/paramiko/channel.py", line 683, in recv
out = self.in_buffer.read(nbytes, self.timeout)
File "/usr/local/lib/python2.7/dist-packages/paramiko/buffered_pipe.py", line 156, in read
self._cv.wait(timeout)
File "/usr/lib/python2.7/threading.py", line 339, in wait
waiter.acquire()

Same issue here too :(

ssh.connect('192.168.aa.bb',port=zz,username='xxxxx',password='yyyyy')
stdin,stdout,stderr = ssh.exec_command('show interface all')
output=stdout.readlines()
^CTraceback (most recent call last):
File "", line 1, in
File "/home/kprakash/.local/lib/python3.5/site-packages/paramiko/file.py", line 346, in readlines
line = self.readline()
File "/home/kprakash/.local/lib/python3.5/site-packages/paramiko/file.py", line 288, in readline
new_data = self._read(n)
File "/home/kprakash/.local/lib/python3.5/site-packages/paramiko/channel.py", line 1332, in _read
return self.channel.recv(size)
File "/home/kprakash/.local/lib/python3.5/site-packages/paramiko/channel.py", line 683, in recv
out = self.in_buffer.read(nbytes, self.timeout)
File "/home/kprakash/.local/lib/python3.5/site-packages/paramiko/buffered_pipe.py", line 156, in read
self._cv.wait(timeout)
File "/usr/lib/python3.5/threading.py", line 293, in wait
waiter.acquire()

I think i'm also running into this, if a server goes away/offline/etc when the ssh.exec_command is in progress, we never recover.
How can I trigger a trace when it's 'stuck' to confirm the same issue as above?

Interestingly, it seems to get stuck more on MacOSX than on Linux (ubuntu on docker) with the same versions of Python3 and the paramiko module. MacOSX version hangs even when the server hasn't gone anywhere.

Thanks.

We are experiencing same issue, on sftp.put(), when file is greather than 20KB aprox.

This is the full Traceback when script cancelled (CTRL-C). It gets blocked forever at self._cv.wait(timeout) on buffered_pipe.py :

Traceback (most recent call last):
  File "test_ftp.py", line 32, in <module>
    sftp.put("test.txt", 'test1b.txt')
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/sftp_client.py", line 727, in put
    return self.putfo(fl, remotepath, file_size, callback, confirm)
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/sftp_client.py", line 686, in putfo
    reader=fl, writer=fr, file_size=file_size, callback=callback
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/util.py", line 300, in __exit__
    self.close()
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/sftp_file.py", line 74, in close
    self._close(async_=False)
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/sftp_file.py", line 96, in _close
    self.sftp._request(CMD_CLOSE, self.handle)
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/sftp_client.py", line 780, in _request
    return self._read_response(num)
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/sftp_client.py", line 810, in _read_response
    t, data = self._read_packet()
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/sftp.py", line 174, in _read_packet
    x = self._read_all(4)
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/sftp.py", line 158, in _read_all
    x = self.sock.recv(n)
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/channel.py", line 683, in recv
    out = self.in_buffer.read(nbytes, self.timeout)
  File "/var/pywww/entropy-galos/.virtualenvs/entropy-galos/lib/python3.5/site-packages/paramiko/buffered_pipe.py", line 158, in read
    self._cv.wait(timeout)
  File "/opt/python-3.5.4/lib/python3.5/threading.py", line 293, in wait
    waiter.acquire()
KeyboardInterrupt

The strace shows thread blocked on a futex lock:

[...]
poll([{fd=6, events=POLLOUT}], 1, 100)  = 1 ([{fd=6, revents=POLLOUT}])
sendto(6, "R|\337)\34?Ee%\27dxH\322\0f\304\241\373?yh\222\206\205}n\243\tV\247E"..., 32756, 0, NULL, 0) = 32756
poll([{fd=6, events=POLLOUT}], 1, 100)  = 1 ([{fd=6, revents=POLLOUT}])
sendto(6, "v\300\v~\216\276\2+\347\301\257t\236\350|+\317:\5\231^i\n\221\6\240\365\362\353\264_\350"..., 132, 0, NULL, 0) = 132
read(8, "C\376T\200\17\30:\234\26^\21\350\323\2<\256\255= \f\v\264^\350\20h\231\0n\1\4\1"..., 32768) = 6069
read(8, "", 24576)                      = 0
poll([{fd=6, events=POLLOUT}], 1, 100)  = 1 ([{fd=6, revents=POLLOUT}])
sendto(6, "ul\203\325\335\271\265U\23\236\f\307t\212Gr\324B\210\234d\26n\304\340\361\254O\301\254<\241"..., 6148, 0, NULL, 0) = 6148
read(8, "", 32768)                      = 0
poll([{fd=6, events=POLLOUT}], 1, 100)  = 1 ([{fd=6, revents=POLLOUT}])
sendto(6, "\220\32em\23\257\r \266\222\220]\341\356\333\244-\33\1k\230\372\355\376\260W\253\307\307z\360\252"..., 68, 0, NULL, 0) = 68
futex(0x251f230, FUTEX_WAIT_PRIVATE, 0, NULL

The same transfer works fine with the sftp or scp linux cmdline utils from openssh.
We have no access to the remote SFTP server logs.

I also captured network packets. The following image is an screenshot from wireshark's tcp flow analysis.
As you can see, we suddently stop receiving ACK from remote SFTP server, and client sends several retransmission packets (dark background). If remote server closed the connection, it should have sent an RST packet. Seems packets are lost.

sftp-flow-retrans

I also tried with different timeout, window_size and max_packet_size options, with no success.
As sftp cmdline client works with same files, it seems that the way paramiko does stuff is the cause of not receiving more ACK from server. Also, it should not block.
When analyzing sftp cmdline network packets, all works ok, and there are no packet retransmissions nor connection reset.

paramiko version: 2.4.0.
OS Version: RHEL 7.3
Python version: 3.5.4

This issue was opened long time ago, also seems related to #814
Also could be related #577, but that was closed on 2015.

I ended up having to hack my way around this. I was writing some automation for a hacking/CTF challenge and paramiko randomly blocking forever was causing me a serious problem (in code that should have been one long repeating loop).

I ended up using processes with a hard timeout to "catch" paramiko if it got stuck and let me carry on. Extract of the code at this gist for anyone interested, notice the decorator above the function with the timeout.

https://gist.github.com/matjohn2/2c1c4988e17112a34f310667e0ff6e7e

Hello,
I'm having similar issue on a script who does a lot of small action.
I open and close a couple hundred ssh session every time, and i for now never manage to get it to run until the end. it will alwase hang at some point.

when i force quite the script, my traceback is different from what i see up here, wich make me wonder if it's the same issue.

Traceback (most recent call last):
  File "./massUser.py", line 485, in <module>
    Requested_Server_list.append(build_server_from_dict(server))
  File "./massUser.py", line 449, in build_server_from_dict
    return make_server(fqdn, user, password)
  File "./massUser.py", line 408, in make_server
    server = Server(fqdn, cnt_user, cnt_passwd)
  File "./massUser.py", line 122, in __init__
    self.uname = self.getUname()
  File "./massUser.py", line 168, in getUname
    self.connect()
  File "./massUser.py", line 137, in connect
    self.ssh_socket.connect(self.fqdn, port=22, username=self.cnt_user, password=self.cnt_passwd, timeout=5, auth_timeout=5)
  File "/usr/lib/python3.6/site-packages/paramiko/client.py", line 397, in connect
    t.start_client(timeout=timeout)
  File "/usr/lib/python3.6/site-packages/paramiko/transport.py", line 580, in start_client
    self.start()
  File "/usr/lib/python3.6/threading.py", line 851, in start
    self._started.wait()
  File "/usr/lib/python3.6/threading.py", line 548, in wait
    with self._cond:
  File "/usr/lib/python3.6/threading.py", line 240, in __enter__
    return self._lock.__enter__()
KeyboardInterrupt

Any new's ? i'm not good enought to understand what @matjohn2 did on his code to add the hard timeout.

It seems that I am affected by the same bug but I am not sure what triggers it because it used to work fine, now it always happens with py27 only, with py36 working just fine.

.tox/py27/lib/python2.7/site-packages/docker/transport/sshconn.py:84: in __init__
    self._connect()
.tox/py27/lib/python2.7/site-packages/docker/transport/sshconn.py:94: in _connect
    parsed.hostname, parsed.port, parsed.username,
.tox/py27/lib/python2.7/site-packages/paramiko/client.py:397: in connect
    t.start_client(timeout=timeout)
.tox/py27/lib/python2.7/site-packages/paramiko/transport.py:583: in start_client
    event.wait(0.1)
../../.pyenv/versions/2.7.15/lib/python2.7/threading.py:614: in wait
    self.__cond.wait(timeout)
../../.pyenv/versions/2.7.15/lib/python2.7/threading.py:359: in wait
    _sleep(delay)

Hi, I have the same problem. It seems to be stuck in the Packetizer.read_all() function where it is stuck in the socket timeout while it still has 16 bytes to read. The script runs all the way to the end, but a thread does not end. This used to work yesterday, the only difference is that the script now takes longer.

I am running with python 3.6

We had paramiko get stuck after multiple connection timeouts / disconnects and this helped:

            except (socket.timeout, paramiko.ssh_exception.SSHException, ConnectionError, EOFError) as e:
                self.sshclient.close()  # Make sure this session is closed. Otherwise paramiko hangs.

I am getting hit with similar issue with commands randomly freezing when using it with SSH agent. [Using python 2.7.5 from CentOS 7.6]

here is a link to a workaround I used for a similar issue: https://github.com/paramiko/paramiko/issues/109#issuecomment-111621658

I am having the same issue. My code hangs randomly when calling ssh_conn.exec_command(command). I checked with strace and I saw this:
[root@vnfm-dialog-nms vnfm]# strace -p 24796 strace: Process 24796 attached futex(0x1c9c2f0, FUTEX_WAIT_PRIVATE, 0, NULL
[Using Python 2.7.5 on CentOS Linux release 7.6.1810 (Core)]

I also seem to be hitting a deadlock when writing to a remote SFTPFile. Here's some log output from paramiko:

...
2019-11-21 20:09:07,129 paramiko.transport _log DEBUG    starting thread (client mode): 0x9f78a908
2019-11-21 20:09:07,130 paramiko.transport _log DEBUG    Local version/idstring: SSH-2.0-paramiko_2.6.0
2019-11-21 20:09:07,213 paramiko.transport _log DEBUG    Remote version/idstring: SSH-2.0-OpenSSH_5.3
2019-11-21 20:09:07,214 paramiko.transport _log INFO     Connected (version 2.0, client OpenSSH_5.3)
2019-11-21 20:09:07,292 paramiko.transport _log DEBUG    kex algos:['diffie-hellman-group-exchange-sha256', 'diffie-hellman-group-exchange-sha1', 'diffie-hellman-group14-sha1', 'diffie-hellman-group1-sha1'] server key:['ssh-rsa', 'ssh-dss'] client encrypt:['aes128-ctr', 'aes192-ctr', 'aes256-ctr', 'arcfour256', 'arcfour128', 'aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'aes192-cbc', 'aes256-cbc', 'arcfour', '[email protected]'] server encrypt:['aes128-ctr', 'aes192-ctr', 'aes256-ctr', 'arcfour256', 'arcfour128', 'aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'aes192-cbc', 'aes256-cbc', 'arcfour', '[email protected]'] client mac:['hmac-md5', 'hmac-sha1', '[email protected]', 'hmac-ripemd160', '[email protected]', 'hmac-sha1-96', 'hmac-md5-96'] server mac:['hmac-md5', 'hmac-sha1', '[email protected]', 'hmac-ripemd160', '[email protected]', 'hmac-sha1-96', 'hmac-md5-96'] client compress:['none', '[email protected]'] server compress:['none', '[email protected]'] client lang:[''] server lang:[''] kex follows?False
2019-11-21 20:09:07,292 paramiko.transport _log DEBUG    Kex agreed: diffie-hellman-group-exchange-sha256
2019-11-21 20:09:07,292 paramiko.transport _log DEBUG    HostKey agreed: ssh-rsa
2019-11-21 20:09:07,293 paramiko.transport _log DEBUG    Cipher agreed: aes128-ctr
2019-11-21 20:09:07,293 paramiko.transport _log DEBUG    MAC agreed: hmac-sha1
2019-11-21 20:09:07,293 paramiko.transport _log DEBUG    Compression agreed: none
2019-11-21 20:09:07,425 paramiko.transport _log DEBUG    Got server p (2048 bits)
2019-11-21 20:09:07,606 paramiko.transport _log DEBUG    kex engine KexGexSHA256 specified hash_algo <built-in function openssl_sha256>
2019-11-21 20:09:07,607 paramiko.transport _log DEBUG    Switch to new keys ...
2019-11-21 20:09:07,607 paramiko.transport _log DEBUG    Attempting password auth...
2019-11-21 20:09:07,991 paramiko.transport _log DEBUG    userauth is OK
2019-11-21 20:09:08,415 paramiko.transport _log INFO     Authentication (password) successful!
2019-11-21 20:09:08,415 paramiko.transport _log DEBUG    [chan 0] Max packet in: 32768 bytes
2019-11-21 20:09:08,497 paramiko.transport _log DEBUG    [chan 0] Max packet out: 32768 bytes
2019-11-21 20:09:08,497 paramiko.transport _log DEBUG    Secsh channel 0 opened.
2019-11-21 20:09:08,579 paramiko.transport _log DEBUG    [chan 0] Sesch channel 0 request ok
2019-11-21 20:09:08,668 paramiko.transport.sftp _log INFO     [chan 0] Opened sftp connection (server version 3)
2019-11-21 20:09:08,668 paramiko.transport.sftp _log DEBUG    [chan 0] stat(b'/my_remote_file.txt')
...
2019-11-21 20:09:08,759 paramiko.transport.sftp _log DEBUG    [chan 0] open(b'/googleads_2019-11-21.tab', 'w')
2019-11-21 20:09:08,842 paramiko.transport.sftp _log DEBUG    [chan 0] open(b'/googleads_2019-11-21.tab', 'w') -> 00000000
^C2019-11-21 20:09:25,704 paramiko.transport.sftp _log DEBUG    [chan 0] close(00000000)
...
Traceback (most recent call last):
  ...
  File "/usr/local/lib/python3.7/site-packages/paramiko/file.py", line 405, in write
    self._write_all(data)
  File "/usr/local/lib/python3.7/site-packages/paramiko/file.py", line 522, in _write_all
    count = self._write(data)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_file.py", line 208, in _write
    t, msg = self.sftp._read_response(req)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp_client.py", line 843, in _read_response
    t, data = self._read_packet()
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp.py", line 201, in _read_packet
    x = self._read_all(4)
  File "/usr/local/lib/python3.7/site-packages/paramiko/sftp.py", line 185, in _read_all
    x = self.sock.recv(n)
  File "/usr/local/lib/python3.7/site-packages/paramiko/channel.py", line 699, in recv
    out = self.in_buffer.read(nbytes, self.timeout)
  File "/usr/local/lib/python3.7/site-packages/paramiko/buffered_pipe.py", line 160, in read
    self._cv.wait(timeout)
  File "/usr/local/lib/python3.7/threading.py", line 296, in wait
    waiter.acquire()
KeyboardInterrupt
...
2019-11-21 20:09:25,806 paramiko.transport.sftp _log INFO     [chan 0] sftp session closed.
2019-11-21 20:09:25,807 paramiko.transport _log DEBUG    [chan 0] EOF sent (0)

The code does something like this:

with sftp_connection.open(str(remote_file_path), mode="w") as remote_file:
    for data in datasource:
        remote_file.write(data)
    logger.info("Content of local file %s written to remote file %s", file.name, remote_file_path)

Anyone got a workaround? Getting spurious hangs - can't reliably reproduce

I could somehow reproduce it when I was running commands that were taking very long to return a result. I changed the way that I was using it by calling quick commands (that will return a value quickly) and these commands were triggering jobs in background

I face this issue as well.

(gdb) info threads
Id Target Id Frame
1 Thread 0x7ffff7fe6080 (LWP 10429) "python" 0x00007ffff713703f in __GI___select (nfds=nfds@entry=0, readfds=readfds@entry=0x0,
writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7fffffffcd50) at ../sysdeps/unix/sysv/linux/select.c:41

  • 4 Thread 0x7ffff1d65700 (LWP 10438) "python" sock_recv_guts (s=s@entry=0x7ffff2437c70, cbuf=0x7ffff23bfb54 "exc_traceback", len=,
    flags=0) at /root/python_inst/Python-2.7.18/Modules/socketmodule.c:2448
    (gdb) py-bt
    Traceback (most recent call first):
    File "/usr/local/lib/python2.7/site-packages/paramiko/packet.py", line 250, in read_all
    x = self.__socket.recv(n)
    File "/usr/local/lib/python2.7/site-packages/paramiko/packet.py", line 387, in read_message
    header = self.read_all(self.__block_size_in, check_rekey=True)
    File "/usr/local/lib/python2.7/site-packages/paramiko/transport.py", line 1608, in run
    ptype, m = self.packetizer.read_message()
    File "/usr/local/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
    File "/usr/local/lib/python2.7/threading.py", line 774, in __bootstrap
    self.__bootstrap_inner()
    (gdb) n

I got around by putting timeout in exec_command(timeout=x) and habdle the exception.

I have observed the same issue as mentioned above, where the it just hangs. Is there any workaround?

stdin, stdout, stderr = ssh.exec_command(cmd, timeout=ssh_timeout) # Close the unintended channel stdout.channel.shutdown_write() stdin.close() stdout_text = "" # wait for exit status when ready while not stdout.channel.exit_status_ready(): if stdout.channel.recv_ready(): stdout_line = stdout.channel.recv(1024) stdout_text += stdout_line.decode().strip() stderr_text = stderr.read().decode().strip() self.cmd_return_code.value = stdout.channel.recv_exit_status() # Close channels stdout.close() stderr.close() if self.queue: self.queue.put(stdout_text) if self.errQueue: self.errQueue.put(stderr_text)

We encountered the exact same issue as @DrPyser, with the same stacktrace hanging on waiter.acquire() when reading write responses.

We nailed it down to the default implementation of paramiko.buffered_pipe.BufferedPipe.read() (see http://docs.paramiko.org/en/stable/api/buffered_pipe.html#paramiko.buffered_pipe.BufferedPipe.read) - if you look closely at the documentation of the timeout parameter, it says "maximum seconds to wait (or None, the default, to wait forever)"

We explicitly set the socket channel timeout on the channel in a way similar to this:

host = '127.0.0.1' # Your host here
sftp_port = 22 # Your port here
user = 'user' # Your username here
password = 'password' # You password here
connection_timeout = 10 # Your timeout in seconds here

connection_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
connection_socket.settimeout(connection_timeout) # <-- We were doing this, but it's overwritten when creating Transport
connection_socket.connect((host, sftp_port))
transport = paramiko.Transport(sock=connection_socket)
transport.connect(username=user, password=password)
session = paramiko.SFTPClient.from_transport(transport)

# RELEVANT CHANGE - Set the timeout again, otherwise it is not used in file tranfers
session.get_channel().settimeout(connection_timeout)

By doing this, any subsequent call to session.put() will throw a PipeTimeout exception instead of waiting forever.

Hi from 2021,

This bug still bites (using paramiko 2.7.2 via fabric on FreeBSD). :(

Was this page helpful?
0 / 5 - 0 ratings

Related issues

bitprophet picture bitprophet  路  31Comments

colinmcintosh picture colinmcintosh  路  54Comments

ghost picture ghost  路  40Comments

sprenge picture sprenge  路  16Comments

cadedaniel picture cadedaniel  路  19Comments