Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deadlock in paramiko-ng 2.8.10 #145

Open
harixxxx opened this issue Dec 17, 2022 · 6 comments
Open

Deadlock in paramiko-ng 2.8.10 #145

harixxxx opened this issue Dec 17, 2022 · 6 comments

Comments

@harixxxx
Copy link

harixxxx commented Dec 17, 2022

Ran into a deadlock issue. We recently switched from paramiko to paramiko-ng. Not sure if the same problem existed before. There were a large number of threads (about 125) using paramiko at the same time.

  • The application is a pytest based testing framework
  • It calls paramiko.util.log_to_file() to send log output to a file

Almost all the threads using paramiko are stuck at the same place in python logging module as shown below.

Threads stuck here - an example:
  File "/usr/local/lib/python3.7/dist-packages/paramiko/packet.py", line 307, in read_all
    self._check_keepalive()
  File "/usr/local/lib/python3.7/dist-packages/paramiko/packet.py", line 566, in _check_keepalive
    self.__keepalive_callback()
  File "/usr/local/lib/python3.7/dist-packages/paramiko/transport.py", line 1043, in _request
    return x.global_request('[email protected]', wait=False)
  File "/usr/local/lib/python3.7/dist-packages/paramiko/transport.py", line 1071, in global_request
    self._log(DEBUG, 'Sending global request "%r"', kind)
  File "/usr/local/lib/python3.7/dist-packages/paramiko/transport.py", line 1697, in _log
    self.logger.log(level, msg, *args, **kwargs)
  File "/usr/lib/python3.7/logging/__init__.py", line 1449, in log
    self._log(level, msg, args, **kwargs)
  File "/usr/lib/python3.7/logging/__init__.py", line 1519, in _log
    self.handle(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 1529, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 1591, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 903, in handle
    self.acquire()
  File "/usr/lib/python3.7/logging/__init__.py", line 854, in acquire
    self.lock.acquire()

Looks like this thread is the culprit.
It first acquired the logging module's lock and got stuck acquiring paramiko channel's lock.
Some other thread (that's stuck in the logging module) must have acquired the paramiko lock this thread is stuck on,
leading to a deadlock.

The self.stream.flush() below is supposed to be flushing the stream handler set up by paramiko.util.log_to_file(),
but somehow it ends up in  paramiko channel deletion code.

2022-12-10 12:55:00 (UTC) | 7009 |    DEBUG :         | ThreadPoolExecutor-2_0 | Stacktrace of <paramiko.Transport at 0x1401f898 (cipher aes128-ctr, 128 bits) (active; 2 open channel(s))>:
  File "/usr/lib/python3.7/threading.py", line 885, in _bootstrap
    self._bootstrap_inner()
  File "/usr/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.7/dist-packages/paramiko/transport.py", line 1950, in run
    self._channel_handler_table[ptype](chan, m)
  File "/usr/local/lib/python3.7/dist-packages/paramiko/channel.py", line 1174, in _handle_close
    msgs = self._close_internal()
  File "/usr/local/lib/python3.7/dist-packages/paramiko/channel.py", line 1247, in _close_internal
    m1 = self._eof_internal()
  File "/usr/local/lib/python3.7/dist-packages/paramiko/channel.py", line 1240, in _eof_internal
    self._log(DEBUG, 'EOF sent ({})'.format(self._name))
  File "/usr/local/lib/python3.7/dist-packages/paramiko/channel.py", line 1204, in _log
    self.logger.log(level, "[chan " + self._name + "] " + msg, *args)
  File "/usr/lib/python3.7/logging/__init__.py", line 1449, in log
    self._log(level, msg, args, **kwargs)
  File "/usr/lib/python3.7/logging/__init__.py", line 1519, in _log
    self.handle(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 1529, in handle
    self.callHandlers(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 1591, in callHandlers
    hdlr.handle(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 905, in handle
    self.emit(record)
  File "/usr/lib/python3.7/logging/__init__.py", line 1038, in emit
    self.flush()
  File "/usr/lib/python3.7/logging/__init__.py", line 1018, in flush
    self.stream.flush()
  File "/usr/local/lib/python3.7/dist-packages/paramiko/file.py", line 66, in __del__
    self.close()
  File "/usr/local/lib/python3.7/dist-packages/paramiko/channel.py", line 1404, in close
    self.channel.shutdown_write()
  File "/usr/local/lib/python3.7/dist-packages/paramiko/channel.py", line 980, in shutdown_write
    self.shutdown(1)
  File "/usr/local/lib/python3.7/dist-packages/paramiko/channel.py", line 949, in shutdown
    self.lock.acquire()
@harixxxx
Copy link
Author

This is not a deadlock between 2 threads. The transport thread tried to acquire a channel lock a second time, while holding the logging lock, blocking itself & all other threads trying to log.

@ploxiln
Copy link
Owner

ploxiln commented Dec 30, 2022

Very interesting ... __del__ is the destructor which is called automatically before the object (BufferedFile) is freed, possibly because the reference count decremented to zero. Was a logging argument the very last reference to the Channel's BufferedFile? The Channel was already in the process of closing ...

@harixxxx
Copy link
Author

log_to_file() was called by the app to set up paramiko logging to a file.
So paramiko logging should always be writing to that file.
It should not be writing to or deleting a channel's BufferedFile.
How did the logger instance used by paramiko become associated with a channel's BufferedFile?

@ploxiln
Copy link
Owner

ploxiln commented Jan 1, 2023

I don't think the logger instance was explicitly associated with the channel BufferedFile. It still seems to me like the python GC decided to "free" the BufferedFile at that point in the logging code, causing its destructor method to be called. It has been said to be inadvisable to put any significant logic in destructors, and paramiko channel locking has caused tricky issues before.

Anyway, I think the thing to do, is to make the channel close logic detect when it's already closing, and skip it before trying to take that channel lock, in some reasonably elegant way ...

@harixxxx
Copy link
Author

harixxxx commented Jan 1, 2023

This problem happens with moderate frequency in our test environment. Every time the stacktrace is the same - happens when handling MSG_CHANNEL_CLOSE.

@harixxxx
Copy link
Author

@ploxiln Are you planning on fixing this issue?
This other issue also seems to be related - during object destruction:
#148

A more general question: There hasn't been a new release since Jan 2022. Would it be fair to conclude that paramiko-ng will not be actively managed going forward?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants