Make sure connection._ready() does not stall on bad file descriptor #885
seamusabshere wants to merge 4 commits into aio-libs:master
With @emk
We've observed "[Errno 2] No such file or directory" coming from add_writer(). When this happens, nobody notifies _waiter that there has been an exception, and the program hangs.
We haven't managed to create a local repro: whenever we (for example) kill -9 a postgres process while an aiopg client is writing to it, we get the intended psycopg2.OperationalError("Connection closed").
Here's the backtrace of the error seen in the wild:
Exception in callback Connection._ready(<weakref at 0...x7efddf120890>) at /root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.7/site-packages/aiopg/connection.py:779
handle: <Handle Connection._ready(<weakref at 0...x7efddf120890>) at /root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.7/site-packages/aiopg/connection.py:779>
Traceback (most recent call last):
  [... app code ...]
  File "/usr/lib/python3.7/asyncio/base_events.py", line 566, in run_until_complete
    self.run_forever()
  File "/usr/lib/python3.7/asyncio/base_events.py", line 534, in run_forever
    self._run_once()
  File "/usr/lib/python3.7/asyncio/base_events.py", line 1771, in _run_once
    handle._run()
  File "/usr/lib/python3.7/asyncio/events.py", line 88, in _run
    self._context.run(self._callback, *self._args)
  File "/root/.local/share/virtualenvs/app-4PlAip0Q/lib/python3.7/site-packages/aiopg/connection.py", line 838, in _ready
    self._fileno, self._ready, weak_self # type: ignore
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 334, in add_writer
    return self._add_writer(fd, callback, *args)
  File "/usr/lib/python3.7/asyncio/selector_events.py", line 294, in _add_writer
    (reader, handle))
  File "/usr/lib/python3.7/selectors.py", line 389, in modify
    self._selector.modify(key.fd, selector_events)
FileNotFoundError: [Errno 2] No such file or directory
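The failure mode described above can be illustrated with a minimal sketch. It is not the code from this PR or from aiopg: `register_writer` and `BrokenLoop` are hypothetical names introduced only for illustration, and the stub loop simply raises the same `FileNotFoundError` seen in the backtrace. The idea is that when `add_writer()` raises, the error is forwarded to the waiter future so that whoever awaits it is woken up instead of hanging.

```python
import asyncio


def register_writer(loop, fileno, callback, waiter):
    """Register `callback` for write-readiness on `fileno`.

    Hypothetical helper (not aiopg's API). If registration fails with an
    OSError, the exception is set on `waiter` so that the coroutine
    awaiting it sees the error instead of waiting forever.
    Returns True if the writer was registered, False otherwise.
    """
    try:
        loop.add_writer(fileno, callback)
    except OSError as exc:
        # Only notify the waiter if nobody has resolved it yet.
        if not waiter.done():
            waiter.set_exception(exc)
        return False
    return True


class BrokenLoop:
    """Stand-in for an event loop whose add_writer() always fails.

    This is an assumption made for the demo; it mimics the
    FileNotFoundError from the backtrace without touching real sockets.
    """

    def add_writer(self, fd, callback, *args):
        raise FileNotFoundError(2, "No such file or directory")


async def demo():
    # The waiter lives on the real running loop; only add_writer is faked.
    waiter = asyncio.get_running_loop().create_future()
    registered = register_writer(BrokenLoop(), 7, lambda: None, waiter)
    try:
        # The timeout guards against a hang if the waiter were never notified.
        await asyncio.wait_for(waiter, timeout=1)
    except FileNotFoundError as exc:
        return registered, exc.errno
    return registered, None


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Without the `except OSError` branch, the exception would escape the callback, the waiter would never be resolved, and the awaiting coroutine would block until the timeout. Using a stub object for the loop is one possible way to exercise this path in a test without a real bad file descriptor.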
this was causing sub-dependency version mismatch errors when updating to python 3.10
update async_timeout version to match upstream
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #885      +/-   ##
==========================================
- Coverage   93.32%   93.21%   -0.12%
==========================================
  Files          12       12
  Lines        1574     1577       +3
  Branches      187      187
==========================================
+ Hits         1469     1470       +1
- Misses         73       75       +2
  Partials       32       32
Co-authored-by: Andrew Svetlov <andrew.svetlov@gmail.com>
@asvetlov updated, thanks for your suggestion!
@asvetlov as I said in the description, I've found this to be impossible to test. Can we get it merged without tests? Note that we have been running this in production since the original PR.
Thank you to everyone for following up on this! I can confirm that this patch appears to have entirely fixed a bug where I'm not quite sure how to test it without mocking core Python network code to artificially fail. Here are the outstanding TODO items: