Error in AST recursion depth tracking change of gh-95185 #106905
Comments
I can confirm this issue. It causes a bunch of INTERNALERROR failures in some specific cases with py.test when it tries to report error traces. Reverting 0047447 locally solves the problem (the actual error is properly reported).
Might this be the root cause of sqlalchemy/mako#378, pytest-dev/pytest#10874, home-assistant/core#95364, bieniu/ha-shellies-discovery#448 and other projects reporting
@serhiy-storchaka do you perhaps have some information to help this issue along?
…#106906) * gh-106905: avoid incorrect SystemError about recursion depth mismatch * Update Misc/NEWS.d/next/Core and Builtins/2023-07-20-11-41-16.gh-issue-106905.AyZpuB.rst --------- Co-authored-by: Shantanu Co-authored-by: Serhiy Storchaka
…smatch (pythonGH-106906) * pythongh-106905: avoid incorrect SystemError about recursion depth mismatch * Update Misc/NEWS.d/next/Core and Builtins/2023-07-20-11-41-16.gh-issue-106905.AyZpuB.rst --------- (cherry picked from commit 1447af7) Co-authored-by: Markus Mohrhard Co-authored-by: Shantanu Co-authored-by: Serhiy Storchaka
Thanks @mmohrhard for the fix and @pablogsal for reviewing and merging it. Any thoughts on the conditions under which this bug is likely to occur? We are also seeing the "recursion depth mismatch" error, but only in production. We are trying to construct a reproducible example so we can be more confident that bringing in #106906 would fix our issues in production.
…epth mismatch (python#106906) Backport of 1447af7 from python#106906. * pythongh-106905: avoid incorrect SystemError about recursion depth mismatch * Update Misc/NEWS.d/next/Core and Builtins/2023-07-20-11-41-16.gh-issue-106905.AyZpuB.rst --------- Co-authored-by: Shantanu Co-authored-by: Serhiy Storchaka
Agreed that a regression test for this would be good to have.
#106906 didn't fix all errors in our case. We found a very large test that reproduces the error about 1-2% of the time. Upon investigation, I discovered that during the lifetime of the
This means it's not safe to use the shared
I constructed a reliable reproducible example:

import ast
import difflib
import gc
import pathlib
import random
import threading
import unittest


def gc_callback(phase, info):
    # Runs Python code (and allocates) each time the garbage collector fires.
    del phase, info
    seq_a = [str(random.randint(0, 10)) + "\n" for _ in range(100)]
    seq_b = [str(random.randint(0, 10)) + "\n" for _ in range(100)]
    udiff = list(difflib.unified_diff(seq_a, seq_b))
    udiff.append("THE END")


gc.callbacks.append(gc_callback)

NUM_RUNS = 100


class AstTest(unittest.TestCase):

    def setUp(self):
        super().setUp()
        self.error_happened = None

    def ast_parse(self, content):
        try:
            for _ in range(100):
                ast.parse(content, "source")
        except SystemError as e:
            # Record the error so the main thread can assert on it.
            self.error_happened = e

    def test_ast(self):
        content = pathlib.Path(__file__).read_text()
        for i in range(NUM_RUNS):
            print(f"Run: {i}/{NUM_RUNS}", flush=True)
            # Parse the same source from 100 threads at once.
            ts = []
            for _ in range(100):
                t = threading.Thread(target=self.ast_parse, args=(content,))
                t.start()
                ts.append(t)
            for t in ts:
                t.join()
            self.assertIsNone(self.error_happened)


if __name__ == "__main__":
    unittest.main()

I have a patch to make I can send a PR, but I'm not sure what's the best way to construct the test / where to add the test. Any pointers?
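For readers unfamiliar with the hook the reproducer above relies on, here is a minimal sketch of gc.callbacks on its own. The names observe_collection and record_phase are illustrative and do not come from the thread. It only shows that registered callbacks are invoked with a "start" and a "stop" phase around a collection, which is what lets the reproducer run extra Python code whenever the collector fires.

```python
import gc


def observe_collection():
    """Return the phases reported to a GC callback during one forced collection."""
    phases = []

    def record_phase(phase, info):
        # `info` is a dict with keys such as "generation" and "collected".
        phases.append(phase)

    gc.callbacks.append(record_phase)
    try:
        gc.collect()
    finally:
        # Always unregister so the callback does not outlive this call.
        gc.callbacks.remove(record_phase)
    return phases


if __name__ == "__main__":
    print(observe_collection())
```

This says nothing about when an automatic collection is triggered, which is the part that differs between 3.11 and 3.12 according to the comments in this thread.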
Hmm, this is only reproducible on Python 3.11, not on 3.12. The bug reports referenced here are all for Python 3.11 too. Looks like #97922 is relevant? Any other chances the GIL might still get released during
Agreed, I see this on 3.11 builds but not on 3.12.1+. This is one of those situations where a reliable regression test is hard to construct, because failure requires triggering a threading-related race condition whose exact conditions are difficult to set up. If we want a regression test checked in at all (we may not)... I'd add such a test class within
Reproducing race conditions is hard and highly environment dependent. I can reproduce it reliably within the first iteration in a couple of seconds on a few different non-cloud-VM low-spec machines, but it only reproduces some of the time within 100 iterations on my modern high-spec cloud VM (after spending 3 minutes trying). If it ever pops up as failing it'll do so in a flaky manner, but it would still serve as a sign that something needs investigation.
That GC behavior change in 3.12+ does sound related to it not happening there. I suspect this is still worth fixing in main so that this depth tracking state does not rely on the GIL regardless, as it could possibly show up again in free-threaded PEP 703 builds?
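To illustrate why depth tracking state that is shared between conversions is fragile, here is a deliberately simplified Python model. It is not the CPython implementation: DepthState, convert and run_interleaved are hypothetical names, and a generator `yield` stands in for the point where another thread gets to run in the middle of a conversion.

```python
class DepthState:
    """Hypothetical stand-in for the state that holds the recursion depth."""

    def __init__(self):
        self.recursion_depth = 0


def convert(state, levels):
    """Model one conversion; the `yield` marks where another thread could run."""
    start = state.recursion_depth
    state.recursion_depth += levels  # descend into nested nodes
    yield
    state.recursion_depth -= levels  # unwind again
    # Report whether the depth balanced from this conversion's point of view.
    return state.recursion_depth == start


def run_interleaved(state_a, state_b):
    """Run two conversions so that B starts while A is paused mid-way."""
    a = convert(state_a, 3)
    b = convert(state_b, 2)
    next(a)  # A descends and pauses
    next(b)  # B descends while A is paused
    results = []
    for gen in (a, b):
        try:
            next(gen)
        except StopIteration as stop:
            results.append(stop.value)
    return results


if __name__ == "__main__":
    shared = DepthState()
    print("shared state:  ", run_interleaved(shared, shared))
    print("per-call state:", run_interleaved(DepthState(), DepthState()))
```

With one shared DepthState both conversions see a mismatch, while giving each conversion its own state keeps the count balanced. That is the general idea behind tracking the depth per call rather than in shared state.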
…on depths in PyAST_mod2obj.
I created a draft PR: #113035 with the implementation. Will wait for guidance on whether we need this in main for free-threaded builds.
In 3.12 I changed the GC to run only on the eval breaker, so it no longer runs on object allocation. This may be the reason this doesn't reproduce in 3.12 (if I understand correctly).
I've confirmed that the draft PR #113035 backported to 3.11 fixes the problem on a machine that otherwise reproduces it immediately with the above test code. 3.12 "got lucky" with that GC change and likely avoided this bug being possible (or at least made it far more rare, if there are any other paths that could similarly enable it). yilei is also testing the PR on our 3.11 runtime at work to ensure it fixes our users' problems.
After patching our 3.11 runtime two days ago at work, I can confirm #113035 fixed the issue for us.
…T_mod2obj call. (GH-113035) Co-authored-by: Gregory P. Smith [Google LLC]
… in each PyAST_mod2obj call. (pythonGH-113035) (cherry picked from commit 48c4973) Co-authored-by: Yilei Yang Co-authored-by: Gregory P. Smith [Google LLC]
…ch PyAST_mod2obj call. (GH-113035) (GH-113472) (cherry picked from commit 48c4973) Co-authored-by: Yilei Yang Co-authored-by: Gregory P. Smith [Google LLC]
…n depth in each PyAST_mod2obj call. (pythonGH-113035) (pythonGH-113472) (cherry picked from commit 48c4973) (cherry picked from commit d58a5f4) Co-authored-by: Serhiy Storchaka Co-authored-by: Yilei Yang Co-authored-by: Gregory P. Smith [Google LLC]
…ch PyAST_mod2obj call. (GH-113035) (GH-113472) (GH-113476) (cherry picked from commit 48c4973) (cherry picked from commit d58a5f4) Co-authored-by: Yilei Yang Co-authored-by: Gregory P. Smith [Google LLC]
…h PyAST_mod2obj call. (pythonGH-113035) Co-authored-by: Gregory P. Smith [Google LLC]
…smatch (python#106906) * pythongh-106905: avoid incorrect SystemError about recursion depth mismatch * Update Misc/NEWS.d/next/Core and Builtins/2023-07-20-11-41-16.gh-issue-106905.AyZpuB.rst --------- Co-authored-by: Shantanu Co-authored-by: Serhiy Storchaka
Applies patch for python/cpython#106905
Bug report
The change in 0047447 seems to miss the recursion depth adjustment in case of an error. As an example for some of the generated code:
Note that the failed code path is missing the state->recursion_depth--; statement.
I found this as I'm trying to track down where spurious SystemError: AST constructor recursion depth mismatch errors in Python 3.11 are coming from. E.g.
Your environment
Reproduced in Python 3.11 but the code in main looks the same.
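The generated C code referred to above is not included on this page. As a rough Python model of the pattern being described, and only that, the sketch below shows how a depth counter drifts when an error path skips the decrement. State, ConversionError, convert_buggy, convert_fixed and depth_after_failure are all hypothetical names. convert_fixed shows one way to keep the counter balanced and is not necessarily what the linked PRs do.

```python
class State:
    """Hypothetical stand-in for the module state holding the depth counter."""

    def __init__(self):
        self.recursion_depth = 0


class ConversionError(Exception):
    pass


def convert_buggy(node, state):
    """The failure path leaves without decrementing the counter."""
    state.recursion_depth += 1
    if node.get("bad"):
        # Early exit, analogous to jumping to `failed` with no decrement.
        raise ConversionError("bad node")
    for child in node.get("children", []):
        convert_buggy(child, state)
    state.recursion_depth -= 1


def convert_fixed(node, state):
    """The counter is restored on every exit path, including errors."""
    state.recursion_depth += 1
    try:
        if node.get("bad"):
            raise ConversionError("bad node")
        for child in node.get("children", []):
            convert_fixed(child, state)
    finally:
        state.recursion_depth -= 1


def depth_after_failure(convert):
    """Convert a tree whose innermost node fails and return the leftover depth."""
    state = State()
    tree = {"children": [{"children": [{"bad": True}]}]}
    try:
        convert(tree, state)
    except ConversionError:
        pass
    return state.recursion_depth


if __name__ == "__main__":
    print("buggy:", depth_after_failure(convert_buggy))
    print("fixed:", depth_after_failure(convert_fixed))
```

In the buggy variant the counter stays above its starting value after a failed conversion, so a later "did the depth balance?" check would fail even though the real problem was the original error.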
Linked PRs