Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-70764: inspect.getclosurevars now identifies global variables with LOAD_GLOBAL #120143

Merged
merged 4 commits into from
Nov 5, 2024

Conversation

blhsing
Copy link
Contributor

@blhsing blhsing commented Jun 6, 2024

inspect.getclosurevars now identifies global variables with LOAD_GLOBAL

Currently inspect.getclosurevars identifies a global variable by testing if a name in co_names exists in the function's global namespace, but this approach can result in incorrectly classifying an attribute name as a global variable when the name exists both as an attriute and as a global variable.

This PR fixes the issue by using the LOAD_GLOBAL and LOAD_ATTR instructions instead to identify global names and unbound names, respectively.

@carljm
Copy link
Member

carljm commented Jun 6, 2024

Thanks for the contribution!

This could make getclosurevars much slower if run on a very large function, because bytecode size can scale much faster than the set of names used.

In particular, if a large function has lots of references to builtins (even the same builtin used many times) we will repeatedly raise and catch an exception, possibly many times for the same builtin name. This repeated exception raising could at least be mitigated by keeping a set of seen global names, and not re-looking-up an already-seen name (or just avoiding re-looking-up names already present in builtin_vars).

That said, I'm not sure I see any other implementation strategy for fixing this function; we just don't preserve enough context from compilation on the code object, except in the bytecode itself.

Would like to see if any other reviewer has better ideas here, before merging this.

@blhsing
Copy link
Contributor Author

blhsing commented Jun 7, 2024

This could make getclosurevars much slower if run on a very large function, because bytecode size can scale much faster than the set of names used.

In particular, if a large function has lots of references to builtins (even the same builtin used many times) we will repeatedly raise and catch an exception, possibly many times for the same builtin name. This repeated exception raising could at least be mitigated by keeping a set of seen global names, and not re-looking-up an already-seen name (or just avoiding re-looking-up names already present in builtin_vars).

That said, I'm not sure I see any other implementation strategy for fixing this function; we just don't preserve enough context from compilation on the code object, except in the bytecode itself.

Would like to see if any other reviewer has better ideas here, before merging this.

Thanks for the review! I totally agree with your assessments. I've updated the approach so that it collects all global names in a set first before looking them up in global and builtin namespaces. And yeah unfortunately I don't see a way not to scan through all the bytecodes if we are to aim for accuracy.

Copy link
Member

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will still be a slowdown, but not raising a lot of exceptions should help a lot, and inspect methods are often slow; correctness seems more important. And no better idea for how to make this method correct has surfaced; I don't think there is one. Going to go ahead and rebase this to make sure no other changes have impacted it in the last months, and then merge if signal looks good. Thanks again for the contribution!

@carljm carljm merged commit 83ba8c2 into python:main Nov 5, 2024
36 checks passed
@carljm carljm added the needs backport to 3.13 bugs and security fixes label Nov 5, 2024
@miss-islington-app
Copy link

Thanks @blhsing for the PR, and @carljm for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

@carljm carljm added the needs backport to 3.12 bug and security fixes label Nov 5, 2024
@miss-islington-app
Copy link

Thanks @blhsing for the PR, and @carljm for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 5, 2024
…s with LOAD_GLOBAL (pythonGH-120143)

(cherry picked from commit 83ba8c2)

Co-authored-by: blhsing <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Nov 5, 2024

GH-126459 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Nov 5, 2024
miss-islington pushed a commit to miss-islington/cpython that referenced this pull request Nov 5, 2024
…s with LOAD_GLOBAL (pythonGH-120143)

(cherry picked from commit 83ba8c2)

Co-authored-by: blhsing <[email protected]>
@bedevere-app
Copy link

bedevere-app bot commented Nov 5, 2024

GH-126460 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Nov 5, 2024
carljm pushed a commit that referenced this pull request Nov 6, 2024
…es with LOAD_GLOBAL (GH-120143) (#126460)

gh-70764: inspect.getclosurevars now identifies global variables with LOAD_GLOBAL (GH-120143)
(cherry picked from commit 83ba8c2)

Co-authored-by: blhsing <[email protected]>
carljm pushed a commit that referenced this pull request Nov 6, 2024
…es with LOAD_GLOBAL (GH-120143) (#126459)

gh-70764: inspect.getclosurevars now identifies global variables with LOAD_GLOBAL (GH-120143)
(cherry picked from commit 83ba8c2)

Co-authored-by: blhsing <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants