-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added exception translator specific mutex used with try_translate_exceptions #5362
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like the right approach to me.
- I'd consider passing
internals.registered_exception_translators
to the callback and maybe rename it towith_exception_translators
(instead of passing the internals struct). - I think this is an ABI change for the free-threaded build due to the new mutex in
internals
, so I thinkPYBIND11_INTERNALS_VERSION
needs to be incremented. It may make sense to only change it for the free-threaded build. - A regression test derived from your bug report might be a good idea
f20ebd9
to
591cb73
Compare
This LGTM, but needs review from the pybind11 maintainers |
Thanks for the feedback @colesbury ! |
(I'll look asap. I just changed companies and I'm still in the middle of getting set up. At the moment I don't even see when I'm getting tagged on GitHub, not sure why.) |
@@ -39,7 +39,11 @@ | |||
# if PY_VERSION_HEX >= 0x030C0000 || defined(_MSC_VER) | |||
// Version bump for Python 3.12+, before first 3.12 beta release. | |||
// Version bump for MSVC piggy-backed on PR #4779. See comments there. | |||
# define PYBIND11_INTERNALS_VERSION 5 | |||
# ifdef Py_GIL_DISABLED | |||
# define PYBIND11_INTERNALS_VERSION 6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless #5296 is merged first, this will lead to incompatibility between extension modules compiled with or without Py_GIL_DISABLED
defined. — EDIT next day: THIS IS NOT AN ISSUE. See comments posted after this one.
I just only now realize ... Huston we have a problem?
We already added pymutex mutex
to struct internals
but didn't change PYBIND11_INTERNALS_VERSION
. Ouch? Sorry I missed that before.
What do we want here? I'm very uncertain, @colesbury could you please advise?
struct internals {
#if PY_VERSION_HEX >= 0x030D0000
pymutex mutex;
pymutex exception_translator_mutex;
#endif
...
Could that work? The idea here is that struct internals
is the same with and without Py_GIL_DISABLED
defined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Free-threading is experimental so changing the ABI before was fine. Now I think packages might be beginning to ship, so probably not such a great idea to change without a bump. But not fond of these confusing bumps. The ABI is currently a mess, by the way - the Windows only bump broke conda-forge and the only solution was to pin <2.12, which now is breaking packages (see the feedstock for more info).
Will the internals version matter after #5296? Could we just basically do away with the concept, and only do compiler/stdlib, etc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just basically do away with the concept
Not entirely, I think. The internals versions still need to match for the inheritance functionality to work correctly. (I'd probably need a day or two to figure out conclusively if that's fixable or just fundamentally impossible.)
But the more I think about it, the more I'm getting worried about having a different struct internals
with and without Py_GIL_DISABLED
defined:
-
pybind11/include/pybind11/detail/internals.h
Lines 177 to 180 in 5efc743
struct internals { #ifdef Py_GIL_DISABLED pymutex mutex; #endif -
pybind11/include/pybind11/detail/internals.h
Lines 185 to 190 in 5efc743
#ifdef Py_GIL_DISABLED std::unique_ptr<instance_map_shard[]> instance_shards; // void * -> instance* size_t instance_shards_mask; #else instance_map registered_instances; // void * -> instance* #endif
What happens if there is one extension built with the GIL enabled, another built with free-threading, and both are imported into the same process?
Currently we only have this:
pybind11/include/pybind11/detail/common.h
Lines 390 to 405 in 5efc743
#define PYBIND11_CHECK_PYTHON_VERSION \ { \ const char *compiled_ver \ = PYBIND11_TOSTRING(PY_MAJOR_VERSION) "." PYBIND11_TOSTRING(PY_MINOR_VERSION); \ const char *runtime_ver = Py_GetVersion(); \ size_t len = std::strlen(compiled_ver); \ if (std::strncmp(runtime_ver, compiled_ver, len) != 0 \ || (runtime_ver[len] >= '0' && runtime_ver[len] <= '9')) { \ PyErr_Format(PyExc_ImportError, \ "Python version mismatch: module was compiled for Python %s, " \ "but the interpreter version is incompatible: %s.", \ compiled_ver, \ runtime_ver); \ return nullptr; \ } \ }
Do we need to also check for Py_GIL_DISABLED
compatibility there?
Or do we have to do something here?
pybind11/include/pybind11/detail/internals.h
Lines 324 to 332 in 5efc743
#define PYBIND11_INTERNALS_ID \ "__pybind11_internals_v" PYBIND11_TOSTRING(PYBIND11_INTERNALS_VERSION) \ PYBIND11_INTERNALS_KIND PYBIND11_COMPILER_TYPE PYBIND11_STDLIB \ PYBIND11_BUILD_ABI PYBIND11_BUILD_TYPE "__" #define PYBIND11_MODULE_LOCAL_ID \ "__pybind11_module_local_v" PYBIND11_TOSTRING(PYBIND11_INTERNALS_VERSION) \ PYBIND11_INTERNALS_KIND PYBIND11_COMPILER_TYPE PYBIND11_STDLIB \ PYBIND11_BUILD_ABI PYBIND11_BUILD_TYPE "__"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if there is one extension built with the GIL enabled, another built with free-threading, and both are imported into the same process?
You can't have extensions compiled with and without Py_GIL_DISABLED
in the same process. An extension might require the GIL to be dynamically enabled, but that's different: that's a matter of calling (or not calling) PyUnstable_Module_SetGIL()
. The extension will still need to be compiled with Py_GIL_DISABLED
to be loaded in 3.13t.
In other words:
Py_GIL_DISABLED
is part of the extension ABI, similar toPy_VERSION_HEX
- It's part of the wheel tag (i.e., the "t" in
cp313-cp313t
), just like the version number
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or as @henryiii wrote in #5148 (comment): "A free-threaded build can't talk to a non-free-threaded build".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I should have built using same python version, e.g. 3.13t and 3.13.
Doing so, above example is giving either a segfault:
$ gdb --args python3.13t main.py
Starting program: /usr/bin/python3.13t main.py
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7be1f0b in pybind11::dict::dict(pybind11::object&&) () from /pybind11_playground/ft_or_not_ft/test_gil.so
(gdb) bt
#0 0x00007ffff796bf0b in pybind11::dict::dict(pybind11::object&&) () from /pybind11_playground/ft_or_not_ft/test_gil.so
#1 0x00007ffff79683c5 in pybind11::detail::get_internals() () from /pybind11_playground/ft_or_not_ft/test_gil.so
#2 0x00007ffff7967bdd in PyInit_test_gil () from /pybind11_playground/ft_or_not_ft/test_gil.so
#3 0x00000000005e7800 in ?? ()
or
$ python3.13 main.py
Traceback (most recent call last):
File "/pybind11_playground/ft_or_not_ft/main.py", line 4, in <module>
import test_ft
ImportError: /pybind11_playground/ft_or_not_ft/test_ft.so: undefined symbol: _Py_DecRefShared
If this is again unrelated sorry for the noise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot! That really helps me understand the situation better. So I think what we need to do is work on this code:
pybind11/include/pybind11/detail/common.h
Lines 390 to 405 in 5efc743
#define PYBIND11_CHECK_PYTHON_VERSION \ | |
{ \ | |
const char *compiled_ver \ | |
= PYBIND11_TOSTRING(PY_MAJOR_VERSION) "." PYBIND11_TOSTRING(PY_MINOR_VERSION); \ | |
const char *runtime_ver = Py_GetVersion(); \ | |
size_t len = std::strlen(compiled_ver); \ | |
if (std::strncmp(runtime_ver, compiled_ver, len) != 0 \ | |
|| (runtime_ver[len] >= '0' && runtime_ver[len] <= '9')) { \ | |
PyErr_Format(PyExc_ImportError, \ | |
"Python version mismatch: module was compiled for Python %s, " \ | |
"but the interpreter version is incompatible: %s.", \ | |
compiled_ver, \ | |
runtime_ver); \ | |
return nullptr; \ | |
} \ | |
} |
We want to catch if there is a mismatch between the Python interpreter kind (gil, free-threading) and the way the extension was built.
In a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I think packages might be beginning to ship, so probably not such a great idea to change without a bump. But not fond of these confusing bumps.
Yeah ... "version bump chaos"
Trying to summarize how I understand the situation:
-
The internals version wasn't changed for Python 3.13t — but that's ok because there were no released packages anyway that could have been built with a non-matching
struct internals
. -
Assuming there are released packages for 3.13t now, yes, it would indeed be safest to bump the internals version for 3.13t. — But pegging that on
Py_GIL_DISABLED
will make it very difficult for the general user community to reason about the internals version.
If this PR gets merged as is, we'll have internals version
4
for Python < 3.12 if_MSC_VER
is not defined.5
for Python 3.12, or 3.13, or if_MSC_VER
is defined.6
for Python 3.13t only.
Who can (or wants to) understand that? :-)
And then we also still have #5339 (smart_holder) and #5280 (native enum) that I'd really like to see merged.
I believe this is a situation where the maintainers need to step up to balance "making progress" with "minimizing confusion".
@henryiii wrote:
Free-threading is experimental
That makes me think, let's not mess with the internals version in this PR:
- Maybe a small number of users experimenting will stumble,
- but a much larger number of production users will not have to understand the weirdness summarized above.
WDYT?
Looking beyond that question, I feel it'd be best to approach this strategically:
- Merge feat: enable type-safe interoperability between different independent Python/C++ bindings systems. #5296 asap (only needs a small amount of work).
- Merge [Preview 3] Merge smart_holder into master (squash merge) #5339, [smart_holder] Conversions between Python's native (stdlib) enum types and C++ enums. #5280 as closely together in time as possible and start the pybind11 v3 series, with internals version
6
unconditionally.
That's a discussion far beyond this PR, but I'm mentioning it here because it would be another reason to not mess with the internals version here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes me think, let's not mess with the internals version in this PR
I'm concerned that landing this PR without changing the version number will cause problems not just for users but also package maintainers (e.g., scipy, PyTorch) that use pybind11 and support free-threading. The failure modes for different internals can be complicated -- in this case, I think you run the risk of treating uninitialized memory like a mutex.
I think the approach in this PR is a good long term fix, but if the version bump is the issue we can first pursue a different approach that avoids changing the internals struct (and consider landing this change later when #5280 is ready).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First priority, I want to be helpful. Second minimize potential for confusion.
Unfortunately I'm technically somewhat handicapped at the moment (no Linux workstation), but let me try to get #5296 merged asap, with @henryiii's formal approval, to be sure we're on the same.
After that version bumps will have a lot less impact, and it won't matter as much that people fully understand the numbering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me, based on @colesbury's review; I don't fully understand the mutex issue (#5346) TBH; I won't have enough free cycles for a while to really get my head around this.
PR #5296 was merged, which means we don't have to worry about the internals version bump so much.
…translators - Incremented PYBIND11_INTERNALS_VERSION - Added a test
591cb73
to
8275a6a
Compare
@rwgk I applied the suggestions to this PR. Anything else I need to do here or it is good to be merged? |
Could you please add a Changlog entry to the PR description? Please see other open PRs for the template. |
@rwgk I updated the description of the PR with a changelog entry, feel free to update it if needed or let me know how I can update it. Thanks! |
Description
try_translate_exceptions
andregister_exception_translator
,register_local_exception_translator
.Fixes #5346
cc @colesbury
Suggested changelog entry: