Fixed data race in all_type_info in free-threading mode #5419

vfdev-5 · 2024-10-24T12:47:42Z

Description

- Fixed data race in all_type_info in free-threading mode
- Added test

For example, suppose two threads enter `all_type_info`. Both call `all_type_info_get_cache`; the first one inserts the pair (type, empty_vector) into the map while the second waits. The inserting thread gets (iter_to_key, true), and the non-inserting thread, after waiting, gets (iter_to_key, false). The inserting thread then adds a weakref and calls `all_type_info_populate`. The non-inserting thread, however, skips the `if (ins.second) {` clause and returns `ins.first->second`, which is still the empty vector. Finally, the non-inserting thread fails the check in `allocate_layout`:

if (n_types == 0) {
    pybind11_fail(
        "instance allocation failed: new instance has no pybind11-registered base types");
}

On master, running this test gives:

terminate called after throwing an instance of 'std::runtime_error'
  what():  instance allocation failed: new instance has no pybind11-registered base types
Fatal Python error: Aborted
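
The pattern described above can be sketched in isolation. The following is a minimal illustration, not pybind11's actual code: `TypeCache`, `populate`, `lookup_size`, and `count_empty_lookups` are hypothetical names, and a plain `std::mutex` stands in for the internals lock. It assumes the remedy is to populate the freshly inserted entry while the lock is still held, so that a thread losing the insertion race cannot observe the empty vector.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-type cache; not pybind11's real data structure.
struct TypeCache {
    std::mutex mutex;
    std::unordered_map<int, std::vector<int>> entries;

    // Stand-in for the populate step: fills the freshly inserted vector.
    static void populate(int key, std::vector<int> &bases) { bases.push_back(key); }

    // Insert and populate under one lock. A thread whose emplace() reports
    // `second == false` only gets the lock after the inserter has populated.
    std::size_t lookup_size(int key) {
        std::lock_guard<std::mutex> guard(mutex);
        auto ins = entries.emplace(key, std::vector<int>());
        if (ins.second) {
            populate(key, ins.first->second);
        }
        return ins.first->second.size();
    }
};

// Runs `n_threads` concurrent lookups of the same key and counts how many
// of them observed an empty entry.
inline int count_empty_lookups(int n_threads) {
    TypeCache cache;
    std::vector<std::size_t> seen(static_cast<std::size_t>(n_threads), 0);
    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; ++i) {
        // Each thread writes only its own slot of `seen`.
        threads.emplace_back([&cache, &seen, i] {
            seen[static_cast<std::size_t>(i)] = cache.lookup_size(42);
        });
    }
    for (auto &t : threads) {
        t.join();
    }
    int empty = 0;
    for (std::size_t s : seen) {
        if (s == 0) {
            ++empty;
        }
    }
    return empty;
}
```

If `populate` were instead called after the lock is released, a second thread could acquire the lock in between, see `ins.second == false`, and return the still-empty vector, which is the failure mode reported here.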

Suggested changelog entry:

cc @colesbury

colesbury · 2024-10-24T18:07:51Z

I think we may need a further rework of all_type_info(). It's used by get_type_info() and some of the callers to that function modify the returned detail::type_info*.

I'm not entirely sure what the locking strategy should be here. It's possible that all the callers to get_type_info() and all_type_info() should hold the internals lock.

colesbury · 2024-10-24T21:10:45Z

On further thought, I think this approach makes sense. I don't think we need to rework the locking strategy and can address the other data races on type_info fields separately.

I'll write up a summary of the issues.

vfdev-5 · 2024-11-04T17:30:49Z

@rwgk can you please review this PR?

rwgk · 2024-11-05T05:14:49Z

Will try tomorrow.

@cryos could you please also take a look?

rwgk

@colesbury could you please formally approve?

Looks good to me, but I'm not a free-threading expert. I'll merge when I see approvals from @colesbury and @cryos.

include/pybind11/detail/type_caster_base.h

include/pybind11/pybind11.h

tests/pybind11_tests.h

tests/test_class.py

rwgk

Looks good to me, but waiting for approvals from @colesbury and @cryos.

colesbury

Yes, this LGTM.

I'll try to put up a follow up PR later this week for the remaining issues described in #5421.

vfdev-5 force-pushed the fix-all_type_info_populate-free-threading branch 2 times, most recently from d98dd4b to 8beaa20 Compare October 24, 2024 13:31

vfdev-5 changed the title ~~Fix data race all_type_info_populate in free-threading mode~~ Fixed data race in all_type_info in free-threading mode Oct 24, 2024

vfdev-5 force-pushed the fix-all_type_info_populate-free-threading branch from 7d1a270 to 1a07f3c Compare October 24, 2024 13:38

vfdev-5 force-pushed the fix-all_type_info_populate-free-threading branch from ef3fb1c to 6ab21db Compare October 24, 2024 13:42

style: pre-commit fixes

adbbbba

vfdev-5 force-pushed the fix-all_type_info_populate-free-threading branch from cdc1663 to adbbbba Compare October 24, 2024 14:03

colesbury mentioned this pull request Oct 24, 2024

[BUG]: type_info data races in free threaded Python #5421

Merge branch 'master' into fix-all_type_info_populate-free-threading

68dfe28

rwgk reviewed Nov 6, 2024

Addressed PR comments

22723d0

vfdev-5 force-pushed the fix-all_type_info_populate-free-threading branch from 4336d12 to 22723d0 Compare November 6, 2024 13:13

Merge branch 'master' into fix-all_type_info_populate-free-threading

e500447

rwgk approved these changes Nov 6, 2024

colesbury approved these changes Nov 6, 2024
