yoney
Contributor

@yoney yoney commented Oct 13, 2025

This PR is more about explaining the reasoning behind the changes than about the code itself, which is actually quite minimal.


Starting with Windows (the easy one):

  • On Windows, the _uuid module uses UuidCreateSequential(). In the current GIL-enabled build, the GIL is already released while this function is called.
  • Although the documentation doesn’t explicitly state that UuidCreateSequential() is thread-safe, it does mention that if the computer doesn’t have an Ethernet or token ring address, the generated UUID is still a valid identifier and is guaranteed to be unique among all UUIDs generated on that machine. So, at the very least, we have local uniqueness.
  • Since the GIL is already released around this call in the GIL-enabled build, I believe calling UuidCreateSequential() should also be safe in a free-threaded build. I’ve added a test for it.

For the rest of the systems (Linux, macOS, Solaris, AIX, FreeBSD, etc.):

  • On these platforms, we call one of three functions: uuid_generate_time_safe(), uuid_create(), or uuid_generate_time().
  1. libuuid (uuid_generate_time_safe() and uuid_generate_time()):

    • On Linux, macOS, and Solaris, CPython uses libuuid. Although I couldn’t find any explicit statement in the documentation confirming that libuuid is thread-safe, I reviewed the source code and it seems to be thread-safe.
    • Additionally, I found a discussion where the author of libuuid explains why they introduced THREAD_LOCAL while addressing a threading bug. I checked the current version of libuuid, and that threading bug has been fixed.
    • The Solaris man page also marks these functions as MT-Safe—this is the only explicit mention of thread-safety I could find.
    • I also implemented a test to generate UUIDs using libuuid and ran it repeatedly on Linux (with 48 threads) and macOS (with 10 threads) for several hours. I wasn’t able to produce any collisions or observe any threading issues. On Linux, I tested both with and without the uuidd daemon running.
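    A minimal sketch of this kind of stress test, written against the public uuid.uuid1() API rather than the actual test added in this PR. The helper names generate_concurrently() and count_duplicates() are hypothetical, and the thread and iteration counts are placeholders far smaller than the multi-hour runs described above:

```python
import threading
import uuid


def generate_concurrently(n_threads=8, per_thread=1000):
    """Generate version-1 UUIDs from several threads at once."""
    barrier = threading.Barrier(n_threads)  # line the threads up first
    results = [None] * n_threads            # one slot per thread, nothing shared

    def worker(index):
        barrier.wait()  # release all threads together to maximize contention
        results[index] = [uuid.uuid1() for _ in range(per_thread)]

    threads = [
        threading.Thread(target=worker, args=(i,)) for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Flatten the per-thread lists into one list of UUIDs.
    return [u for chunk in results for u in chunk]


def count_duplicates(values):
    """Return how many values collide with an earlier one."""
    return len(values) - len(set(values))


if __name__ == "__main__":
    uuids = generate_concurrently()
    print(f"generated={len(uuids)} duplicates={count_duplicates(uuids)}")
```

    A non-zero duplicate count would point at either a threading problem or the uniqueness limits of the underlying generator, so a run like this is evidence rather than proof of thread safety.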

    Based on all of the above, I changed the uuid module to release the GIL for these libuuid functions, since everything I found indicates they are thread-safe. I did come across some old reported issues, but those have either been fixed in the library itself or addressed in the distributions. However, if there are still concerns about thread safety, to be on the safe side we can remove Py_{BEGIN,END}_ALLOW_THREADS and add a mutex to protect these functions in the free-threading build.

  2. FreeBSD, OpenBSD and AIX (uuid_create())

    • Unfortunately, the AIX documentation doesn’t mention anything about thread safety, and I don’t have access to an AIX machine to test it myself. So, the situation for AIX is unclear.
    • I checked the recent version of the FreeBSD code, and uuid_create() calls the uuidgen() system call which uses a uuid_mutex for synchronization. So, it should be thread-safe.

    I changed the uuid module to release the GIL for uuid_create(). However, the situation on AIX is still unclear, and I’m not sure whether any other systems use uuid_create(). It might be safer to remove Py_{BEGIN,END}_ALLOW_THREADS and protect uuid_create() with a mutex in the free-threading build.


Finally, just a note regarding libuuid: Even though the functions seem thread-safe, that doesn’t always mean the UUIDs will be unique. To guarantee uniqueness of UUIDs, libuuid first tries to use a global clock state file (if the process has permission) and/or the uuidd daemon (if it’s running or can be started). If those aren’t available, it falls back to getrandom(), then /dev/{u,}random, and finally rand(). In the worst case, it’s theoretically possible for two processes running at the same time to generate the same UUID.
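From Python, the closest visible signal for this is the is_safe attribute on UUID objects, which reports whether the platform said the UUID was generated in a multiprocessing-safe way. A small sketch, assuming Python 3.7 or later; describe_safety() is a hypothetical helper and the label strings are my own wording:

```python
import uuid


def describe_safety(u):
    """Map a UUID's is_safe flag to a short human-readable label."""
    labels = {
        uuid.SafeUUID.safe: "generated in a multiprocessing-safe way",
        uuid.SafeUUID.unsafe: "not generated in a multiprocessing-safe way",
        uuid.SafeUUID.unknown: "platform gave no safety information",
    }
    return labels[u.is_safe]


if __name__ == "__main__":
    u = uuid.uuid1()
    print(u, "->", describe_safety(u))
```

Which value comes back depends on the platform and on whether the clock state file or uuidd was usable, so the result can differ between machines.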


cc: @mpage @colesbury @Yhg1s @kumaraditya303 @vstinner

@yoney yoney marked this pull request as ready for review October 13, 2025 23:21
ashm-dev added a commit to ashm-dev/cpython that referenced this pull request Oct 14, 2025
The memcpy() call in _PyInterpreterState_New() was overwriting the
_malloced pointer that was set by alloc_interpreter(), causing a memory
leak when subinterpreters were destroyed.

Fixed by preserving the _malloced pointer across the memcpy().
@vstinner
Member

cc @picnixz

@picnixz picnixz self-requested a review October 14, 2025 11:09
@mpage
Contributor

mpage commented Oct 14, 2025

First of all, thanks a bunch for the detailed and thorough research!

> Based on all of the above, I changed the uuid module to release the GIL for these libuuid functions, since everything I found indicates they are thread-safe ... However, if there are still concerns about thread safety, to be on the safe side we can remove Py_{BEGIN,END}_ALLOW_THREADS and add a mutex to protect these functions in the free-threading build.

I would lean towards not modifying the existing behavior in the with-GIL builds (either by adding a mutex in the free-threaded build, or by leaving the code unmodified, thereby allowing the builds to diverge). Curious to see what others think.

@colesbury
Contributor

> I would lean towards not modifying the existing behavior in the with-GIL builds (either by adding a mutex in the free-threaded build, or by leaving the code unmodified, thereby allowing the builds to diverge).

Yes, I agree

@kumaraditya303
Contributor

> I would lean towards not modifying the existing behavior in the with-GIL builds (either by adding a mutex in the free-threaded build, or by leaving the code unmodified, thereby allowing the builds to diverge).

Agreed. Also, if the functions do not block, releasing and re-acquiring the GIL around them would only add overhead and slow down the GIL-enabled build.

@yoney yoney changed the title from "gh-116738: Make uuid module thread-safe" to "gh-116738: Test uuid module for free threading" Oct 15, 2025
@yoney
Contributor Author

yoney commented Oct 15, 2025

@mpage @colesbury @kumaraditya303 thanks for the comments!

I’ve removed the Py_{BEGIN,END}_ALLOW_THREADS lines and left everything else as it was. With this approach, we’re saying we believe these three functions are thread-safe, so there’s no need to add extra locking. This also covers subinterpreters, so we don’t need to do anything special for them.
