Skip to content

gh-91349: Expose the crc32 function from the lzma library #131721

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

morotti
Copy link
Contributor

@morotti morotti commented Mar 25, 2025

Hello,

This is a simple PR to expose the crc32 function from the lzma library.
The code and tests are very similar to the crc in binascii and zlib. I don't think it's too controversial.

I came across this while looking at compression libraries for #91349
crc32 is always available from the lzma library (xz-utils/liblzma.so). I think it should be exposed.
there are more hashing functions available (namely crc64) but they need compilation/runtime checks to verify whether they are available.

Regards.


📚 Documentation preview 📚: https://cpython-previews--131721.org.readthedocs.build/

@morotti morotti force-pushed the rmorotti-lzma-crc32 branch from 1b298bd to ad43096 Compare April 17, 2025 13:32
@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

sorry for delay,
@merwok @AA-Turner I started over from the main branch and made all the adjustments.

all the docstring/comments were copy pasted from the zlib function. I guess the old function had an old style that is no longer wanted.

for the return type in the docstring, we can settle with "positive integer"

https://docs.python.org/3/library/zlib.html#zlib.crc32
the docstring on zlib.crc32 says "The result is an unsigned 32-bit integer." but there is no concept of unsigned or 32 bits for python int.

Copy link
Member

@AA-Turner AA-Turner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

morotti and others added 3 commits April 17, 2025 14:53
Co-authored-by: Adam Turner <[email protected]>
Co-authored-by: Adam Turner <[email protected]>
Co-authored-by: Adam Turner <[email protected]>
@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

@AA-Turner , I merged all your suggestions.

for the versionadded: next, I thought changes on main would ship in 3.14? I am too late for 3.14 release?

@AA-Turner
Copy link
Member

'next' is updated by the Release Manager for each version, it's a convenience tool that we use internally. We can still get this in for 3.14, yes.

A

@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

regarding performance, don't mind that for now, I will come back later.

in principle

  • zlib is abandoned and the slowest.
  • liblzma is 2 or 3 times faster.
  • zlib-ng is two orders of magnitude faster. it's using hardware instructions.

liblzma had code to start using hardware crc instructions soon ... and I see they just made a new stable release few days ago https://github.com/tukaani-project/xz/releases
I will check what's better and come back with more PR to use the best available crc function.

@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

it appears the Ubuntu (free-threading) / build and test (ubuntu-24.04-arm is flaky

@AA-Turner
Copy link
Member

I'll re-run it.

@morotti
Copy link
Contributor Author

morotti commented Apr 17, 2025

build passed, ready to merge

@AA-Turner
Copy link
Member

Do you want to include your man.com email in the commit? Currently GH lists it as a co-authored-by, but the commit itself will be under your GMail address.

@morotti
Copy link
Contributor Author

morotti commented Apr 18, 2025

Please don't mind the email.
My work and my personal emails can both show up depending how the commits and PRs are opened. It doesn't matter which one shows up. My organization signed the CLA for the cpython interpreter, I have permissions to contribute.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants