Improve hardware feature detection & consolidate duplicated settings in CMake files #824

Open
wants to merge 13 commits into
base: master

mhucka
Collaborator

@mhucka mhucka commented Jun 29, 2025

Changes:

  • Previously, the CMake configuration always built the AVX2, AVX512 & SSE2 Pybind11 modules without testing whether the current system supported those instruction sets. The changes here use CMake features to detect which of those architectural features are in fact available on the host computer, and only attempt to build the corresponding modules.

  • Previously, each of 8 or so pybind_interface/ subdirectories had CMakeLists.txt files that contained the same text for setting certain compilation flags. This PR removes the duplication in favor of putting the settings into the top-level CMake file.

  • Finally, there is a minor bit of refactoring of the top-level CMakeLists.txt file to group some of the code more logically, and to add some more informational message printouts.
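For illustration, here is a minimal sketch of one way host feature detection can be done in CMake. It is not necessarily the approach taken in this PR. The variable names `HOST_HAS_SSE2` and `HOST_HAS_AVX2`, the project name, and the subdirectory path in the comment are assumptions made for the example, and the `-mavx2` flag assumes GCC or Clang.

```cmake
cmake_minimum_required(VERSION 3.16)
project(simd_detect_sketch LANGUAGES CXX)

include(CheckCXXSourceRuns)

# SSE2: CMake can ask the host CPU directly (result is 1 or 0).
cmake_host_system_information(RESULT HOST_HAS_SSE2 QUERY HAS_SSE2)

# AVX2: there is no built-in query key, so compile AND run a small probe.
# If the host CPU lacks AVX2, the probe crashes and the check reports failure.
set(CMAKE_REQUIRED_FLAGS "-mavx2")  # assumption: GCC/Clang flag syntax
check_cxx_source_runs([[
  #include <immintrin.h>
  int main() {
    volatile int one = 1;  // volatile input discourages constant folding
    __m256i a = _mm256_set1_epi32(one);
    __m256i b = _mm256_add_epi32(a, a);  // AVX2 integer add
    return _mm256_extract_epi32(b, 0) == 2 ? 0 : 1;
  }
]] HOST_HAS_AVX2)
unset(CMAKE_REQUIRED_FLAGS)

if(HOST_HAS_SSE2)
  message(STATUS "SSE2 detected on host")
endif()

if(HOST_HAS_AVX2)
  message(STATUS "AVX2 detected on host")
  # A real build would add the matching module here, for example:
  # add_subdirectory(pybind_interface/avx2)  # hypothetical path
else()
  message(STATUS "AVX2 not available; skipping the AVX2 module")
endif()
```

Running the probe, rather than only compiling it, matters because a compiler will usually accept `-mavx2` even when the CPU it runs on cannot execute the resulting instructions.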

mhucka added 7 commits June 29, 2025 22:52
Previously, the CMake configuration was such that it would always
build the AVX2, AVX512 & SSE2 Pybind11 modules without testing whether
the current system supported those options. The changes here use CMake
features to detect whether the architectural features are in fact
available, and only attempt to build the appropriate modules.

In addition, there is a minor bit of refactoring to group some of the
code more logically, and to add some more informational message
printouts.

Each of 8 or so `pybind_interface/` subdirectories had
`CMakeLists.txt` files that contained the same text for setting
certain compilation flags. This commit removes the duplication in
favor of putting the settings into the top-level CMake file.
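
The consolidation described above might look roughly like the following top-level fragment. This is a sketch under assumptions: the specific flags, the C++ standard, and the subdirectory names are placeholders, not the settings this commit actually moves.

```cmake
# Top-level CMakeLists.txt (fragment; flags and paths are placeholders)
cmake_minimum_required(VERSION 3.16)
project(consolidated_flags_sketch LANGUAGES CXX)

# Settings that used to be repeated in every subdirectory are set once here.
set(CMAKE_CXX_STANDARD 17)
set(CMAKE_CXX_STANDARD_REQUIRED ON)

if(MSVC)
  add_compile_options(/O2)
else()
  add_compile_options(-O3)
endif()

# Directory-scoped options are inherited by subdirectories added afterwards,
# so each subdirectory only needs to declare its own targets.
foreach(module IN ITEMS basic sse)  # hypothetical module names
  add_subdirectory(pybind_interface/${module})
endforeach()
```

Because `add_compile_options` applies to targets in the current directory and below, the call has to come before the `add_subdirectory` calls for the subdirectories to pick it up.
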
On Ubuntu, one sees warnings like this:

```
lto-wrapper: warning: using serial compilation of 13 LTRANS jobs
lto-wrapper: note: see the ‘-flto’ option documentation for more information
lto-wrapper: warning: using serial compilation of 15 LTRANS jobs
lto-wrapper: note: see the ‘-flto’ option documentation for more information
lto-wrapper: warning: using serial compilation of 16 LTRANS jobs
lto-wrapper: note: see the ‘-flto’ option documentation for more information
lto-wrapper: warning: using serial compilation of 16 LTRANS jobs
lto-wrapper: note: see the ‘-flto’ option documentation for more information
```

This seems to be the default behavior when the option `-flto` is not
given a value (see https://stackoverflow.com/a/72222512/28972686).
Giving it a value of "auto" makes the warning go away and lets the
compilation toolchain decide how much parallelism it can use.
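
As a hedged sketch, `-flto=auto` could be applied from CMake along these lines. The result variable `COMPILER_SUPPORTS_FLTO_AUTO` is a name chosen for this example, and the commit may wire the flag in differently (for instance through per-target options).

```cmake
cmake_minimum_required(VERSION 3.16)
project(lto_auto_sketch LANGUAGES CXX)

include(CheckCXXCompilerFlag)

# Only use the flag if the active compiler accepts it.
check_cxx_compiler_flag("-flto=auto" COMPILER_SUPPORTS_FLTO_AUTO)

if(COMPILER_SUPPORTS_FLTO_AUTO)
  # LTO needs the flag at both compile time and link time; the
  # lto-wrapper warnings shown above are emitted during the link step.
  add_compile_options(-flto=auto)
  add_link_options(-flto=auto)
else()
  message(STATUS "-flto=auto not supported by this compiler; leaving LTO settings unchanged")
endif()
```
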
@mhucka mhucka changed the title from "Improve hardware feature detection & conslidate duplicated settings in CMake files" to "Improve hardware feature detection & consolidate duplicated settings in CMake files" Jun 30, 2025
mhucka added 2 commits June 30, 2025 03:47
This can enable additional optimizations without requiring specific
AVX, SSE, or similar instructions.
@mhucka mhucka force-pushed the mh-consolidate-cmake-configs branch from 1b753ba to 3941a3f Compare June 30, 2025 03:51
The common wisdom is wrong, apparently: `-march=native` does not work
for AVX on macOS.
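
One possible way to act on this observation is sketched below. It is not necessarily what this commit does. It assumes the goal is to use `-march=native` only where the compiler accepts it and the platform is not macOS, and the variable name `COMPILER_SUPPORTS_MARCH_NATIVE` is made up for the example.

```cmake
cmake_minimum_required(VERSION 3.16)
project(march_native_sketch LANGUAGES CXX)

include(CheckCXXCompilerFlag)

check_cxx_compiler_flag("-march=native" COMPILER_SUPPORTS_MARCH_NATIVE)

# APPLE is a built-in CMake variable that is true on Apple platforms.
if(COMPILER_SUPPORTS_MARCH_NATIVE AND NOT APPLE)
  add_compile_options(-march=native)
else()
  # Assumption: SIMD modules get explicit flags (e.g. -mavx2) elsewhere.
  message(STATUS "Not using -march=native; relying on explicit per-module SIMD flags")
endif()
```
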
@mhucka mhucka force-pushed the mh-consolidate-cmake-configs branch from 1756d01 to 20c9dd8 Compare July 1, 2025 03:37
@mhucka mhucka self-assigned this Jul 1, 2025