Skip to content

Add support for fully configurable partial-pass 3D kernels #572

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 53 commits into
base: develop
Choose a base branch
from

Conversation

eng-flavio-teixeira
Copy link
Contributor

@eng-flavio-teixeira eng-flavio-teixeira commented May 8, 2025

Add the ability to fully configure partial-pass 3D kernels in kernel-generator.py. This should remove the need for the remaining hard-coded 64x64x64 partial-pass parameters across the library.

The main changes include:

  • kernel-generator.py now has a list of CS_3D_PP kernels, where partial-pass parameters are fully configurable.
  • Stockham_gen and kernel-generator.py interaction mechanism via JSON needed to understand the concept of linked partial-pass kernels.
  • Function pool and function map key needed a new concept for the linked partial-pass kernels, which are now queryable in the pool.
  • New compute schemes specifically added for partial-pass (CS_3D_PP, CS_STOCKHAM_PP, and CS_STOCKHAM_PP_BLOCK_CC). These should simplify handling of nodes in the plan tree structure, and make them clearly distinct from CS_3D_RC nodes.

@eng-flavio-teixeira eng-flavio-teixeira changed the title Add support for fully configurable 3D partial-pass kernels Add support for fully configurable partial-pass 3D kernels May 28, 2025
@eng-flavio-teixeira eng-flavio-teixeira marked this pull request as ready for review May 28, 2025 20:55
@ROCmMathLibrariesBot
Copy link

ROCmMathLibrariesBot commented May 29, 2025

@@ -459,4 +476,82 @@ struct SimpleHash
}
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear from context what these new structs are for; would it be possible to add some code comments, please?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, new commit is in with the comments.

@Kevonosdiaz
Copy link
Contributor

Haven't looked much into this PR yet, but it might be a good time to update the default value of the variable at

set(ROCFFT_FUNCTION_POOL_N 8 CACHE STRING "Number of files to split function_pool into for compilation")

to 16 or higher if the function pool is growing, since it might improve build times a bit.

@eng-flavio-teixeira
Copy link
Contributor Author

Haven't looked much into this PR yet, but it might be a good time to update the default value of the variable at

set(ROCFFT_FUNCTION_POOL_N 8 CACHE STRING "Number of files to split function_pool into for compilation")

to 16 or higher if the function pool is growing, since it might improve build times a bit.

Did you see any noticeable improvement going from 8 to 16? This PR does not add more kernels to the function pool, but the next one will (but not an unreasonable amount).

@Kevonosdiaz
Copy link
Contributor

Did you see any noticeable improvement going from 8 to 16? This PR does not add more kernels to the function pool, but the next one will (but not an unreasonable amount).

It slowed down my build time by a couple seconds when going from 8 to 16, and building just the rocfft-test target with ninja, and without ccache. When there are more kernels added then maybe it'll be worth checking again.

@eng-flavio-teixeira
Copy link
Contributor Author

Did you see any noticeable improvement going from 8 to 16? This PR does not add more kernels to the function pool, but the next one will (but not an unreasonable amount).

It slowed down my build time by a couple seconds when going from 8 to 16, and building just the rocfft-test target with ninja, and without ccache. When there are more kernels added then maybe it'll be worth checking again.

Sounds good, I will keep that in mind for the next PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants