Skip to content

Comments

Implement HyKKT Ruiz Scaling#317

Merged
shakedregev merged 34 commits intohykkt-devfrom
adham/hykkt-ruiz
Jul 25, 2025
Merged

Implement HyKKT Ruiz Scaling#317
shakedregev merged 34 commits intohykkt-devfrom
adham/hykkt-ruiz

Conversation

@adhamsi
Copy link
Collaborator

@adhamsi adhamsi commented Jun 23, 2025

Description

A module ruiz is added implementing the RuizScaler class to be used for HyKKT.

Closes #330.

Proposed changes

Implements RuizScaler and RuizScalerKernelImpl in CPU, HIP, and CUDA. Recreates the original test. RuizScaler methods take in matrix::Csr and vector::Vector object types to set the matrix and vector data and implements the scale method to perform the scaling in-place.

Checklist

  • All tests pass. Code tested on
    • CPU backend
    • CUDA backend
    • HIP backend
  • Code compiles cleanly with flags -Wall -Wpedantic -Wconversion -Wextra.
  • The new code follows Re::Solve style guidelines.
  • There are unit tests for the new code.
  • The new code is documented.
  • The feature branch is rebased with respect to the target branch.

Further comments

@adhamsi adhamsi marked this pull request as ready for review June 26, 2025 15:50
@adhamsi adhamsi force-pushed the adham/hykkt-ruiz branch from e98f39a to f8257d8 Compare June 26, 2025 20:08
Copy link
Collaborator

@shakedregev shakedregev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting a segfault for CUDA. Also, fix merge conflicts.

@shakedregev
Copy link
Collaborator

shakedregev commented Jun 27, 2025

Ruiz tests are working, but let's fix the perm tests that made it in here.

Copy link
Collaborator

@pelesh pelesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • It would be good to design more intuitive and easy to verify unit tests.
  • More extensive documentation is needed for the class methods.
  • Matrix/vector objects probably do not need to be unpacked except for passing input to GPU kernels.

@adhamsi adhamsi force-pushed the adham/hykkt-ruiz branch from fde1d6f to a8e4039 Compare June 30, 2025 14:05
@shakedregev shakedregev marked this pull request as draft July 2, 2025 20:22
@shakedregev
Copy link
Collaborator

Still need to fix the merge conflicts and compile with -D RESOLVE_USE_ASAN=ON to catch memory leaks. Then run the tests normally. If all of them pass, there's no leaks.

pelesh
pelesh previously requested changes Jul 2, 2025
Copy link
Collaborator

@pelesh pelesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding couple of comments related to my earlier review.

@adhamsi
Copy link
Collaborator Author

adhamsi commented Jul 3, 2025

Still need to fix the merge conflicts and compile with -D RESOLVE_USE_ASAN=ON to catch memory leaks. Then run the tests normally. If all of them pass, there's no leaks.

Leaks have now been fixed.

@adhamsi adhamsi marked this pull request as ready for review July 7, 2025 17:23
Copy link
Collaborator

@superwhiskers superwhiskers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just some minor suggestions

Comment on lines +40 to +42
vector::Vector* scaling_vector_; // Scaling vector data for the current iteration
vector::Vector* aggregate_scaling_vector_; // Cumulative scaling vector
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these two vectors don't seem like they need to be allocated on the heap

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it doesn't seem like this was resolved?

Copy link
Collaborator

@shakedregev shakedregev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long functions should have one argument on each line. Fix merge conflicts. Make sure only the first letter is capital in class names such as Cpu, Cuda, Hip.

@adhamsi
Copy link
Collaborator Author

adhamsi commented Jul 9, 2025

Long functions should have one argument on each line. Fix merge conflicts. Make sure only the first letter is capital in class names such as Cpu, Cuda, Hip.

Can we have the clang formatter handle the long functions?

@superwhiskers
Copy link
Collaborator

it can, but only if we specified a line length limit, which is something @pelesh didn't want, so you have to manually add the break and then format it again, which will preserve your choice.

@pelesh
Copy link
Collaborator

pelesh commented Jul 9, 2025

Long functions should have one argument on each line. Fix merge conflicts. Make sure only the first letter is capital in class names such as Cpu, Cuda, Hip.

Can we have the clang formatter handle the long functions?

If you break after first function argument clang-format will do the rest for you. In most cases, the editor will do the alignment for you.

We don't enforce line length because we want to have better control of how we code mathematical expressions for readability.

@adhamsi adhamsi force-pushed the adham/hykkt-ruiz branch from f83aeaf to 988ffec Compare July 10, 2025 12:50
@adhamsi adhamsi marked this pull request as draft July 10, 2025 13:50
@adhamsi adhamsi force-pushed the adham/hykkt-ruiz branch from 1d89d24 to a02f7a7 Compare July 11, 2025 13:43
@adhamsi adhamsi marked this pull request as ready for review July 11, 2025 13:45
@adhamsi adhamsi requested a review from shakedregev July 11, 2025 13:59
@adhamsi adhamsi force-pushed the adham/hykkt-ruiz branch from 10b6743 to 28024a8 Compare July 25, 2025 15:55
@shakedregev shakedregev dismissed stale reviews from superwhiskers, pelesh, and themself July 25, 2025 16:04

Stale

Copy link
Collaborator

@shakedregev shakedregev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validate the values in J^T

@shakedregev shakedregev merged commit eb042f2 into hykkt-dev Jul 25, 2025
9 of 11 checks passed
@shakedregev shakedregev deleted the adham/hykkt-ruiz branch July 25, 2025 19:32
superwhiskers added a commit that referenced this pull request Jul 27, 2025
shakedregev pushed a commit that referenced this pull request Jul 28, 2025
* address the left over comments in #317

* Apply pre-commmit fixes

---------

Co-authored-by: superwhiskers <[email protected]>
nkoukpaizan pushed a commit that referenced this pull request Sep 4, 2025
* Added the permutation class from hykkt, along with tests.

* HyKKT GPU Permutation Kernels (Attempt 2) (#333)

* Added the permutation class from hykkt, along with tests.

* refactor with base class for perm kernels

* add kernel handler to Permutation class to call appropriate functions

* add workspace parameter to test

* gpu implementations

* add PermutationHandler

* test for more rows than cols

* add documentation for permutation

* remove use of linalgworkspace and update permutation constructor

* mapIndex test

* update signatures

* memory handling and cosmetic changes

* all functions on host except mapIdx

* signature and fix leak

* Apply pre-commmit fixes

* remove unused variables

* remove unused files and members, fix build

* Apply pre-commmit fixes

* use resolve types

* Apply pre-commmit fixes

---------

Co-authored-by: shakedregev <[email protected]>
Co-authored-by: adhamsi <[email protected]>

* set devImpl to null if no GPU backend

* Added the permutation class from hykkt, along with tests.

* add ruiz scaler and handler

* refactor the interfaces and use existing MemoryHandler for reset

* update interface

* allocate/deallocate scaling vectors

* remove handler as middleman

* add empty cpu implementation and cmakelists

* fix cmakelists

* cpu implementation

* hip implementation

* use resolve matrix and vector types

* cuda implementation

* ruiz scaling test

* fix bugs in kernel implementations

* comments in implementation files

* fix imports and leak

* update comments

* Doxygen reference and rename class

* use object pointers and update all usage

* better ruiz test

* merge conflict

* remove unused MemoryHandler

* change long function names, file name capitalization, remove unused params

* include issue

* Apply pre-commmit fixes

* error handling in constructor if no gpu support available

* fix comment

* error handling in constructor

* reset workspace after each test

* Check for J_tr

---------

Co-authored-by: shakedregev <[email protected]>
Co-authored-by: adhamsi <[email protected]>
nkoukpaizan pushed a commit that referenced this pull request Sep 4, 2025
* address the left over comments in #317

* Apply pre-commmit fixes

---------

Co-authored-by: superwhiskers <[email protected]>
shakedregev added a commit that referenced this pull request Nov 13, 2025
* Added the permutation class from hykkt, along with tests.

* HyKKT GPU Permutation Kernels (Attempt 2) (#333)

* Added the permutation class from hykkt, along with tests.

* refactor with base class for perm kernels

* add kernel handler to Permutation class to call appropriate functions

* add workspace parameter to test

* gpu implementations

* add PermutationHandler

* test for more rows than cols

* add documentation for permutation

* remove use of linalgworkspace and update permutation constructor

* mapIndex test

* update signatures

* memory handling and cosmetic changes

* all functions on host except mapIdx

* signature and fix leak

* Apply pre-commmit fixes

* remove unused variables

* remove unused files and members, fix build

* Apply pre-commmit fixes

* use resolve types

* Apply pre-commmit fixes

---------

Co-authored-by: shakedregev <[email protected]>
Co-authored-by: adhamsi <[email protected]>

* set devImpl to null if no GPU backend

* Added the permutation class from hykkt, along with tests.

* add ruiz scaler and handler

* refactor the interfaces and use existing MemoryHandler for reset

* update interface

* allocate/deallocate scaling vectors

* remove handler as middleman

* add empty cpu implementation and cmakelists

* fix cmakelists

* cpu implementation

* hip implementation

* use resolve matrix and vector types

* cuda implementation

* ruiz scaling test

* fix bugs in kernel implementations

* comments in implementation files

* fix imports and leak

* update comments

* Doxygen reference and rename class

* use object pointers and update all usage

* better ruiz test

* merge conflict

* remove unused MemoryHandler

* change long function names, file name capitalization, remove unused params

* include issue

* Apply pre-commmit fixes

* error handling in constructor if no gpu support available

* fix comment

* error handling in constructor

* reset workspace after each test

* Check for J_tr

---------

Co-authored-by: shakedregev <[email protected]>
Co-authored-by: adhamsi <[email protected]>
shakedregev pushed a commit that referenced this pull request Nov 13, 2025
* address the left over comments in #317

* Apply pre-commmit fixes

---------

Co-authored-by: superwhiskers <[email protected]>
pelesh added a commit that referenced this pull request Nov 20, 2025
* Added the permutation class from hykkt, along with tests.

* Implement HyKKT Ruiz Scaling (#317)

* Implement the HyKKT Cholesky module (#350)

* SpGEMM for HyKKT (#366)

---------

Co-authored-by: superwhiskers <[email protected]>
Co-authored-by: Nicholson Koukpaizan <[email protected]>
Co-authored-by: Adham Ibrahim <[email protected]>
Co-authored-by: kswirydo <[email protected]>
Co-authored-by: shakedregev <[email protected]>
Co-authored-by: Slaven Peles <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants