Implement HyKKT Ruiz Scaling #317

Draft: wants to merge 21 commits into base branch hykkt-dev

Conversation

Collaborator

@adhamsi adhamsi commented Jun 23, 2025

Description

Adds a ruiz module that implements the RuizScaler class for use in HyKKT.

Closes #330.

Proposed changes

Implements RuizScaler and RuizScalerKernelImpl for the CPU, HIP, and CUDA backends, and recreates the original test. RuizScaler methods take matrix::Csr and vector::Vector objects to set the matrix and vector data, and the scale method performs the scaling in place.
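
For readers unfamiliar with the technique, here is a minimal CPU sketch of symmetric Ruiz scaling on a CSR matrix. It is the generic textbook form, not the code in this PR: the CsrSketch struct and the ruizIteration and ruizScale functions are hypothetical names, and the HyKKT version, which works on separate Hessian and Jacobian blocks, may differ in detail. Each pass forms D with D_ii = 1 / sqrt(max_j |a_ij|) and replaces A by D * A * D; the right-hand side is scaled by the accumulated D, so the original solution is recovered as x = D * y.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical minimal CSR container; not the Re::Solve matrix::Csr class.
struct CsrSketch
{
  std::size_t         n;       // number of rows (matrix assumed square, symmetric, fully stored)
  std::vector<int>    row_ptr; // size n + 1
  std::vector<int>    col_idx; // size nnz
  std::vector<double> values;  // size nnz, modified in place
};

// One pass of symmetric Ruiz scaling: A <- D * A * D with
// D_ii = 1 / sqrt(max_j |a_ij|). The accumulated scaling is kept in `total`.
void ruizIteration(CsrSketch& A, std::vector<double>& total)
{
  assert(A.row_ptr.size() == A.n + 1);
  assert(total.size() == A.n);

  // Compute all scaling factors from the unmodified matrix first.
  std::vector<double> d(A.n, 1.0);
  for (std::size_t i = 0; i < A.n; ++i)
  {
    double row_max = 0.0;
    for (int k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
      row_max = std::max(row_max, std::fabs(A.values[k]));
    if (row_max > 0.0) // leave empty rows unscaled
      d[i] = 1.0 / std::sqrt(row_max);
  }

  // Apply the row and column scaling to every stored entry.
  for (std::size_t i = 0; i < A.n; ++i)
  {
    for (int k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
      A.values[k] *= d[i] * d[A.col_idx[k]];
    total[i] *= d[i];
  }
}

// Repeats the pass a fixed number of times, then scales rhs by the
// accumulated diagonal. Both A and rhs are modified in place.
void ruizScale(CsrSketch& A, std::vector<double>& rhs, int iterations)
{
  assert(rhs.size() == A.n);
  std::vector<double> total(A.n, 1.0);
  for (int it = 0; it < iterations; ++it)
    ruizIteration(A, total);
  for (std::size_t i = 0; i < A.n; ++i)
    rhs[i] *= total[i];
}
```

For the 2x2 matrix [[4, 1], [1, 9]], a single pass uses d = (1/2, 1/3) and yields a unit diagonal with off-diagonal entries of 1/6.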

Checklist

  • All tests pass. Code tested on
    • CPU backend
    • CUDA backend
    • HIP backend
  • Code compiles cleanly with flags -Wall -Wpedantic -Wconversion -Wextra.
  • The new code follows Re::Solve style guidelines.
  • There are unit tests for the new code.
  • The new code is documented.
  • The feature branch is rebased with respect to the target branch.

Further comments

@adhamsi adhamsi marked this pull request as ready for review June 26, 2025 15:50
@adhamsi adhamsi force-pushed the adham/hykkt-ruiz branch from e98f39a to f8257d8 Compare June 26, 2025 20:08
Collaborator

@shakedregev shakedregev left a comment

This is getting a segfault for CUDA. Also, fix merge conflicts.

Collaborator

shakedregev commented Jun 27, 2025

Ruiz tests are working, but let's fix the perm tests that made it in here.

Collaborator

@pelesh pelesh left a comment

  • It would be good to design more intuitive and easy to verify unit tests.
  • More extensive documentation is needed for the class methods.
  • Matrix/vector objects probably do not need to be unpacked except for passing input to GPU kernels.

Comment on lines 133 to 134
void RuizScalingKernelsCUDA::adaptDiagScale(index_type n_hes, index_type n_total,
                                            index_type* hes_i, index_type* hes_j, real_type* hes_v,
                                            index_type* jac_i, index_type* jac_j, real_type* jac_v,
                                            index_type* jac_tr_i, index_type* jac_tr_j, real_type* jac_tr_v,
                                            real_type* rhs1, real_type* rhs2,
                                            real_type* aggregate_scaling_vector, real_type* scaling_vector)
{
Collaborator

Arguments of this function should be Re::Solve vectors and matrices. I believe you need to unpack them into raw data arrays only before sending them to GPU kernels.

Collaborator Author

This would mean at the beginning of the wrappers there would be something like

const index_type* hes_i = hes->getRowData();
const index_type* hes_j = hes->getColData();
const real_type* hes_v = hes->getValues();
const index_type* jac_i = jac->getRowData();
const index_type* jac_j = jac->getColData();
const real_type* jac_v = jac->getValues();
const index_type* jac_tr_i = jac_tr->getRowData();
const index_type* jac_tr_j = jac_tr->getColData();
const real_type* jac_tr_v = jac_tr->getValues();

so it would be verbose to repeat this in every kernel implementation. Currently, the data is unpacked once, as soon as it is passed to the top-level RuizScaler object. With the suggested approach, the unpacking would be repeated on every scaling iteration.

Collaborator

Yes, the raw pointers should be accessed only in the kernel call. So calling the kernel should look something like this:

void RuizScalingKernelsHIP::adaptDiagScale(matrix::Sparse* hes,
                                           // more arguments ...
                                           )
{
  // Round the block count up so that num_blocks * block_size >= n_total.
  int block_size = 256;
  int num_blocks = (n_total + block_size - 1) / block_size;
  // Raw pointers are extracted only here, in the kernel launch itself.
  kernels::adaptDiagScale<<<num_blocks, block_size>>>(hes->getNumRows(),
                                                      hes->getRowData(),
                                                      hes->getColData(),
                                                      // more arguments ...
                                                      );
}

Ideally, matrix/vector objects would be passed into the kernel itself, but CUDA/HIP kernels support only fundamental types and pointers to them.
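
The pattern suggested above can be sketched in plain C++, with a host loop standing in for the GPU kernel. Everything here is illustrative: SparseSketch is a hypothetical stand-in whose accessor names mirror those quoted in this thread, and scaleRows is an invented example operation, not a function from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sparse matrix class; not the Re::Solve implementation.
class SparseSketch
{
public:
  SparseSketch(std::vector<int> row_ptr, std::vector<int> col_idx, std::vector<double> values)
    : row_ptr_(std::move(row_ptr)), col_idx_(std::move(col_idx)), values_(std::move(values))
  {
    assert(!row_ptr_.empty());
  }

  // Accessors only hand back stored sizes and pointers; nothing is copied.
  int     getNumRows() const { return static_cast<int>(row_ptr_.size()) - 1; }
  int*    getRowData() { return row_ptr_.data(); }
  int*    getColData() { return col_idx_.data(); }
  double* getValues() { return values_.data(); }

private:
  std::vector<int>    row_ptr_;
  std::vector<int>    col_idx_;
  std::vector<double> values_;
};

namespace kernels
{
  // Host loop standing in for a GPU kernel: it sees only sizes and raw pointers.
  void scaleRows(int n, const int* row_ptr, double* values, const double* d)
  {
    for (int i = 0; i < n; ++i)
      for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
        values[k] *= d[i];
  }
} // namespace kernels

// The wrapper takes objects; raw pointers appear only in the call expression.
void scaleRows(SparseSketch* A, std::vector<double>* d)
{
  assert(static_cast<std::size_t>(A->getNumRows()) == d->size());
  kernels::scaleRows(A->getNumRows(), A->getRowData(), A->getValues(), d->data());
}
```

In this sketch the accessors return stored pointers, so calling them on every scaling iteration costs very little. Whether the same holds for the real matrix and vector accessors is an assumption that would need checking.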

Comment on lines +89 to +94
if (fabs(H->getValues(memory::HOST)[n / 2 - 1] - 0.062378167641326) > tol)
{
  test_passed = false;
  std::cout << "Test failed: H[n/2-1][n/2-1] = " << H->getValues(memory::HOST)[n / 2 - 1]
            << ", expected " << 0.062378167641326 << "\n";
}
Collaborator

Tests like this are typically fragile. Consider designing tests that are more intuitive and easier to verify.

This test essentially checks whether the behavior of Ruiz scaling has changed. It says little about whether Ruiz scaling is correct.
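
One way to make such a test easier to verify is to check properties of the result rather than a hard-coded value. The sketch below shows two candidate checks, under the assumption of a symmetric matrix stored in full CSR form: that every row's infinity norm is close to 1 after scaling, and that the scaled values equal d_i * a_ij * d_j for the reported scaling vector. The function names are hypothetical, and the checks would need adapting to the block structure used in HyKKT.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest deviation of any row's infinity norm from 1 for a CSR matrix.
double maxRowNormDeviation(const std::vector<int>& row_ptr, const std::vector<double>& values)
{
  assert(!row_ptr.empty());
  double worst = 0.0;
  for (std::size_t i = 0; i + 1 < row_ptr.size(); ++i)
  {
    double row_max = 0.0;
    for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
      row_max = std::max(row_max, std::fabs(values[k]));
    worst = std::max(worst, std::fabs(row_max - 1.0));
  }
  return worst;
}

// Returns true if scaled[k] matches d[i] * original[k] * d[j] for every
// stored entry (i, j), i.e. the scaled matrix equals D * A * D.
bool isDiagonalScalingOf(const std::vector<int>&    row_ptr,
                         const std::vector<int>&    col_idx,
                         const std::vector<double>& original,
                         const std::vector<double>& scaled,
                         const std::vector<double>& d,
                         double                     tol)
{
  assert(original.size() == scaled.size());
  for (std::size_t i = 0; i + 1 < row_ptr.size(); ++i)
    for (int k = row_ptr[i]; k < row_ptr[i + 1]; ++k)
      if (std::fabs(scaled[k] - d[i] * original[k] * d[col_idx[k]]) > tol)
        return false;
  return true;
}
```

With a fixed number of iterations the row norms are only approximately 1, so a test could require that the deviation falls below a chosen tolerance, or simply that it is smaller than for the unscaled matrix.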

Comment on lines +40 to +42

TestOutcome ruizTest()
{
Collaborator

It would be helpful to document here what is being tested and how.

@adhamsi adhamsi force-pushed the adham/hykkt-ruiz branch from fde1d6f to a8e4039 Compare June 30, 2025 14:05
@shakedregev shakedregev marked this pull request as draft July 2, 2025 20:22
@shakedregev
Collaborator

Still need to fix the merge conflicts and compile with -D RESOLVE_USE_ASAN=ON to catch memory leaks. Then run the tests normally. If all of them pass, there are no leaks.

Collaborator

@pelesh pelesh left a comment

Adding a couple of comments related to my earlier review.

Collaborator Author

adhamsi commented Jul 3, 2025

Still need to fix the merge conflicts and compile with -D RESOLVE_USE_ASAN=ON to catch memory leaks. Then run the tests normally. If all of them pass, there are no leaks.

Leaks have now been fixed.
