@Ray0907 Ray0907 commented Nov 2, 2025

Add OpenMP parallelization to compute_residuals function.
This addresses the TODO comment about parallelization.

Changes:

  • Add #pragma omp parallel for to residual computation loop
  • Use if(n > 1000) to avoid overhead for small batches
  • Each iteration is independent and thread-safe
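A minimal sketch of the pattern described in the bullets above, not the actual diff: the function name `compute_residuals_sketch`, its signature, the row-major layout, and the `idx_t` alias are assumptions made for illustration. It shows a residual loop (vector minus its assigned centroid) guarded by an OpenMP `if` clause, so batches of 1000 or fewer run serially.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using idx_t = int64_t; // assumption: a signed 64-bit index type

// Hypothetical stand-in for the real function: for each of the n input
// vectors (dimension d, row-major), subtract its assigned centroid.
void compute_residuals_sketch(
        idx_t n,
        size_t d,
        const float* x,
        const float* centroids,
        const idx_t* assign,
        float* residuals) {
    assert(n >= 0);
    // The if clause keeps small batches serial to avoid thread start-up cost.
    // Iteration i reads row i of x and writes only row i of residuals,
    // so iterations do not touch each other's output.
#pragma omp parallel for if (n > 1000)
    for (idx_t i = 0; i < n; i++) {
        const size_t row = static_cast<size_t>(i) * d;
        const float* xi = x + row;
        const float* c = centroids + static_cast<size_t>(assign[i]) * d;
        float* r = residuals + row;
        for (size_t j = 0; j < d; j++) {
            r[j] = xi[j] - c[j];
        }
    }
}
```

Built without OpenMP enabled, the pragma is ignored and the loop runs serially, so the result is the same either way.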

@meta-cla meta-cla bot added the CLA Signed label Nov 2, 2025

meta-codesync bot commented Nov 4, 2025

@limqiying has imported this pull request. If you are a Meta employee, you can view this in D86234690.

@mnorris11

Hi @Ray0907, did you try benchmarking it? Do you observe a speedup? If yes, can you paste the script you used here?

  MSVC requires a signed integral type for OpenMP loop variables.
  Changed the loop index from size_t to idx_t in compute_residuals().
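
To illustrate the signed-index requirement in isolation, here is a small sketch under stated assumptions: `scale_by_two` is a hypothetical helper, not code from this PR. When the element count arrives as `size_t`, one option is to convert it to a signed type once and loop over that, which keeps the loop acceptable to compilers limited to OpenMP 2.0, where the loop variable must be a signed integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not from the PR: doubles each element in place.
void scale_by_two(float* v, size_t n) {
    // Guard the conversion: the count must fit in a signed 64-bit integer.
    assert(n <= static_cast<size_t>(INT64_MAX));
    const int64_t n_signed = static_cast<int64_t>(n);
    // A size_t loop variable here is rejected by OpenMP 2.0 compilers;
    // a signed index is accepted by every OpenMP version.
#pragma omp parallel for
    for (int64_t i = 0; i < n_signed; i++) {
        v[i] *= 2.0f;
    }
}
```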