Skip to content

Conversation

@viclafargue
Copy link
Contributor

@viclafargue viclafargue commented Nov 8, 2024

Introduces the raft::device_resources_snmg type to hold all resources required for the NCCL clique.

Answers #2459
Removed call to raft::comms::build_comms_nccl_only (#2465)

@codecov-commenter
Copy link

codecov-commenter commented Nov 8, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.16%. Comparing base (d7e68f5) to head (9780ca5).

Additional details and impacted files
@@              Coverage Diff              @@
##           branch-25.02    #2487   +/-   ##
=============================================
  Coverage         81.16%   81.16%           
=============================================
  Files                19       19           
  Lines               515      515           
=============================================
  Hits                418      418           
  Misses               97       97           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@viclafargue viclafargue changed the title NCCL clique initialization function Introduction of the raft::device_resources_snmg type Nov 15, 2024
Copy link
Member

@cjnolet cjnolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took another look and I think the implementation itself is great. The only suggestions I have are related to the docs. I understand this is a lower-level detail for most users but we should strive to provide as much info and details as possible for both end-users AND for future implementors of SNMG algorithms.

@cjnolet cjnolet changed the base branch from branch-24.12 to branch-25.02 December 12, 2024 00:22
@cjnolet
Copy link
Member

cjnolet commented Jan 16, 2025

/merge

@rapids-bot rapids-bot bot merged commit fb6bfe6 into branch-25.02 Jan 16, 2025
71 checks passed
@bdice
Copy link
Contributor

bdice commented Jan 17, 2025

@viclafargue @cjnolet This nccl_clique.hpp header removal appears to have broken cuVS builds: https://github.com/rapidsai/cuvs/actions/runs/12819602008/job/35747622539?pr=567#step:8:991

I don't see any cuVS PRs with a fix yet.

@cjnolet
Copy link
Member

cjnolet commented Jan 17, 2025

@bdice this is the pr with the fix, I'm just trying to get it through CI: rapidsai/cuvs#454

cjnolet added a commit to cjnolet/raft that referenced this pull request Jan 17, 2025
rapids-bot bot pushed a commit that referenced this pull request Jan 17, 2025
…" (#2543)

This reverts commit fb6bfe6.

Authors:
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: #2543
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cpp improvement Improvement / enhancement to an existing function non-breaking Non-breaking change

Development

Successfully merging this pull request may close these issues.

5 participants