Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spml/ucx: shuffle EPs creation #12907

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

michal-shalev
Copy link

What?

This PR randomize the order in which endpoints (EPs) are created in the mca_spml_ucx_add_procs function. Each new EP is placed at a random position instead of cyclic order.

Why?

Recently, a customer raised concerns about incast behavior during SHMEM quiet operations, where many EPs communicating in a fixed order could lead to network congestion and performance degradation. They believe that randomizing the order of EPs could reduce the likelihood of incast collisions.
The customer is interested in testing a patch to address this.

How?

Used the Fisher-Yates shuffle algorithm to randomize indices and modified the loop to handle all indices, including 0, within a single pass.

Copy link

github-actions bot commented Nov 4, 2024

Hello! The Git Commit Checker CI bot found a few problems with this PR:

b734683: spml/ucx: shuffle EPs creation

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Copy link

github-actions bot commented Nov 4, 2024

Hello! The Git Commit Checker CI bot found a few problems with this PR:

4d562ac: spml/ucx: shuffle EPs creation

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Copy link

github-actions bot commented Nov 4, 2024

Hello! The Git Commit Checker CI bot found a few problems with this PR:

2345c1f: spml/ucx: shuffle EPs creation

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Copy link

github-actions bot commented Nov 4, 2024

Hello! The Git Commit Checker CI bot found a few problems with this PR:

489dbcf: spml/ucx: shuffle EPs creation

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Copy link

github-actions bot commented Nov 4, 2024

Hello! The Git Commit Checker CI bot found a few problems with this PR:

eb303bf: spml/ucx: shuffle EPs creation

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Copy link

github-actions bot commented Nov 4, 2024

Hello! The Git Commit Checker CI bot found a few problems with this PR:

25c6b86: spml/ucx: shuffle EPs creation

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Copy link

github-actions bot commented Nov 4, 2024

Hello! The Git Commit Checker CI bot found a few problems with this PR:

893a73e: spml/ucx: shuffle EPs creation

  • check_signed_off: does not contain a valid Signed-off-by line

Please fix these problems and, if necessary, force-push new commits back up to the PR branch. Thanks!

Signed-off-by: Michal Shalev <mshalev.nvidia.com>
Copy link
Contributor

@tvegas1 tvegas1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor comments

@@ -691,23 +691,40 @@ int mca_spml_ucx_add_procs(oshmem_group_t* group, size_t nprocs)
}
}

indices = malloc(nprocs * sizeof(int));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: sizeof(*indices)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

/* Get the EP connection requests for all the processes from modex */
for (n = 0; n < nprocs; ++n) {
i = (my_rank + n) % nprocs;
for (i = nprocs - 1; i >= 0; --i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe perform randomization as a separate function/loop

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is to iterate over the EPs once, and to save another iteration

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok for me, but i guess one extra loop on sequential memory on ep creation would not impact perf


err = ucp_ep_create(mca_spml_ucx_ctx_default.ucp_worker[0], &ep_params,
&mca_spml_ucx_ctx_default.ucp_peers[i].ucp_conn);
&mca_spml_ucx_ctx_default.ucp_peers[indices[i]].ucp_conn);
if (UCS_OK != err) {
SPML_UCX_ERROR("ucp_ep_create(proc=%zu/%zu) failed: %s", n, nprocs,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also update log line index (i and indices[i])

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Notice I'm using proc_index instead of i.
I changed n to proc_index, why indices[proc_index]? the loop still iterates until nproc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah right, already up-to-date. we could add indices[proc_index] to the log along with iteration since it is randomized.

oshmem/mca/spml/ucx/spml_ucx.c Show resolved Hide resolved
Signed-off-by: Michal Shalev <mshalev.nvidia.com>
Signed-off-by: Michal Shalev <mshalev.nvidia.com>
@@ -644,6 +643,8 @@ int mca_spml_ucx_add_procs(oshmem_group_t* group, size_t nprocs)
ucp_address_t **wk_local_addr;
unsigned int *wk_addr_len;
ucp_ep_params_t ep_params;
int *indices;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsigned int?

@@ -691,23 +692,40 @@ int mca_spml_ucx_add_procs(oshmem_group_t* group, size_t nprocs)
}
}

indices = malloc(nprocs * sizeof(*indices));
if (!indices) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems the convention is compare to NULL

indices[i] = i;
}

srand((unsigned int)time(NULL));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO can omit (unsigned int)

swap_index = rand() % (proc_index + 1);
temp_index = indices[proc_index];
indices[proc_index] = indices[swap_index];
indices[proc_index] = temp_index;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indices[swap_index] = temp_index;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed a fix for it

spml/ucx: fix swap
Signed-off-by: Michal Shalev <mshalev.nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants