Low bandwidth with 32 EFA NICs #10658

Answered by abcdabcd987
abcdabcd987 asked this question in Q&A
Ah, finally able to reach close to full bandwidth! Thanks for the help, @shijin-aws!

Perhaps the most important things are:

  1. Use 1 CPU per GPU (4 NICs).
  2. Pin each thread to a CPU core! This gives the most significant boost.
  3. Send one warmup message before starting the benchmark. Connection establishment is lazy and takes time, so without a warmup that cost lands inside the measurement. This gives another significant boost.
  4. Interleave op submission and CQ polling.
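To make points 2 to 4 concrete, here is a minimal Python sketch. It is not the code from the gist and does not call libfabric: `FakeEndpoint`, `pin_to_core`, and `run_benchmark` are hypothetical names, and the fake endpoint completes sends as soon as it is polled. The shape of the loop is the point: pin first, warm up outside the timed region, then alternate between submitting and polling instead of submitting everything up front.

```python
import os
from collections import deque


def pin_to_core(core: int) -> bool:
    """Pin the caller to a single CPU core (Linux only). Returns True on success."""
    if not hasattr(os, "sched_setaffinity"):
        return False  # platform has no affinity API
    try:
        if core not in os.sched_getaffinity(0):
            return False  # core is not available to this process
        os.sched_setaffinity(0, {core})
        return True
    except OSError:
        return False


class FakeEndpoint:
    """Hypothetical stand-in for a NIC endpoint with a bounded send queue."""

    def __init__(self, queue_depth: int):
        self.queue_depth = queue_depth
        self.inflight = deque()

    def try_send(self, msg_id: int) -> bool:
        # A full queue means "try again later"; the caller should go poll.
        if len(self.inflight) >= self.queue_depth:
            return False
        self.inflight.append(msg_id)
        return True

    def poll_cq(self, max_entries: int) -> list:
        # Drain up to max_entries completions without blocking.
        done = []
        while self.inflight and len(done) < max_entries:
            done.append(self.inflight.popleft())
        return done


def run_benchmark(ep: FakeEndpoint, total: int) -> tuple:
    # Warmup: one message, fully completed, before any measurement starts.
    assert ep.try_send(-1)
    while not ep.poll_cq(1):
        pass

    submitted = completed = 0
    while completed < total:
        # Submit until the queue is full or everything has been posted...
        while submitted < total and ep.try_send(submitted):
            submitted += 1
        # ...then immediately drain completions to free queue slots.
        completed += len(ep.poll_cq(16))
    return submitted, completed


if __name__ == "__main__":
    pin_to_core(0)
    print(run_benchmark(FakeEndpoint(queue_depth=8), total=100))
```

In a real benchmark the timer would start after the warmup loop, and each worker thread would pin itself before touching its endpoints.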

Updated code: https://gist.github.com/abcdabcd987/ad02c376b60acedbca8a1f7c635fbf7f

Performance Numbers (64KiB message size):

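| GPUs | NICs | Bandwidth | Util | Packet/s |
|------|------|-----------|------|----------|
| 1 | 1 | 97.821 Gbps | 97.8% | 0.187 Mpps |
| 2 | 2 | 195.565 Gbps | 97.8% | 0.373 Mpps |
| 4 | 4 | 390.984 Gbps | 97.7% | 0.746 Mpps |
| 8 | 8 | 782.438 Gbps | 97.8% | 1.… |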
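The utilization and packet-rate columns can be cross-checked from the bandwidth column. The sketch below assumes a 100 Gbps line rate per NIC (inferred from the reported utilization, not stated in the post) and treats one 64 KiB message as one "packet"; under those assumptions it reproduces the three complete rows.

```python
MESSAGE_BYTES = 64 * 1024      # 64 KiB message size, from the post
NIC_LINE_RATE_GBPS = 100.0     # assumption: per-NIC line rate


def utilization(bandwidth_gbps: float, nics: int) -> float:
    """Fraction of aggregate line rate actually achieved."""
    return bandwidth_gbps / (nics * NIC_LINE_RATE_GBPS)


def message_rate_mpps(bandwidth_gbps: float) -> float:
    """Messages per second, in millions, at the given bandwidth."""
    bits_per_message = MESSAGE_BYTES * 8
    return bandwidth_gbps * 1e9 / bits_per_message / 1e6


if __name__ == "__main__":
    for nics, gbps in [(1, 97.821), (2, 195.565), (4, 390.984)]:
        print(f"{nics} NICs: {utilization(gbps, nics):.1%}, "
              f"{message_rate_mpps(gbps):.3f} Mpps")
```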

Answer selected by abcdabcd987