[#2506] feat(spark3): Introduce option to enable reorder multi servers for reader #2507

zuston · 2025-06-18T04:07:51Z

What changes were proposed in this pull request?

Introduce an option that enables reordering of the multiple shuffle servers used by the reader.

Why are the changes needed?

If partition splitting is enabled, large partitions will be distributed across multiple shuffle servers. With the help of Spark AQE (Adaptive Query Execution), these large partitions will be processed by multiple tasks.

In this case, all split tasks may read from the same set of shuffle servers in the same order, which can put high concurrent RPC pressure on specific servers.

This PR introduces the ability to randomly reorder the underlying shuffle servers to achieve better load balancing during reading.
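A minimal sketch of the idea, not the code in this patch: each split task shuffles its own copy of the server list before reading, so concurrent tasks are unlikely to start on the same server. The class name `ServerReorderSketch`, the method `readOrder`, and the server names are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class ServerReorderSketch {
  // Hypothetical helper: returns the order in which one reader task visits the servers.
  // With reordering disabled, the assigned order is kept unchanged.
  static List<String> readOrder(List<String> servers, boolean reorderEnabled, Random rng) {
    List<String> order = new ArrayList<>(servers); // copy, so the shared assignment is untouched
    if (reorderEnabled && order.size() > 1) {
      Collections.shuffle(order, rng);
    }
    return order;
  }

  public static void main(String[] args) {
    // Hypothetical servers holding the pieces of one split partition.
    List<String> servers = Arrays.asList("server-1", "server-2", "server-3");
    // Each split task uses its own random source, so the orders usually differ.
    for (int task = 0; task < 4; task++) {
      System.out.println("task " + task + " -> " + readOrder(servers, true, new Random()));
    }
  }
}
```

Shuffling a copy keeps the original assignment intact for any other code that relies on its order.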

Does this PR introduce any user-facing change?

Yes.

rss.client.read.reorderMultiServersEnable=false
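As a hedged illustration of how a client might interpret this option, the sketch below reads the key from a plain map and treats a missing key as `false`. The default of `false` is an assumption based on the value shown above, and `ReorderOptionSketch` and `isReorderEnabled` are hypothetical names, not part of the Uniffle API.

```java
import java.util.HashMap;
import java.util.Map;

public class ReorderOptionSketch {
  // Key exactly as it appears in this PR description.
  static final String KEY = "rss.client.read.reorderMultiServersEnable";

  // Hypothetical lookup: a missing key is treated as false (assumed default).
  static boolean isReorderEnabled(Map<String, String> conf) {
    return Boolean.parseBoolean(conf.getOrDefault(KEY, "false"));
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    System.out.println("unset: " + isReorderEnabled(conf));
    conf.put(KEY, "true"); // opt in to random reordering
    System.out.println("set to true: " + isReorderEnabled(conf));
  }
}
```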

How was this patch tested?

No additional tests were needed.


github-actions · 2025-06-18T04:34:45Z

Test Results

Files: 3 049 (±0) · Suites: 3 049 (±0) · Duration: 6h 48m 33s ⏱️ (+53s)
Tests: 1 186 (±0) · 1 185 ✅ passed (±0) · 1 💤 skipped (±0) · 0 ❌ failed (±0)
Runs: 15 042 (±0) · 15 027 ✅ passed (±0) · 15 💤 skipped (±0) · 0 ❌ failed (±0)

Results for commit a5f48f9. ± Comparison against base commit 22be628.


zuston requested a review from jerqi June 18, 2025 04:08

jerqi approved these changes Jun 18, 2025


zuston merged commit a69930d into apache:master Jun 18, 2025
41 checks passed

zuston deleted the readRandom branch June 18, 2025 09:19

zuston mentioned this pull request Jun 18, 2025

[FEATURE] Random multi shuffle servers for reader #2506

Closed

3 tasks
