[#2537] feat(spark): Introduce option to activate small cache in grpc server #2538
zuston merged 3 commits into apache:master
@rickyma Could you take a look at this PR?
          .booleanType()
          .defaultValue(false)
          .withDescription(
              "The option to control whether enable the pooled byte buf allocator small cache. This is only valid for spark driver side grpc server");
Option to control whether to enable the small cache in the pooled byte buffer allocator. This option is only applicable to the gRPC server on the Spark driver side.
Also, we need to describe this new config in docs.
  }

  private Server buildGrpcServer(int serverPort) {
    boolean isClientSmallCacheEnabled =
Maybe we could also put this config into RssBaseConf, since all the other configs are there?
In that case, we should rename it to isSmallCacheEnabled so that it can be used by both clients and servers.
Then we would need to update its description as well.
Sounds good. This option could be extended to cover the #1780 requirements.
PTAL @rickyma
Thanks @rickyma @jerqi. Merged.
What changes were proposed in this pull request?
Introduce a config option to activate the small cache in the gRPC server.
Why are the changes needed?
For #2537.
When partition reassignment is enabled in the production environment, we observed that some Spark jobs failed due to gRPC request timeouts (DEADLINE_EXCEEDED). Upon investigating the Spark driver logs, we found severe GC events, indicating significant memory pressure on the driver process.
Based on PR #1780, the small cache looks effective for gRPC mode.
This PR aims to enable the small cache by default, because GRPC_NETTY has become the default RPC mode.
Does this PR introduce any user-facing change?
Yes.
rss.rpc.netty.smallCacheEnabled=true
How was this patch tested?
Existing unit tests.
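For readers unfamiliar with how a flag like the one under "user-facing change" is usually consumed, here is a minimal sketch in plain Java. It is not the Uniffle implementation: the class `SmallCacheFlagSketch` and its method are hypothetical, and only the key name `rss.rpc.netty.smallCacheEnabled` comes from this PR. The fallback is passed in by the caller rather than hard-coded, so the sketch makes no claim about the actual default.

```java
import java.util.Properties;

// Hypothetical helper for illustration only; it is not part of Uniffle.
final class SmallCacheFlagSketch {
  // Key name taken from the PR description.
  static final String SMALL_CACHE_KEY = "rss.rpc.netty.smallCacheEnabled";

  // Returns the configured value, or the caller-supplied fallback when the key is absent.
  static boolean isSmallCacheEnabled(Properties conf, boolean fallback) {
    String raw = conf.getProperty(SMALL_CACHE_KEY);
    return raw == null ? fallback : Boolean.parseBoolean(raw.trim());
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    // Key absent: the fallback decides.
    System.out.println(isSmallCacheEnabled(conf, false));
    // Key present: the explicit setting wins over the fallback.
    conf.setProperty(SMALL_CACHE_KEY, "true");
    System.out.println(isSmallCacheEnabled(conf, false));
  }
}
```

A server-building method such as `buildGrpcServer` could then branch on the returned boolean when choosing its buffer allocator settings; that wiring is left out here because it depends on the Netty and gRPC versions in use.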