-
Notifications
You must be signed in to change notification settings - Fork 161
[#1727] feat(server): Introduce local allocation buffer to store blocks in memory #2492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test Results 3 049 files + 30 3 049 suites +30 6h 47m 55s ⏱️ -21s Results for commit 497c524. ± Comparison against base commit b45e986. This pull request removes 4 and adds 12 tests. Note that renamed tests count towards both.
♻️ This comment has been updated with latest results. |
cc @frankliee |
@rickyma Could you help me review this pull request? |
server/src/main/java/org/apache/uniffle/server/ShuffleServerConf.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/apache/uniffle/server/buffer/AbstractShuffleBuffer.java
Outdated
Show resolved
Hide resolved
@@ -226,9 +238,9 @@ public StatusCode registerBuffer( | |||
ShuffleServerMetrics.gaugeTotalPartitionNum.inc(); | |||
ShuffleBuffer shuffleBuffer; | |||
if (shuffleBufferType == ShuffleBufferType.SKIP_LIST) { | |||
shuffleBuffer = new ShuffleBufferWithSkipList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can add the class SkipListLabShuffleBuffer
and LinkedListLabShuffleBuffer
, and add ShuffleBufferFactory
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok for me
Do you mean to do it like below? I tried this solution and found that many methods need to be implemented. And the code looks inelegant.@jerqi |
The second way is ok for me, too. You can extract an interface |
Good to see this PR for the performance improvement. I will take a look in the later of this day |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good for this abstrction
common/src/main/java/org/apache/uniffle/common/util/ByteBufUtils.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, cc @frankliee , Could you take a look? I will wait util tomorrow.
I am quite busy these days. But I will take a look at this later today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this, the comparison result is promising.
Left some minor comments, others lgtm.
|
||
abstract void allocateDataBuffer(); | ||
|
||
public int alloc(int size) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the implementation and the method name doesn't seem aligned? Could you add some java doc and/or change the method name here?
server/src/main/java/org/apache/uniffle/server/buffer/lab/ChunkCreator.java
Outdated
Show resolved
Hide resolved
private Chunk createChunk(boolean pool, int size) { | ||
Chunk chunk; | ||
int id = chunkID.getAndIncrement(); | ||
assert id > 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seem redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so .. chunkID should always greater than zero?
Even if it's possible to overflow, we should use Preconditions.check instead of assert, which will be no-op if assertion if not enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible to overflow, let's use Preconditions.check.
server/src/main/java/org/apache/uniffle/server/buffer/lab/ChunkCreator.java
Outdated
Show resolved
Hide resolved
return dataLength == that.dataLength | ||
&& crc == that.crc | ||
&& blockId == that.blockId | ||
&& data.equals(that.data); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change? I think at least we should keep dataLength and crc check?
common/src/main/java/org/apache/uniffle/common/ShufflePartitionedBlock.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/apache/uniffle/server/ShuffleServerConf.java
Outdated
Show resolved
Hide resolved
server/src/main/java/org/apache/uniffle/server/buffer/ShuffleBufferManager.java
Outdated
Show resolved
Hide resolved
private Chunk currChunk; | ||
|
||
List<Integer> chunks = new LinkedList<>(); | ||
private final int maxAlloc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this field name is a bit of vague, could this be called as capacity or something similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It means when a block's size is too big (bigger than maxAlloc), it will not be allocated on LAB.
Gentle ping @advancedxy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One minor comment, others lgtm.
…apacityRatio (#2554) ### What changes were proposed in this pull request? Change the default value of chunkPoolCapacityRatio ### Why are the changes needed? The proportion and the frequency of small blocks is not high. If this value is set too high, it may cause off-heap memory overflow. Fix: #2492 ### Does this PR introduce any user-facing change? No. ### How was this patch tested? Verify in production environment.
What changes were proposed in this pull request?
Introduce local allocation buffer to store blocks in memory.
Why are the changes needed?
Fix: #1727
Does this PR introduce any user-facing change?
set
rss.server.buffer.lab.enable
totrue
in server.confHow was this patch tested?
CI and verify in production environment