[FLINK-37298] Added Pluggable Components for BatchStrategy & BufferWrapper in AsyncSinkWriter. #26274
Merged: hlteoh37 merged 10 commits into apache:master from Poorvankbhatia:FLINK-37298-async_custom_batch on Apr 15, 2025
Commits
98ac791 [FLINK-37298] Add custom batch handling in AsyncSinkWriter. (Poorvankbhatia)
e7bc00a License added for new files. (Poorvankbhatia)
d7e66a4 Review Comments Incorporated (Poorvankbhatia)
f3b25ea Corrected comments based on review. (Poorvankbhatia)
2a5c048 Comment correction (Poorvankbhatia)
76b8ff9 Spotless check done (Poorvankbhatia)
cdab37f Merge branch 'master' into FLINK-37298-async_custom_batch (Poorvankbhatia)
a01287d Review comments incorporated for async custom batch (Poorvankbhatia)
b853d31 Removed Builder Pattern and corrected Test case. (Poorvankbhatia)
b17e933 Marked Deque constructor as deprecated. (Poorvankbhatia)
90 changes: 90 additions & 0 deletions
...flink-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/Batch.java
@@ -0,0 +1,90 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.flink.connector.base.sink.writer;

import org.apache.flink.annotation.PublicEvolving;

import java.io.Serializable;
import java.util.List;

/**
 * A container for the result of creating a batch of request entries, including:
 *
 * <ul>
 *   <li>The actual list of entries forming the batch
 *   <li>The total size in bytes of those entries
 *   <li>The total number of entries in the batch
 * </ul>
 *
 * <p>Instances of this class are typically created by a {@link BatchCreator} to summarize which
 * entries have been selected for sending downstream and to provide any relevant metrics for
 * tracking, such as the byte size or the record count.
 *
 * @param <RequestEntryT> the type of request entry in this batch
 */
@PublicEvolving
public class Batch<RequestEntryT extends Serializable> {

    /** The list of request entries in this batch. */
    private final List<RequestEntryT> batchEntries;

    /** The total size in bytes of the entire batch. */
    private final long sizeInBytes;

    /** The total number of entries in the batch. */
    private final int recordCount;

    /**
     * Creates a new {@code Batch} with the specified entries and total size. The record count is
     * derived from the number of entries at construction time.
     *
     * @param requestEntries the list of request entries that form the batch
     * @param sizeInBytes the total size in bytes of the entire batch
     */
    public Batch(List<RequestEntryT> requestEntries, long sizeInBytes) {
        this.batchEntries = requestEntries;
        this.sizeInBytes = sizeInBytes;
        this.recordCount = requestEntries.size();
    }

    /**
     * Returns the list of request entries in this batch.
     *
     * @return a list of request entries for the batch
     */
    public List<RequestEntryT> getBatchEntries() {
        return batchEntries;
    }

    /**
     * Returns the total size in bytes of the batch.
     *
     * @return the batch's cumulative byte size
     */
    public long getSizeInBytes() {
        return sizeInBytes;
    }

    /**
     * Returns the total number of entries in the batch.
     *
     * @return the record count in the batch
     */
    public int getRecordCount() {
        return recordCount;
    }
}
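A minimal sketch of how a caller might build and read a `Batch`. It is illustrative only and not part of the PR: the nested `Batch` is a local stand-in that mirrors the class in the diff so the example compiles without Flink, and `BatchUsageSketch`, `batchOf`, and the UTF-8 size measure are hypothetical. Because the constructor in the diff stores the list reference as-is and fixes the record count at construction time, the sketch passes in a snapshot of the entries.

```java
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: shows constructing a Batch and reading its metrics. */
final class BatchUsageSketch {

    /** Local stand-in with the same fields and constructor shape as Batch in the diff. */
    static final class Batch<T extends Serializable> {
        private final List<T> batchEntries;
        private final long sizeInBytes;
        private final int recordCount;

        Batch(List<T> requestEntries, long sizeInBytes) {
            this.batchEntries = requestEntries;
            this.sizeInBytes = sizeInBytes;
            this.recordCount = requestEntries.size();
        }

        List<T> getBatchEntries() {
            return batchEntries;
        }

        long getSizeInBytes() {
            return sizeInBytes;
        }

        int getRecordCount() {
            return recordCount;
        }
    }

    /** Hypothetical helper: sizes each String entry by its UTF-8 encoded length. */
    static Batch<String> batchOf(List<String> entries) {
        long size = 0;
        for (String entry : entries) {
            size += entry.getBytes(StandardCharsets.UTF_8).length;
        }
        // Snapshot the list so that later mutations by the caller cannot make
        // getBatchEntries().size() disagree with getRecordCount().
        List<String> snapshot = Collections.unmodifiableList(new ArrayList<>(entries));
        return new Batch<>(snapshot, size);
    }
}
```

With this helper, `batchOf` applied to a list holding `"ab"` and `"cde"` yields a batch whose record count and size stay fixed even if the source list is modified afterwards.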
68 changes: 68 additions & 0 deletions
...onnector-base/src/main/java/org/apache/flink/connector/base/sink/writer/BatchCreator.java
@@ -0,0 +1,68 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package org.apache.flink.connector.base.sink.writer;

import org.apache.flink.annotation.PublicEvolving;
import org.apache.flink.connector.base.sink.writer.strategy.RequestInfo;

import java.io.Serializable;
import java.util.Deque;

/**
 * A pluggable interface for forming batches of request entries from a buffer. Implementations
 * control how many entries are grouped together and in what manner before sending them downstream.
 *
 * <p>The {@code AsyncSinkWriter} (or similar sink component) calls {@link
 * #createNextBatch(RequestInfo, RequestBuffer)} when it decides to flush or otherwise gather a new
 * batch of elements. For instance, a batch creator might limit the batch by the number of
 * elements, total payload size, or any custom partition-based strategy.
 *
 * @param <RequestEntryT> the type of the request entries to be batched
 */
@PublicEvolving
public interface BatchCreator<RequestEntryT extends Serializable> {

    /**
     * Creates the next batch of request entries based on the provided {@link RequestInfo} and the
     * currently buffered entries.
     *
     * <p>This method is expected to:
     *
     * <ul>
     *   <li>Mutate the {@code bufferedRequestEntries} by polling/removing elements from it.
     *   <li>Return a batch containing the selected entries.
     * </ul>
     *
     * <p><strong>Thread-safety note:</strong> This method is called from {@code flush()}, which is
     * executed on the Flink main thread. Implementations should assume single-threaded access and
     * must not be shared across subtasks.
     *
     * <p><strong>Contract:</strong> Implementations must ensure that any entry removed from {@code
     * bufferedRequestEntries} is either added to the returned batch or properly handled (e.g.,
     * retried or logged), and not silently dropped.
     *
     * @param requestInfo information about the desired request properties or constraints (e.g., an
     *     allowed batch size or other relevant hints)
     * @param bufferedRequestEntries the buffer (for example, one backed by a {@link Deque}) holding
     *     all currently buffered entries waiting to be grouped into batches
     * @return a {@link Batch} containing the new batch of entries along with metadata about the
     *     batch (e.g., total byte size, record count)
     */
    Batch<RequestEntryT> createNextBatch(
            RequestInfo requestInfo, RequestBuffer<RequestEntryT> bufferedRequestEntries);
}
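The contract described in the Javadoc above (poll entries from the buffer, return them in a batch, never drop one silently) can be sketched as a creator that caps batches by entry count and byte size. This is a hedged illustration, not the implementation shipped in the PR. `RequestBuffer` and `RequestInfo` are not shown in this diff, so the nested `RequestInfo`, `RequestBuffer`, `SizedEntry`, `DequeBuffer`, and `Batch` types are simplified, hypothetical stand-ins whose method names are assumptions, and `SizeCappedBatchCreator` is an invented name.

```java
import java.io.Serializable;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative only: all nested types are stand-ins, not the real Flink classes. */
final class BatchCreatorSketch {

    /** Assumed shape: exposes the maximum number of entries allowed in the next batch. */
    interface RequestInfo {
        int getBatchSize();
    }

    /** Hypothetical wrapper pairing an entry with its size in bytes. */
    static final class SizedEntry<T extends Serializable> {
        final T entry;
        final long size;

        SizedEntry(T entry, long size) {
            this.entry = entry;
            this.size = size;
        }
    }

    /** Assumed minimal buffer API: inspect the head, remove the head, check emptiness. */
    interface RequestBuffer<T extends Serializable> {
        SizedEntry<T> peek();

        SizedEntry<T> poll();

        boolean isEmpty();
    }

    /** Stand-in for Batch from the diff. */
    static final class Batch<T extends Serializable> {
        private final List<T> batchEntries;
        private final long sizeInBytes;

        Batch(List<T> entries, long sizeInBytes) {
            this.batchEntries = entries;
            this.sizeInBytes = sizeInBytes;
        }

        List<T> getBatchEntries() { return batchEntries; }

        long getSizeInBytes() { return sizeInBytes; }

        int getRecordCount() { return batchEntries.size(); }
    }

    interface BatchCreator<T extends Serializable> {
        Batch<T> createNextBatch(RequestInfo requestInfo, RequestBuffer<T> bufferedRequestEntries);
    }

    /** FIFO buffer backed by an ArrayDeque. */
    static final class DequeBuffer<T extends Serializable> implements RequestBuffer<T> {
        private final Deque<SizedEntry<T>> deque = new ArrayDeque<>();

        void add(T entry, long size) { deque.addLast(new SizedEntry<>(entry, size)); }

        public SizedEntry<T> peek() { return deque.peekFirst(); }

        public SizedEntry<T> poll() { return deque.pollFirst(); }

        public boolean isEmpty() { return deque.isEmpty(); }
    }

    /** Takes entries in FIFO order until the count limit or the byte cap would be exceeded. */
    static final class SizeCappedBatchCreator<T extends Serializable> implements BatchCreator<T> {
        private final long maxBatchSizeInBytes;

        SizeCappedBatchCreator(long maxBatchSizeInBytes) {
            this.maxBatchSizeInBytes = maxBatchSizeInBytes;
        }

        @Override
        public Batch<T> createNextBatch(RequestInfo requestInfo, RequestBuffer<T> buffer) {
            List<T> entries = new ArrayList<>();
            long bytes = 0;
            while (!buffer.isEmpty() && entries.size() < requestInfo.getBatchSize()) {
                // Peek first: an entry is only removed once we know it joins the batch,
                // so nothing polled from the buffer is ever dropped.
                SizedEntry<T> head = buffer.peek();
                // Design choice of this sketch: always accept the first entry so that a
                // single oversized entry cannot stall the buffer.
                if (!entries.isEmpty() && bytes + head.size > maxBatchSizeInBytes) {
                    break;
                }
                buffer.poll();
                entries.add(head.entry);
                bytes += head.size;
            }
            return new Batch<>(entries, bytes);
        }
    }
}
```

For example, with three 4-byte entries buffered, a byte cap of 10, and a `RequestInfo` allowing 5 entries, the first call returns the first two entries (8 bytes) and the second call returns the remaining one.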