-
Notifications
You must be signed in to change notification settings - Fork 13.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-37298] Added Pluggable Components for BatchStrategy & BufferWrapper in AsyncSinkWriter. #26274
base: master
Are you sure you want to change the base?
[FLINK-37298] Added Pluggable Components for BatchStrategy & BufferWrapper in AsyncSinkWriter. #26274
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good, thanks for delivering this. Just a couple of questions and nits.
* | ||
* @param <RequestEntryT> the type of request entry in this batch | ||
*/ | ||
@PublicEvolving |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if this fits better as @Internal
or PublicEvolving? It is not intended to be exposed to public users
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @vahmed-hamdy , So Batch is created by BatchCreator interface, which can be inherited and implemented by the Sink Implementor. So i was under the impression that anything that external developers are expected to use should NOT be @Internal as they might change across releases. Flink sink developers might create custom implementations of BatchCreator or BufferWrapper for better performance or sink-specific logic. But please correct me if i am wrong. Will change it if required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I don't have a strong opinion here, I also see classes like AsyncSinkWriterConfiguration
and RateLimitingStrategy
marked PublicEvolving
for the same reason. let's keep it then
* | ||
* @param <RequestEntryT> the type of the request entries to be batched | ||
*/ | ||
@PublicEvolving |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as Batch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied above.
* | ||
* @param <RequestEntryT> The type of request entries being buffered. | ||
*/ | ||
@PublicEvolving |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replied above.
AsyncSinkWriterImpl initialSinkWriter = | ||
new AsyncSinkWriterImplBuilder().context(sinkInitContext).build(); | ||
|
||
assertThat(initialSinkWriter.getBufferedRequestEntries()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't test using default when passing nulls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added null BatchCreator & BatchWrapper to AsyncSinkWriterImpl in the test case.
* @param maxBatchSizeInBytes the maximum cumulative size in bytes allowed for any single batch | ||
*/ | ||
private SimpleBatchCreator(long maxBatchSizeInBytes) { | ||
this.maxBatchSizeInBytes = maxBatchSizeInBytes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit:
this.maxBatchSizeInBytes = maxBatchSizeInBytes; | |
this.maxBatchSizeInBytes = Preconditions.checkArgument(maxBatchSizeInBytes > 0); |
Also can test that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added this and a corresponding test case as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks Good, thanks for the huge efforts
...-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/BufferWrapper.java
Show resolved
Hide resolved
...-connector-base/src/main/java/org/apache/flink/connector/base/sink/writer/BufferWrapper.java
Outdated
Show resolved
Hide resolved
* construct a new (retry) request entry from the response and add that back to the queue for | ||
* later retry. | ||
* <p>The buffer stores {@link RequestEntryWrapper} objects rather than raw {@link | ||
* RequestEntryT} instances, as buffering wrapped entries allows for better tracking of size and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not seeing the main code issuing retry anywhere ie. add(entry, true) .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The retry logic is handled inside the completeRequest() method.
Specifically, when an entry fails, it is re-added at the head of the buffer using:
while (iterator.hasPrevious()) {
addEntryToBuffer(iterator.previous(), true);
}
This ensures failed entries are retried on the next flush.
Let me know if you’d like any further clarification! This part of the code has not been changed and it remains the same. Reference
...onnector-base/src/main/java/org/apache/flink/connector/base/sink/writer/AsyncSinkWriter.java
Outdated
Show resolved
Hide resolved
could we have the docs change to publicise this change and advice on supplying pluggable strategies and BufferWrapper. |
Thanks for the suggestion! The JavaDocs for BatchCreator and BufferWrapper already document their purpose and usage. Regarding additional documentation changes since there's an ongoing vote on this FLIP, would it make sense to wait until finalized and then follow up with documentation updates in a separate PR? Let me know what you think—happy to adjust based on the preferred approach! :) |
8b97772
to
f3c3386
Compare
Hey @dannycranmer can you help with this review? |
What is the purpose of the change
FLIP-509
Brief change log
Verifying this change
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: yesDocumentation