-
-
Notifications
You must be signed in to change notification settings - Fork 253
Simplify flexible versions #1139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
9289e11 to
8622716
Compare
| @classmethod | ||
| @abc.abstractmethod | ||
| def encode(cls, value: T) -> bytes: ... | ||
| def encode(cls, value: T, flexible: bool) -> bytes: ... |
Check notice
Code scanning / CodeQL
Statement has no effect Note
| @classmethod | ||
| @abc.abstractmethod | ||
| def decode(cls, data: BytesIO) -> T: ... | ||
| def decode(cls, data: BytesIO, flexible: bool) -> T: ... |
Check notice
Code scanning / CodeQL
Statement has no effect Note
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1139 +/- ##
==========================================
+ Coverage 94.81% 95.11% +0.29%
==========================================
Files 88 88
Lines 15710 15656 -54
Branches 1374 1364 -10
==========================================
- Hits 14896 14891 -5
+ Misses 569 526 -43
+ Partials 245 239 -6
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
31b6d5a to
b38742d
Compare
b38742d to
c9d49dd
Compare
|
@ods Let me know if you need additional info. The main reason I would need this to improve the API version coverage is to be able to support/specify tagged field properly. I took inspiration from the java client JSON format, where you give the tag of a field along with the name, like here https://github.com/apache/kafka/blob/trunk/clients/src/main/resources/common/message/ApiVersionsResponse.json#L64 |
|
@vmaurin Thank you for the contribution, and sorry for the delay. I’m not familiar enough with this code for quick answer, so I need to find some time for research. |
No problem @ods My overall goal here is to have something closer to the java client for schemas definitions. In java client, they have these extended json format (one per API request, one per API response) that are then used to generate a java classes. Being in Python, it is probably better to express schema in Python, and we don't really need the code generation as we have the class level facilities.
|
| UnsignedVarInt32.encode(0) | ||
| if flexible and self.allow_flexible | ||
| else Int16.encode(-1, flexible) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The protocol everywhere specify either STRING or COMPACT_STRING. Why do we switch inside of the single class based on property which is not directly related?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took inspiration from the java client schema's json files like here https://github.com/apache/kafka/blob/trunk/clients/src/main/resources/common/message/FindCoordinatorRequest.json#L36
When you specify the schema, it is easier and less to just say "it is a String" and mark the flexible versions rather than having to remember it is a more compact version everywhere. The same of avoiding at each level of schemas to specify it can accept flexible fields.
For flexible fields, like in the java client json files, it is easier to declare it as other fields, with a name and type + the additional tag id, rather than declaring a generic structure on every structs and then having an extra layer of serialization on top
| ("name", String("utf-8")), | ||
| ( | ||
| "partitions", | ||
| CompactArray( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From code here it's not obvious if correct (compact) form will be used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to #1139 (comment)
In the java client json, you can see they just say "it is an array" https://github.com/apache/kafka/blob/trunk/clients/src/main/resources/common/message/AlterPartitionReassignmentsResponse.json#L36
Then, it is because the version is marked "flexible" that it is using the more compact serialization
|
|
||
|
|
||
| class RequestHeader_v0(Struct): | ||
| class RequestHeader_v1(Struct): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do I understand correctly, that it's actually a follow-up fix to the previous PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it is an older "bug" The v0 didn't have a client Id field, so what was named "_v0" here was actually the "_v1" (see https://github.com/apache/kafka/blob/2.8.1/clients/src/main/resources/common/message/RequestHeader.json )
I think it came from a mistake in this KIP https://cwiki.apache.org/confluence/display/KAFKA/KIP-482%3A+The+Kafka+Protocol+should+Support+Optional+Tagged+Fields while it is pretty clear from the schema perspective that the v2 is the first version being "flexible/more compact"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it would be better to cherry-pick it in a separate PR? As this fix is clear and should be certainly merged, while for the rest I'm looking for somebody also to review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you go #1141
The flexible versions is a protocol specificity for newer versions of the API. When an API is flexible, it is using more compact structures and also allow additional "dynamic" fields that could be added without the need to introduce a new API versions. This commit move the flexible versions support to the protocol layer, so it is more transparent and easy when defining Struct classes and schemas. When defining the schema, we can specify a tagged field with a tuple containing the field name and the field tag.
c9d49dd to
ac55e29
Compare
The flexible versions is a protocol specificity for newer versions of the API. When an API is flexible, it is using more compact structures and also allow additional "dynamic" fields that could be added without the need to introduce a new API versions.
This commit move the flexible versions support to the protocol layer, so it is more transparent and easy when defining Struct classes and schemas.
When defining the schema, we can specify a tagged field with a tuple containing the field name and the field tag.
Checklist
CHANGESfolder<issue_id>.<type>(e.g.588.bugfix)issue_idchange it to the pr id after creating the PR.feature: Signifying a new feature..bugfix: Signifying a bug fix..doc: Signifying a documentation improvement..removal: Signifying a deprecation or removal of public API..misc: A ticket has been closed, but it is not of interest to users.Fix issue with non-ascii contents in doctest text files.