Core: Refactor Table Metadata Tests #11947
base: main
@HonahX, when I introduced this compatibility check, I purposely called it at write time rather than at read time. I've also pushed back on strictness in the read path like this in the REST catalog and other places. The reason for it is that we don't want to immediately reject metadata on read because that prevents the library from being used to debug or fix metadata issues.
For example, this check validates that initial default values are not used in v2 tables. But what if a table is created by another library that way? Checking compatibility here prevents loading the table at all, let alone using this library to drop the problematic column from the schema using SQL DDL or direct calls. Failing to load the table has much broader consequences, like failing existence checks because `tableExists` calls `loadTable`, failing to run `information_schema` queries that are unrelated, or failing to run `expireSnapshots` and remove old data, which can cause compliance problems.
I think that a better time to fail is when the actual problem caused by the compatibility issue happens. For instance, if there is a default that can't be applied, then the library should fail to read from the table.
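To make the read-time versus write-time distinction concrete, here is a minimal sketch. All class and method names below (`LenientReadSketch`, `Metadata`, `read`, `write`, `dropColumn`) are hypothetical stand-ins, not Iceberg's actual API. The only rule taken from the discussion is that initial defaults are not valid in v2 tables.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Iceberg's API: lenient on read, strict on write.
final class LenientReadSketch {
  // Minimal stand-in for table metadata: a format version plus the names of
  // columns that carry an initial default.
  static final class Metadata {
    final int formatVersion;
    final List<String> columnsWithInitialDefault;

    Metadata(int formatVersion, List<String> columnsWithInitialDefault) {
      this.formatVersion = formatVersion;
      this.columnsWithInitialDefault = List.copyOf(columnsWithInitialDefault);
    }
  }

  // Assumption for this sketch: initial defaults require format version 3.
  static final int MIN_VERSION_FOR_DEFAULTS = 3;

  // Read path: no compatibility check, so incompatible metadata still loads
  // and can be inspected or repaired.
  static Metadata read(int formatVersion, List<String> columnsWithInitialDefault) {
    return new Metadata(formatVersion, columnsWithInitialDefault);
  }

  // Write path: reject metadata that this library should never produce.
  static Metadata write(Metadata metadata) {
    if (metadata.formatVersion < MIN_VERSION_FOR_DEFAULTS
        && !metadata.columnsWithInitialDefault.isEmpty()) {
      throw new IllegalStateException(
          "Initial defaults are not supported in v" + metadata.formatVersion
              + ": " + metadata.columnsWithInitialDefault);
    }
    return metadata;
  }

  // A repair that is only possible because read() did not throw.
  static Metadata dropColumn(Metadata metadata, String name) {
    List<String> remaining = new ArrayList<>(metadata.columnsWithInitialDefault);
    remaining.remove(name);
    return new Metadata(metadata.formatVersion, remaining);
  }
}
```

With this shape, a v2 table written by another library with an initial default can still be loaded, the offending column dropped, and the repaired metadata written back, whereas writing the unrepaired metadata fails.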
Thanks for the explanation! So in general we want to be less restrictive on the read side, leaving open the opportunity to fix things instead of rejecting every error up front.
I will revert this change.
I think if this is the case we should have some tests that show that parsing invalid metadata is a behavior allowed by the library. Parsing some invalid JSON should not throw an exception for compatibility purposes? I think we could just take a fully populated V3 metadata and change its format version to 1 or something. This should be readable (but not really usable)? I'm not sure what other cases we would want, but I think we'd be in a better state if we had tests for behaviors we want to keep in the code.
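A rough sketch of what such a test could look like, using a toy parser in place of the real one. Everything here is hypothetical (`InvalidMetadataReadSketch`, `Parsed`, `parse`, `downgradeVersion`), and the regex-based parsing only illustrates the shape of the test; a real test would presumably go through the library's own metadata parser and a fully populated V3 metadata file.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: a toy parser that reads metadata without validating
// that its contents are legal for the declared format version.
final class InvalidMetadataReadSketch {
  static final class Parsed {
    final int formatVersion;
    final boolean usesInitialDefault;

    Parsed(int formatVersion, boolean usesInitialDefault) {
      this.formatVersion = formatVersion;
      this.usesInitialDefault = usesInitialDefault;
    }
  }

  private static final Pattern VERSION =
      Pattern.compile("\"format-version\"\\s*:\\s*(\\d+)");

  // Lenient parse: fails only when the version field is missing entirely.
  static Parsed parse(String json) {
    Matcher matcher = VERSION.matcher(json);
    if (!matcher.find()) {
      throw new IllegalArgumentException("Missing format-version");
    }
    int version = Integer.parseInt(matcher.group(1));
    return new Parsed(version, json.contains("\"initial-default\""));
  }

  // Test helper: rewrite the declared version and leave everything else as is.
  static String downgradeVersion(String json, int version) {
    return VERSION.matcher(json).replaceFirst("\"format-version\": " + version);
  }
}
```

The test would take V3-style JSON containing an `initial-default`, rewrite the version to 1 with `downgradeVersion`, and assert that `parse` still returns a result with `formatVersion == 1` and `usesInitialDefault == true` instead of throwing.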