-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add comprehensive unit tests to cover edge cases for Delta log path parsing #408
Conversation
This commit introduces two new unit tests to enhance the robustness of our Delta Lake log path parsing: 1. test_invalid_version_and_part_lengths: - Validates correct handling of invalid version lengths (VERSION_LEN) - Checks parsing of invalid multipart checkpoint lengths (MULTIPART_PART_LEN) - Ensures proper error handling for invalid UUID lengths (UUID_PART_LEN) 2. test_empty_filename: - Verifies that attempts to parse URLs ending with a slash (resulting in empty filenames) are correctly rejected These tests ensure that our parsing logic correctly identifies and rejects invalid log paths, maintaining the integrity of our Delta Lake log processing.
test_ok_none_and_unknown_cases: - Validates handling of non-numeric version prefixes (Ok(None) cases) - Verifies correct parsing of missing file extensions (Ok(None) cases) - Ensures proper handling of empty file extensions (Ok(Some) with Unknown type)
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #408 +/- ##
==========================================
- Coverage 78.34% 77.86% -0.48%
==========================================
Files 49 49
Lines 10282 10360 +78
Branches 10282 10360 +78
==========================================
+ Hits 8055 8067 +12
- Misses 1775 1833 +58
- Partials 452 460 +8 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look at this!
This PR definitely adds some good missed cases, but I think most of the newly added cases were already covered by existing tests.
Suggest to add the missed cases to their respective existing tests (in the same style), rather than making entirely new tests. And maybe update the comments and expect_err
calls of those existing cases, since they aren't as nicely described as what this PR does.
This commit enhances the Delta Lake log path parsing tests by adding explicit boundary test cases: 1. UUID length validation in test_uuid_checkpoint_patterns: - Tests UUID with exactly 35 characters (one too short) - Uses realistic UUID format: "3a0d65cd-4056-49b8-937b-95f9e3ee90e" - Makes length requirements clearer for future maintainers 2. File extension handling in test_unknown_invalid_patterns: - Tests file with no extension (returns Ok(None)) - Tests file with empty extension "." (returns Ok(Some) with Unknown type) - Clarifies the distinction between these two cases These boundary tests improve test coverage and make the requirements more explicit in the test suite, while maintaining the existing test structure and style.
c3653e5
to
edbdf1f
Compare
This PR introduces new unit tests to validate the Delta Lake log path parsing logic corner cases thoroughly:
test_invalid_version_and_part_lengths:
test_empty_filename:
are correctly rejected
test_ok_none_and_unknown_cases:
These tests significantly improve coverage of edge cases and error
conditions in log path parsing.