feat(tz): add compact timezone library replacing chrono-tz #1304
base: main
Introduces llrt_tz, a timezone library optimized for fast offset calculations on recent dates. Uses a two-tier architecture:

Architecture:
- Compact DST rules for current dates - O(1) calculation
- LZ4-compressed historical data - decompressed only when needed

Binary Size Impact:

| Binary | Size | vs Baseline |
|---------------------------------|------------|-------------|
| Before PR awslabs#1276 (no tz support) | 9,787,408 | - |
| After PR awslabs#1276 (chrono-tz) | 11,076,512 | +1.29 MB |
| This PR (llrt_tz) | 10,168,000 | +0.37 MB |

Savings: ~908 KB (70% reduction in timezone data size)

Key features:
- Implements chrono TimeZone trait for compatibility
- O(1) offset calculation for recent dates using DST rules
- O(log n) binary search for historical dates after decompression
- Graceful fallback to standard offset if decompression fails
- Morocco/Western Sahara always use historical lookup due to Ramadan-based DST suspension (lunar calendar)

Testing:
- 432 exhaustive comparison tests (each timezone × 56 years × 12 months)
- ~7 million offset comparisons against chrono-tz, all passing
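The two-tier lookup described in this commit message can be sketched roughly as follows. This is not the llrt_tz source: `Transition`, `DstWindow`, `Zone`, and all field names are hypothetical, and the DST window is assumed to be already resolved to two UTC instants for the year in question (deriving it from an "nth weekday of month" rule is constant-time calendar arithmetic that is left out here). `None` in `historical` stands in for a failed decompression.

```rust
/// One historical transition: from `at` (Unix seconds, UTC) onward the
/// zone's UTC offset is `offset_secs`.
#[derive(Clone, Copy, Debug)]
pub struct Transition {
    pub at: i64,
    pub offset_secs: i32,
}

/// DST window for one year, already resolved to UTC instants (assumption).
#[derive(Clone, Copy, Debug)]
pub struct DstWindow {
    pub start: i64,
    pub end: i64,
}

/// Hypothetical zone record; every name here is illustrative only.
pub struct Zone<'a> {
    pub std_offset_secs: i32,
    pub dst_offset_secs: i32,
    /// Instants at or after this are answered from the compact rule.
    pub rules_valid_from: i64,
    /// Sorted by `at`. `None` stands in for a failed decompression.
    pub historical: Option<&'a [Transition]>,
}

impl<'a> Zone<'a> {
    /// Tier 1: constant-time comparison against the resolved DST window.
    /// Assumes `start < end` within the year.
    fn offset_from_rule(&self, ts: i64, window: DstWindow) -> i32 {
        if ts >= window.start && ts < window.end {
            self.dst_offset_secs
        } else {
            self.std_offset_secs
        }
    }

    /// Tier 2: O(log n) binary search over the sorted transition table.
    fn offset_from_history(&self, ts: i64) -> i32 {
        let table = match self.historical {
            Some(table) => table,
            // Graceful fallback: no table available, use the standard offset.
            None => return self.std_offset_secs,
        };
        // Index of the first transition that starts strictly after `ts`.
        let idx = table.partition_point(|t| t.at <= ts);
        if idx == 0 {
            // Before the first recorded transition; this sketch falls back
            // to the standard offset (assumption).
            self.std_offset_secs
        } else {
            table[idx - 1].offset_secs
        }
    }

    /// Dispatch between the two tiers based on the timestamp.
    pub fn offset_at(&self, ts: i64, window: DstWindow) -> i32 {
        if ts >= self.rules_valid_from {
            self.offset_from_rule(ts, window)
        } else {
            self.offset_from_history(ts)
        }
    }
}
```

A zone that must always use the historical path, as the commit message describes for Morocco and Western Sahara, could be modeled in this sketch by setting `rules_valid_from` to `i64::MAX`.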
710ca95 to 8ebd11e
nabetti1720
left a comment
We have commented on a few places where this implementation differs slightly from the approaches LLRT has used so far.
libs/llrt_tz/Cargo.toml (outdated)

[dependencies]
# LZ4 for decompressing historical data
lz4_flex = { version = "0.11", default-features = false, features = ["safe-decode"] }
LLRT generally uses zstd for this kind of compression. Would it be possible to reduce the size further by using a dictionary together with a higher compression level?
As we understand it, the historical data is only needed when looking up dates that the current rules do not cover.
Decompression may be slightly slower, but since a cache is also implemented, this is unlikely to be an issue for most use cases.
If you want performance similar to lz4, zstd with no dictionary and a lower compression level may give comparable results and would avoid adopting a new compression crate.
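The caching point can be illustrated with the standard library alone. The sketch below is an assumption about the general pattern (decode on first use, reuse afterwards, remember a failure so callers can fall back), not the cache in this PR. `LazyHistory` and `decode_transitions` are hypothetical names, and the "decoder" is a trivial stand-in that reads little-endian `i64` values rather than a real lz4 or zstd call.

```rust
use std::sync::OnceLock;

/// Stand-in for the real decompressor (assumption). It reads the input as
/// little-endian i64 timestamps and fails if the length is not a multiple of 8.
fn decode_transitions(raw: &[u8]) -> Option<Vec<i64>> {
    if raw.len() % 8 != 0 {
        return None;
    }
    let mut out = Vec::with_capacity(raw.len() / 8);
    for chunk in raw.chunks_exact(8) {
        let mut bytes = [0u8; 8];
        bytes.copy_from_slice(chunk);
        out.push(i64::from_le_bytes(bytes));
    }
    Some(out)
}

/// Hypothetical holder for one zone's compressed historical data.
pub struct LazyHistory {
    raw: &'static [u8],
    decoded: OnceLock<Option<Vec<i64>>>,
}

impl LazyHistory {
    pub const fn new(raw: &'static [u8]) -> Self {
        Self { raw, decoded: OnceLock::new() }
    }

    /// Decode on first use; later calls return the cached slice.
    /// A failed decode is cached as `None`, so the caller can fall back to
    /// the standard offset without retrying on every lookup.
    pub fn transitions(&self) -> Option<&[i64]> {
        self.decoded
            .get_or_init(|| decode_transitions(self.raw))
            .as_deref()
    }
}
```

With this shape, the decompression cost is paid at most once per zone, which is why a somewhat slower codec matters little for repeated historical lookups.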
@nabetti1720 addressed in 2nd commit
@nabetti1720 used dictionary (for even smaller binary) in 3rd commit
Address PR review feedback:
- Update llrt_tz version from 0.1.0 to 0.7.0-beta for consistency
- Replace lz4_flex with zstd for historical data compression
- Use zstd level 19 for better compression ratio

Benefits:
- Removes lz4_flex dependency (zstd already used in LLRT)
- Additional ~129KB binary size reduction
- Total timezone overhead now ~248KB (81% reduction vs chrono-tz)
Train a 32KB dictionary on timezone transition samples to improve compression ratio for historical data:
- Dictionary captures common patterns across all timezone data
- Each timezone's compressed data benefits from shared patterns
- DecoderDictionary is parsed once and reused for all decompressions

Binary size improvement:
- Without dictionary: 10,035,888 bytes (+248KB vs baseline)
- With dictionary: 9,986,368 bytes (+199KB vs baseline)
- Additional savings: ~48KB

Total timezone overhead is now ~199KB (85% reduction vs chrono-tz).
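The percentages quoted across these commit messages can be reproduced from the byte counts given in this thread. The snippet below is only a sanity check; the constant and function names are made up for illustration, and the sizes are copied from the commit messages above.

```rust
// Binary sizes in bytes, as quoted in the commit messages in this thread.
const BASELINE: u64 = 9_787_408; // before PR #1276, no tz support
const CHRONO_TZ: u64 = 11_076_512; // after PR #1276
const LLRT_TZ_LZ4: u64 = 10_168_000; // first commit of this PR
const LLRT_TZ_ZSTD: u64 = 10_035_888; // zstd level 19, no dictionary
const LLRT_TZ_ZSTD_DICT: u64 = 9_986_368; // zstd with 32KB dictionary

/// Timezone overhead of a build relative to the no-tz baseline, in bytes.
fn overhead(size: u64) -> u64 {
    size - BASELINE
}

/// Reduction in timezone overhead relative to chrono-tz, as a rounded percentage.
fn reduction_pct(size: u64) -> u64 {
    let chrono = overhead(CHRONO_TZ) as f64;
    (100.0 * (1.0 - overhead(size) as f64 / chrono)).round() as u64
}
```

Here `overhead(LLRT_TZ_ZSTD)` is 248,480 bytes and `overhead(LLRT_TZ_ZSTD_DICT)` is 198,960 bytes, and `reduction_pct` gives 70, 81, and 85 for the three llrt_tz builds, matching the figures reported above. The total saving against the chrono-tz build is `CHRONO_TZ - LLRT_TZ_ZSTD_DICT`, or 1,090,144 bytes.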
@chessbyte Thank you for accepting my suggestion!
Issue # (if available)
Follow up to PR #1276
Description of changes
Introduces llrt_tz, a timezone library optimized for fast offset calculations on recent dates. Uses a two-tier architecture:
Architecture:
Binary Size Impact:
Savings: ~1.1 MB (85% reduction in timezone data size)
Key features:
Testing:
Checklist
- tests/unit and/or in Rust for my feature if needed
- make fix to format JS and apply Clippy auto fixes
- make check
- types/ directory

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.