@chessbyte chessbyte commented Dec 28, 2025

Issue # (if available)

Follow up to PR #1276

Description of changes

Introduces llrt_tz, a timezone library optimized for fast offset calculations on recent dates. It uses a two-tier architecture:

Architecture:

  • Compact DST rules for current dates - O(1) calculation
  • Compressed historical data - decompressed only when needed
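As a rough illustration of why a rule-based tier can be O(1): a rule such as "second Sunday of March" resolves to a calendar day with a fixed amount of arithmetic, with no table lookup. The sketch below is not taken from llrt_tz; `DstRule`, `nth_weekday`, and `offset_at` are hypothetical names, and it assumes DST starts and ends within the same calendar year (northern-hemisphere style).

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (the well-known "days from civil" algorithm).
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = (month + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + day - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

/// Day of the month of the n-th `weekday` (0 = Sunday) in `month`.
fn nth_weekday(year: i64, month: i64, weekday: i64, n: i64) -> i64 {
    // 1970-01-01 was a Thursday (weekday 4).
    let first = (days_from_civil(year, month, 1) + 4).rem_euclid(7);
    1 + (weekday - first).rem_euclid(7) + 7 * (n - 1)
}

/// Hypothetical compact rule (not llrt_tz's actual layout).
struct DstRule {
    std_offset: i64,        // seconds east of UTC on standard time
    dst_offset: i64,        // seconds east of UTC on daylight time
    start: (i64, i64, i64), // (month, n, weekday) when DST begins
    end: (i64, i64, i64),   // (month, n, weekday) when DST ends
    switch_local_secs: i64, // local wall-clock second of day of the switch
}

/// Offset in effect at `utc`, which must fall within `year`.
/// Assumes DST begins and ends in the same calendar year.
fn offset_at(rule: &DstRule, year: i64, utc: i64) -> i64 {
    let instant = |(m, n, wd): (i64, i64, i64), local_offset: i64| {
        let day = nth_weekday(year, m, wd, n);
        days_from_civil(year, m, day) * 86_400 + rule.switch_local_secs - local_offset
    };
    let dst_start = instant(rule.start, rule.std_offset); // clock still on standard time
    let dst_end = instant(rule.end, rule.dst_offset); // clock still on daylight time
    if utc >= dst_start && utc < dst_end {
        rule.dst_offset
    } else {
        rule.std_offset
    }
}
```

With the US Eastern rule (second Sunday of March to first Sunday of November, switching at 02:00 local), this yields 9 March and 2 November for 2025. Deriving the year from the timestamp is omitted here; it is also constant-time.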

Binary Size Impact:

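| Binary                             | Size (bytes) | vs Baseline | vs chrono-tz |
|------------------------------------|--------------|-------------|--------------|
| Before PR #1276 (no tz support)    | 9,787,408    |             |              |
| After PR #1276 (chrono-tz)         | 11,076,512   | +1.29 MB    |              |
| PR #1304 with lz4_flex             | 10,168,000   | +0.37 MB    | -0.92 MB     |
| PR #1304 with zstd (no dictionary) | 10,035,888   | +0.24 MB    | -1.04 MB     |
| PR #1304 with zstd + dictionary    | 9,986,368    | +0.19 MB    | -1.09 MB     |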

Savings: ~1.1 MB vs chrono-tz (85% reduction in the binary size added by timezone support)
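The percentages quoted in this PR can be reproduced from the binary sizes reported here. The helper below is only a sketch for checking that arithmetic; `reduction_percent` is a hypothetical name, and the reduction is measured on the bytes added on top of the no-timezone baseline, not on the whole binary.

```rust
/// Percentage by which the timezone overhead shrinks, where overhead is
/// the number of bytes a build adds on top of the no-timezone baseline.
fn reduction_percent(baseline: u64, before: u64, after: u64) -> f64 {
    let old_overhead = (before - baseline) as f64;
    let new_overhead = (after - baseline) as f64;
    (old_overhead - new_overhead) / old_overhead * 100.0
}
```

Calling `reduction_percent(9_787_408, 11_076_512, 9_986_368)` gives roughly 84.6, which rounds to the 85% stated for zstd with a dictionary.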

Key features:

  • Implements chrono TimeZone trait for compatibility
  • O(1) offset calculation for recent dates using DST rules
  • O(log n) binary search for historical dates after decompression
  • Graceful fallback to standard offset if decompression fails
  • Morocco/Western Sahara always use historical lookup due to Ramadan-based DST suspension (lunar calendar)
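The lookup path described by these features can be sketched roughly as follows. This is an illustration under stated assumptions, not the llrt_tz code: `Zone`, `Transition`, `resolve_offset`, and the two closures are hypothetical stand-ins for the rule evaluator and the decompression step.

```rust
/// One historical transition: from `at_utc` onward, `offset_secs` applies.
struct Transition {
    at_utc: i64,
    offset_secs: i32,
}

/// Hypothetical zone record; field names are illustrative only.
struct Zone {
    std_offset_secs: i32,
    rules_valid_from: i64,   // UTC instant from which the compact rules apply
    always_historical: bool, // e.g. zones whose DST cannot be expressed as a rule
}

/// Two-tier lookup: compact rules for recent dates, binary search over
/// decompressed transitions otherwise, standard offset as a last resort.
fn resolve_offset(
    zone: &Zone,
    utc: i64,
    rule_offset: impl Fn(i64) -> i32,
    load_history: impl Fn() -> Option<Vec<Transition>>,
) -> i32 {
    // Tier 1: constant-time rule evaluation.
    if !zone.always_historical && utc >= zone.rules_valid_from {
        return rule_offset(utc);
    }
    // Tier 2: O(log n) search in the sorted transition list.
    match load_history() {
        Some(transitions) => {
            // Number of transitions that happened at or before `utc`.
            let idx = transitions.partition_point(|t| t.at_utc <= utc);
            if idx == 0 {
                zone.std_offset_secs
            } else {
                transitions[idx - 1].offset_secs
            }
        }
        // Decompression failed: fall back to the standard offset.
        None => zone.std_offset_secs,
    }
}
```

Setting `always_historical` routes every lookup through the second tier, which is how a zone like Morocco would bypass the rules in this sketch.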

Testing:

  • 432 exhaustive comparison tests (each timezone × 56 years × 12 months)
  • ~7 million offset comparisons against chrono-tz, all passing
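The test harness itself is not shown in this conversation. As a minimal sketch of the comparison idea, the function below samples a time range and records every instant where a candidate offset function disagrees with a reference one; in the PR the reference would be chrono-tz, while here both sides are plain closures and `find_mismatches` is a hypothetical name.

```rust
/// Returns every sampled instant at which the two offset functions disagree,
/// as (timestamp, reference offset, candidate offset).
fn find_mismatches(
    start_utc: i64,
    end_utc: i64,
    step_secs: i64,
    reference: impl Fn(i64) -> i32,
    candidate: impl Fn(i64) -> i32,
) -> Vec<(i64, i32, i32)> {
    assert!(step_secs > 0, "step must be positive");
    let mut mismatches = Vec::new();
    let mut t = start_utc;
    while t < end_utc {
        let (r, c) = (reference(t), candidate(t));
        if r != c {
            mismatches.push((t, r, c));
        }
        t += step_secs;
    }
    mismatches
}
```

A passing run is one where the returned vector is empty for every zone and range sampled.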

Checklist

  • Created unit tests in tests/unit and/or in Rust for my feature if needed
  • Ran make fix to format JS and apply Clippy auto fixes
  • Made sure my code didn't add any additional warnings: make check
  • Added relevant type info in types/ directory
  • Updated documentation if needed (API.md/README.md/Other)

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Introduces llrt_tz, a timezone library optimized for fast offset
calculations on recent dates. Uses a two-tier architecture:

Architecture:
- Compact DST rules for current dates - O(1) calculation
- LZ4-compressed historical data - decompressed only when needed

Binary Size Impact:
| Binary                          | Size       | vs Baseline |
|---------------------------------|------------|-------------|
| Before PR awslabs#1276 (no tz support) | 9,787,408  | —           |
| After PR awslabs#1276 (chrono-tz)      | 11,076,512 | +1.29 MB    |
| This PR (llrt_tz)               | 10,168,000 | +0.37 MB    |

Savings: ~908 KB (70% reduction in timezone data size)

Key features:
- Implements chrono TimeZone trait for compatibility
- O(1) offset calculation for recent dates using DST rules
- O(log n) binary search for historical dates after decompression
- Graceful fallback to standard offset if decompression fails
- Morocco/Western Sahara always use historical lookup due to
  Ramadan-based DST suspension (lunar calendar)

Testing:
- 432 exhaustive comparison tests (each timezone × 56 years × 12 months)
- ~7 million offset comparisons against chrono-tz, all passing

@nabetti1720 nabetti1720 left a comment


We left comments on a few aspects of the implementation that seemed to differ slightly from LLRT's previous approaches.


[dependencies]
# LZ4 for decompressing historical data
lz4_flex = { version = "0.11", default-features = false, features = ["safe-decode"] }

In LLRT, zstd is typically used for this kind of data. Could the size be reduced further by using a dictionary together with a higher compression level?

We recognize that the historical data is needed to look up dates that the current rules do not cover.

Decompression may be slightly slower, but since a cache is also implemented, this is unlikely to matter for most use cases.

If you want performance similar to lz4, zstd with no dictionary and a lower compression level may give comparable results and avoid the need to adopt a new compression crate.


@nabetti1720 addressed in 2nd commit


@nabetti1720 used dictionary (for even smaller binary) in 3rd commit

Address PR review feedback:
- Update llrt_tz version from 0.1.0 to 0.7.0-beta for consistency
- Replace lz4_flex with zstd for historical data compression
- Use zstd level 19 for better compression ratio

Benefits:
- Removes lz4_flex dependency (zstd already used in LLRT)
- Additional ~129KB binary size reduction
- Total timezone overhead now ~248KB (81% reduction vs chrono-tz)

Train a 32KB dictionary on timezone transition samples to improve
compression ratio for historical data:

- Dictionary captures common patterns across all timezone data
- Each timezone's compressed data benefits from shared patterns
- DecoderDictionary is parsed once and reused for all decompressions

Binary size improvement:
- Without dictionary: 10,035,888 bytes (+248KB vs baseline)
- With dictionary:     9,986,368 bytes (+199KB vs baseline)
- Additional savings: ~48KB

Total timezone overhead is now ~199KB (85% reduction vs chrono-tz).
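
The "parsed once and reused" behaviour can be sketched with `std::sync::OnceLock`. This is only an illustration of the pattern: `ParsedDict` is a hypothetical stand-in for the zstd crate's `DecoderDictionary`, the dictionary bytes are a placeholder, and `PARSE_COUNT` exists purely to make the single initialization observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the (expensive) parse step runs; for illustration only.
static PARSE_COUNT: AtomicUsize = AtomicUsize::new(0);

/// Placeholder bytes; a real build would embed the trained dictionary.
static DICT_BYTES: &[u8] = b"placeholder dictionary bytes";

/// Hypothetical stand-in for a parsed decoder dictionary.
struct ParsedDict {
    bytes: &'static [u8],
}

/// Returns the shared dictionary, parsing it on first use only.
fn shared_dict() -> &'static ParsedDict {
    static DICT: OnceLock<ParsedDict> = OnceLock::new();
    DICT.get_or_init(|| {
        PARSE_COUNT.fetch_add(1, Ordering::Relaxed);
        ParsedDict { bytes: DICT_BYTES }
    })
}
```

Every decompression call would then borrow `shared_dict()` instead of re-parsing the dictionary, so the parse cost is paid at most once per process.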
@nabetti1720

@chessbyte Thank you for accepting my suggestion!
I don't have much to say, but @richarddavison and @Sytten may think there's still room for improvement, so please wait for their reviews. :)
