@salvatore-campagna
Contributor

TSDB codec pipeline stages need to serialize metadata (offsets, counts, flags) efficiently. This PR introduces MetadataBuffer, a reusable byte buffer with variable-length integer encoding that follows Lucene's conventions.

The buffer supports three encoding formats:

  • VInt: non-negative integers using 1-5 bytes, 7 bits per byte with continuation flag
  • VLong: non-negative longs using 1-9 bytes, same encoding scheme
  • ZLong: signed longs using zig-zag encoding followed by variable-length encoding, 1-10 bytes, efficient for values with small absolute magnitude
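The byte counts above follow from the 7-bits-per-byte scheme, and can be checked with a minimal sketch. This is not the PR's implementation: `VarIntSketch` and all of its method names are hypothetical, and it only assumes the Lucene-style layout described in the list (low-order group first, high bit as continuation flag, zig-zag applied before the variable-length step).

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration class; not part of the PR.
final class VarIntSketch {
    private VarIntSketch() {}

    /** Zig-zag maps signed to unsigned so small magnitudes stay small: 0, -1, 1, -2 become 0, 1, 2, 3. */
    public static long zigZagEncode(long v) {
        return (v << 1) ^ (v >> 63);
    }

    public static long zigZagDecode(long v) {
        return (v >>> 1) ^ -(v & 1);
    }

    /** Emits 7 payload bits per byte, low-order group first; the high bit means "more bytes follow". */
    public static byte[] encodeUnsigned(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(10);
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    /** Reverses encodeUnsigned; rejects input that is truncated or longer than 10 bytes. */
    public static long decodeUnsigned(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            if (shift > 63) {
                throw new IllegalStateException("malformed encoding: more than 10 bytes");
            }
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
        throw new IllegalStateException("malformed encoding: truncated input");
    }
}
```

With this sketch, `Integer.MAX_VALUE` (31 bits) takes 5 bytes, `Long.MAX_VALUE` (63 bits) takes 9, and a zig-zag encoded `Long.MIN_VALUE` (all 64 bits set) takes 10, which matches the ranges listed above.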

Key design decisions:

  • Zero-allocation steady-state: buffer grows automatically (doubling strategy) and retains capacity after reset(), avoiding allocations in hot paths
  • Safe API: toByteArray() returns a copy of written bytes; no mutable internal state exposed
  • Encoding validation: detects malformed VInt/VLong/ZLong encodings during read and throws IllegalStateException
  • Interface segregation: MetadataWriter and MetadataReader interfaces enable dependency injection and mocking for testing encode/decode stages independently
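The first two design decisions can be illustrated with a small sketch. It is an assumption-laden stand-in, not the actual `MetadataBuffer`: the class name `GrowableBufferSketch`, the `writeByte` and `capacity` methods, and the constructor argument are hypothetical, while `reset()`, `size()`, and `toByteArray()` mirror the names used in this description.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the PR.
final class GrowableBufferSketch {
    private byte[] data;
    private int size;

    public GrowableBufferSketch(int initialCapacity) {
        data = new byte[Math.max(1, initialCapacity)];
    }

    /** Appends one byte, doubling the backing array only when it is full. */
    public void writeByte(byte b) {
        if (size == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[size++] = b;
    }

    /** Forgets the written bytes but keeps the backing array, so later writes reuse it. */
    public void reset() {
        size = 0;
    }

    public int size() {
        return size;
    }

    public int capacity() {
        return data.length;
    }

    /** Returns a defensive copy; callers cannot mutate the internal array through it. */
    public byte[] toByteArray() {
        return Arrays.copyOf(data, size);
    }
}
```

Once the buffer has grown to the largest payload seen, a write/reset cycle of the same size touches no allocator in this sketch; only `toByteArray()` allocates, which is the cost of not exposing internal state.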

The class has implicit write and read modes; there is no explicit mode switch. After writing, call setPosition(0) to rewind the cursor and start reading. The size() method returns the number of bytes written, which is the upper bound for reads, while position() returns the current cursor location.
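A rough sketch of the write-then-read cycle follows. Only `setPosition`, `size`, and `position` are names taken from this description; the class `CursorBufferSketch`, the `writeVInt`/`readVInt` signatures, and the choice of exception for negative values and out-of-range positions are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the PR.
final class CursorBufferSketch {
    private byte[] data = new byte[16];
    private int position; // current cursor, shared by writes and reads
    private int size;     // bytes written so far; reads may not pass this

    public void writeVInt(int value) {
        if (value < 0) {
            // Assumed exception type; the description only says negative values are rejected.
            throw new IllegalArgumentException("negative value: " + value);
        }
        while ((value & ~0x7F) != 0) {
            writeByte((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        writeByte((byte) value);
    }

    public int readVInt() {
        int result = 0;
        for (int shift = 0; shift <= 28; shift += 7) {
            byte b = readByte();
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        // A fifth byte with the continuation flag set cannot belong to a valid VInt.
        throw new IllegalStateException("malformed VInt: more than 5 bytes");
    }

    public void setPosition(int newPosition) {
        if (newPosition < 0 || newPosition > size) {
            throw new IllegalArgumentException("position out of range: " + newPosition);
        }
        position = newPosition;
    }

    public int position() {
        return position;
    }

    public int size() {
        return size;
    }

    private void writeByte(byte b) {
        if (position == data.length) {
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[position++] = b;
        size = Math.max(size, position);
    }

    private byte readByte() {
        if (position >= size) {
            throw new IllegalStateException("read past end of written data");
        }
        return data[position++];
    }
}
```

In this sketch, writing 300 and then 5 leaves both `size()` and `position()` at 3 (two bytes plus one). After `setPosition(0)`, two `readVInt()` calls return 300 and 5, and a third throws because the cursor has reached `size()`.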

Unit tests cover round-trip correctness for random values, boundary conditions, error handling (negative values, read past end, invalid encodings), buffer growth cycles, and reset/reuse patterns.

Reusable byte buffer with variable-length integer encoding following
Lucene conventions:
- VInt: non-negative integers, 1-5 bytes
- VLong: non-negative longs, 1-9 bytes
- ZLong: signed longs with zig-zag encoding, 1-10 bytes

Buffer grows automatically and retains capacity after reset for
zero-allocation steady-state operation.

Labels

:StorageEngine/TSDB You know, for Metrics v9.4.0
