
PQ could benefit from event-level compression #17819

Open

Description

@yaauie

As write throughput is bound by disk IO, compressing events during serialization could improve throughput at the cost of CPU (see: proof of concept).

If possible, per-event compression should be delivered within the scope of the existing v2 PQ page format, in which entries contain only seqnum+length+N bytes. To do this, the reader will need to handle compressed or uncompressed bytes without additional context (e.g., by differentiating a zlib header from the existing CBOR first bytes).
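
As a rough illustration, here is a minimal sketch of how a reader might make that per-entry decision, assuming compressed entries use zlib framing (RFC 1950) and uncompressed entries remain raw CBOR. The class and method names (`EntryCodec`, `looksLikeZlib`, `maybeDecompress`) are hypothetical and not existing PQ code, and a real implementation would need to verify that no valid CBOR event payload can begin with bytes that also form a valid zlib header.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.Inflater;
import java.util.zip.InflaterInputStream;

// Hypothetical helper: decides per entry whether the payload is zlib-compressed
// or raw CBOR, and returns the serialized event bytes either way.
public final class EntryCodec {

    // RFC 1950: a zlib stream begins with CMF and FLG bytes where
    // (CMF * 256 + FLG) % 31 == 0. Only CMF == 0x78 (deflate, 32K window)
    // is accepted here, which is what java.util.zip's Deflater emits by default.
    static boolean looksLikeZlib(final byte[] payload) {
        if (payload.length < 2) {
            return false;
        }
        final int cmf = payload[0] & 0xFF;
        final int flg = payload[1] & 0xFF;
        return cmf == 0x78 && ((cmf << 8) + flg) % 31 == 0;
    }

    static byte[] maybeDecompress(final byte[] payload) throws IOException {
        if (!looksLikeZlib(payload)) {
            // Assume raw CBOR written by a Logstash without compression support.
            return payload;
        }
        try (InflaterInputStream in = new InflaterInputStream(
                 new ByteArrayInputStream(payload), new Inflater());
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            final byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

The write side would apply the mirror-image choice, deflating the CBOR bytes only when the pipeline has opted in, so the page format itself stays untouched.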

Because not all users will want to spend CPU for increased throughput, and because of the rollback barrier described below, this feature should initially be delivered as opt-in, preferably at the per-pipeline level.

Compatibility Considerations

Once a queue contains compressed events, it can no longer be read by a Logstash instance that does not support event decompression; this creates an undesirable rollback barrier that could prevent a user from rolling back to a last-known-working configuration when troubleshooting an unrelated issue.

Queue compression should remain opt-in until at least three minor versions have shipped with decompression support.
