Decrease latency between time compaction job is added and run

Compactors poll the coordinator for work with exponential backoff.  When all compactors have been idle for a while, their polling intervals have backed off, so when a surge of jobs arrives it can take some time before all of them start working.

One possible way to improve this is to modify how polling works.  The coordinator could hold requests from compactors for a time period when nothing is currently queued.  When something is queued, it could be given immediately to a held compactor RPC request.  The coordinator should not hold an RPC request for too long, because the request could belong to a dead compactor.  It could hold a request for some bounded period, like 60 to 90 seconds, and return nothing if the queue is still empty.  If the compactor is still alive, it can make another request for work, which will be held again if the queue is still empty.
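
A rough sketch of the held-request idea, to make the proposal concrete.  This is not the coordinator's actual code: `HeldRequestJobQueue`, `addJob`, `getJob`, and the use of a plain `String` for a job are all hypothetical names chosen for illustration.  It relies only on the timed `poll` of `java.util.concurrent.BlockingQueue`, which blocks until an element arrives or the timeout expires.

```java
import java.time.Duration;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of a coordinator-side queue that holds work requests. */
final class HeldRequestJobQueue {
  private final BlockingQueue<String> jobs = new LinkedBlockingQueue<>();
  private final Duration maxHold; // assumed bound, e.g. 60 to 90 seconds

  HeldRequestJobQueue(Duration maxHold) {
    this.maxHold = maxHold;
  }

  /** Called when a compaction job is queued; a held request picks it up right away. */
  void addJob(String job) {
    jobs.add(job);
  }

  /**
   * Called on behalf of a compactor's request for work. Returns a job as soon
   * as one is available, or empty if nothing was queued within maxHold.
   */
  Optional<String> getJob() {
    try {
      // Timed poll returns null on timeout, which maps to "return nothing".
      return Optional.ofNullable(jobs.poll(maxHold.toMillis(), TimeUnit.MILLISECONDS));
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return Optional.empty();
    }
  }
}
```

A compactor that receives an empty result would simply ask again.  Note that this sketch parks one thread per held request; a real implementation would probably want to hold requests asynchronously instead of blocking RPC handler threads.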


Decreasing this latency would benefit a system that has lots of small files constantly arriving at tablets.  With a polling model like this and #4618, very low latency could be achieved for compaction of newly bulk imported files.  Files produced by minor compactions would not have a signal, like the one #4618 provides for bulk imports, to queue compaction jobs for a tablet.

Decrease latency between time compaction job is added and run #4664
