Consistent "failed to flush commits before termination" error during consumer group shutdown #614

@Tasyp

I have the following setup: two consumers of different topics in the same consumer group, distributed across 3 nodes and using partition_assignment_strategy=callback_implemented.

Everything works great, but one thing worries me. During shutdown, I consistently see the following message logged on different nodes:

group_subscriber_v2 *group-id* failed to flush commits before termination :timeout

It is logged at error level, so I treat it as abnormal behavior.

This seems to be a safety mechanism that prevents the call to the group coordinator from hanging forever:

ok = flush_offset_commits(GroupId, Coordinator),
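To illustrate the kind of mechanism I mean, here is a minimal sketch using only OTP's gen_server. It is not brod's actual code: the module name, `flush/2`, `demo/0`, and the `commit_offsets` message are hypothetical stand-ins. It only shows how a bounded call to a busy process ends up as `{error, timeout}` instead of blocking shutdown forever.

```erlang
-module(flush_timeout_sketch).
-behaviour(gen_server).

-export([demo/0, flush/2]).
-export([init/1, handle_call/3, handle_cast/2]).

%% Hypothetical stand-in for a coordinator that needs DelayMs
%% milliseconds before it can answer a commit request
%% (for example because it is busy with a rebalance).
init(DelayMs) ->
    {ok, DelayMs}.

handle_call(commit_offsets, _From, DelayMs) ->
    timer:sleep(DelayMs),
    {reply, ok, DelayMs}.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Bounded call: gen_server:call/3 exits with {timeout, _} when no
%% reply arrives within TimeoutMs; the exit is turned into an error
%% tuple so the caller can log it and continue terminating.
flush(Pid, TimeoutMs) ->
    try
        gen_server:call(Pid, commit_offsets, TimeoutMs)
    catch
        exit:{timeout, _} -> {error, timeout}
    end.

%% One responsive and one slow "coordinator", both called with a
%% 100 ms limit. Returns {ok, {error, timeout}}.
demo() ->
    {ok, Fast} = gen_server:start(?MODULE, 0, []),
    {ok, Slow} = gen_server:start(?MODULE, 500, []),
    {flush(Fast, 100), flush(Slow, 100)}.
```

If the real flush follows a similar pattern, the error would mean the coordinator process was alive but did not reply in time, rather than that the commit was rejected.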

Could it be related to the use of the callback-implemented partition assignment strategy? For example, the original group leader has already shut down, a new one is elected and starts its preparatory work, and that is when the call to flush offsets comes in.

Are there any logs/other information I could provide to simplify the investigation?
