Kafka output duplicate messages and idempotent producer support #48633

@zzzwqh

**Describe the enhancement:** Idempotent producer support for the Kafka output.

**Describe a specific use case for the enhancement or feature:**

We have observed duplicate messages on the Kafka side. Given the Filebeat output configuration below, could the following sequence be the cause?

1. Filebeat sends a batch of logs.
2. Kafka writes the batch successfully.
3. Kafka prepares to return the ACK, but network jitter, packet loss, or latency keeps it from reaching Filebeat.
4. Filebeat does not receive the ACK and treats the delivery as failed.
5. Filebeat retries the batch.
6. Kafka writes the same data again, which produces duplicates.

Does Filebeat 7.9.3 support the idempotent producer? If not, is it supported in any other version?
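
To make the suspected sequence concrete, here is a minimal Python sketch of it. It is a toy model, not Filebeat or Kafka client code: `Broker`, `send_with_retry`, and the `ack_lost_on_attempts` parameter are hypothetical names invented for illustration. The deduplication branch assumes the general idea behind Kafka's idempotent producer (the broker remembers the last sequence number written per producer ID) and leaves out epochs, per-partition state, and everything else a real broker tracks.

```python
from dataclasses import dataclass, field


@dataclass
class Broker:
    """Toy stand-in for one Kafka partition (hypothetical, not a real client API)."""
    idempotent: bool = False
    log: list = field(default_factory=list)
    last_seq: dict = field(default_factory=dict)  # producer_id -> last sequence written

    def append(self, producer_id: int, seq: int, batch: str) -> bool:
        """Write a batch and return True to represent the ACK."""
        if self.idempotent and self.last_seq.get(producer_id, -1) >= seq:
            # Batch was already written: acknowledge again without appending.
            return True
        self.log.append(batch)
        self.last_seq[producer_id] = seq
        return True


def send_with_retry(broker, producer_id, seq, batch,
                    ack_lost_on_attempts=(1,), max_attempts=3):
    """Send a batch, retrying whenever the ACK is lost on the way back."""
    for attempt in range(1, max_attempts + 1):
        acked = broker.append(producer_id, seq, batch)
        # The write succeeded, but the producer only learns that if the ACK arrives.
        if acked and attempt not in ack_lost_on_attempts:
            return attempt
    raise RuntimeError("batch was never acknowledged")


# Without idempotence, the retry after the lost ACK is written a second time.
plain = Broker(idempotent=False)
send_with_retry(plain, producer_id=1, seq=0, batch="batch-0")
print(plain.log)  # ['batch-0', 'batch-0']

# With sequence-number deduplication, the retry is acknowledged but not rewritten.
dedup = Broker(idempotent=True)
send_with_retry(dedup, producer_id=1, seq=0, batch="batch-0")
print(dedup.log)  # ['batch-0']
```

In this model the write in step 2 and the retry in step 5 are indistinguishable to a broker that keeps no per-producer state, which is why the duplicate appears even with `required_acks: -1`.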

My Filebeat config file:

    output.kafka:
      hosts: [kafka1.com:9092,kafka2.com:9092,kafka3.com:9092]
      worker: 4
      loadbalance: true
      version: 2.0.0
      topic: "xxx_%{[data.xxx]}"
      partition.round_robin:
        group_events: 100
        reachable_only: true
      required_acks: -1
      compression: gzip
      compression_level: 9
      bulk_max_size: 40960
      bulk_flush_frequency: 1s
      max_message_bytes: 1000000
