Describe the bug
When sending messages through the Fluent Bit kafka output, we see error messages in Fluent Bit after a while. These messages appear semi-randomly; they do not follow the 30-second cadence at which we send Fluent Bit logs to Kafka.
Even though we see this error, we do not lose any messages. Everything successfully reaches our Kafka servers.
To Reproduce
Example log
[2025/03/24 16:44:54] [error] [output:kafka:kafka.1] fluent-bit#producer-2: [thrd:ssl://(address):9096/bootstrap]: 5/5 brokers are down
[2025/03/24 16:44:55] [error] [output:kafka:kafka.2] fluent-bit#producer-3: [thrd:ssl://(address):9096/bootstrap]: 5/5 brokers are down
[2025/03/24 16:45:02] [error] [output:kafka:kafka.0] fluent-bit#producer-1: [thrd:ssl://(address):9096/bootstrap]: 5/5 brokers are down
[2025/03/24 16:45:15] [error] [output:kafka:kafka.1] fluent-bit#producer-2: [thrd:ssl://(address):9096/bootstrap]: 5/5 brokers are down
[2025/03/24 16:45:15] [error] [output:kafka:kafka.2] fluent-bit#producer-3: [thrd:ssl://(address):9096/bootstrap]: 5/5 brokers are down
[SERVICE]
Flush 30
Daemon off
tls on
tls.verify on
tls.ca_path /etc/ssl/
Expected behavior
We expect to not see these errors if all 5 of our servers are running successfully.
Your Environment
Version used: 3.0.7 and 3.2.10
Environment name and version (e.g. Kubernetes? What version?): Kubernetes, GitVersion:"v1.18.2-rc2+k3s1"
Server type and version:
Operating System and version: Ubuntu 20.04.6 LTS
Filters and plugins: [FILTERS]: nest, modify, record_modifier, lua
Additional context
We looked at the open connections using net-tools and nsenter -t. We found many open connections, because Fluent Bit stayed connected to all 5 of our brokers. With three inputs using this output and 5 brokers, we were seeing around 15 open connections at a time.
To reduce this we set rdkafka.connections.max.idle.ms 20000, which brings the connections back down to 3-4 (for our 3 inputs). However, with this setting we see the "brokers are down" error. Increasing the value to 70000 gets rid of the error, but raises our connection count to 4-6.
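For reference, a minimal sketch of where this setting would sit in one of the Kafka outputs, assuming a classic-mode config file. The broker host names, topic name, and Match pattern are placeholders and not our real values; only the rdkafka.connections.max.idle.ms line reflects what is described above.

```
[OUTPUT]
    # Placeholder values: Match, Brokers, and Topics are illustrative only
    Name     kafka
    Match    *
    Brokers  broker-1.example:9096,broker-2.example:9096
    Topics   example-topic
    # Passed through to librdkafka. 20000 triggered the "brokers are down"
    # error for us; 70000 did not, at the cost of a few more open connections.
    rdkafka.connections.max.idle.ms  70000
```

Any rdkafka.* key in the output section is handed to librdkafka as a property, so the same line would need to be repeated in each Kafka output that should use it.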