Currently, hermes-management has a bug that is triggered by deleting a topic and then quickly recreating it.
The story is the same every time:
- someone deletes a topic
- topic is recreated
- the Kafka producer has stale metadata for that topic
- the Kafka producer fails to send messages to the brokers
- messages are buffered in hermes frontend instances
- we need to restart frontend instances in order for the messages to be retransmitted
The issue was thought to be solved by the upgrade to Kafka client 2.8.2, but it appeared again recently. We would like to have a workaround for it.
One of the proposed solutions is to introduce a "grace period" for deleted topics: if someone deletes a topic, we should block the creation of a topic with the same name for long enough that the cluster and the Kafka producers can reach a consistent state. A period slightly longer than 5 minutes is probably enough, because producer metadata is refreshed every 5 minutes.
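
To make the proposal concrete, here is a minimal sketch of such a grace-period check. It is not existing Hermes code: the class name `TopicGracePeriodGuard`, its methods, and the 6-minute value used in the usage note are assumptions made for illustration. The 5-minute refresh interval mentioned above matches the Kafka client default for `metadata.max.age.ms` (300000 ms), so the grace period should be somewhat longer than whatever value the frontend producers actually use. The sketch keeps state in memory only; with several hermes-management instances the deletion timestamps would presumably have to live in shared storage instead.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical helper (not part of Hermes): remembers when a topic was deleted
// and refuses re-creation of the same name until the grace period has passed.
class TopicGracePeriodGuard {

    private final Map<String, Instant> deletedAt = new ConcurrentHashMap<>();
    private final Duration gracePeriod;
    private final Supplier<Instant> now; // injected time source, so the guard is testable

    TopicGracePeriodGuard(Duration gracePeriod, Supplier<Instant> now) {
        this.gracePeriod = gracePeriod;
        this.now = now;
    }

    // Call this after a topic has been removed.
    public void markDeleted(String topicName) {
        deletedAt.put(topicName, now.get());
    }

    // Returns true if a topic with this name may be created right now.
    public boolean canCreate(String topicName) {
        Instant deleted = deletedAt.get(topicName);
        if (deleted == null) {
            return true; // no recorded deletion for this name
        }
        Duration elapsed = Duration.between(deleted, now.get());
        if (elapsed.compareTo(gracePeriod) >= 0) {
            // Grace period is over: forget the entry and allow creation.
            deletedAt.remove(topicName, deleted);
            return true;
        }
        return false; // still inside the grace period
    }
}
```

In this sketch the topic removal path would call `markDeleted(name)`, and the topic creation path would call `canCreate(name)` first and reject the request with a clear error when it returns `false`. A guard built as `new TopicGracePeriodGuard(Duration.ofMinutes(6), Instant::now)` would block re-creation for six minutes, leaving a one-minute margin over a 5-minute metadata refresh.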