Roadmap for Xinfra Monitor
Here are a few things in the roadmap that we plan to work on to make Xinfra Monitor more useful.
What is the availability of new topic data and metadata, and how long does that information take to propagate to every broker in the cluster?
Priority: FY21-Q1
For instance, how long does it take for leadership information of the new partitions to propagate to every broker in the cluster?
Priority: FY21-Q1
For example, how long does it take for the ACL metadata to be communicated/propagated to every broker in the cluster?
Priority: FY21-Q1
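The propagation questions above all reduce to the same measurement: after a change is made, poll each broker until it reports the new metadata and record how long that took. Below is a minimal sketch of that loop, not Xinfra Monitor's implementation. The `broker_has_metadata` callback is a hypothetical hook that the caller would back with a real per-broker metadata query, and the clock and sleep functions are injectable only so the loop can be exercised without a cluster.

```python
import time
from typing import Callable, Dict, Iterable


def measure_propagation(
    brokers: Iterable[str],
    broker_has_metadata: Callable[[str], bool],  # hypothetical per-broker check
    timeout_s: float = 30.0,
    poll_interval_s: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, float]:
    """Return the seconds elapsed until each broker first reported the metadata.

    Brokers that never report it before the timeout are absent from the result.
    """
    start = clock()
    pending = set(brokers)
    seen_after: Dict[str, float] = {}
    while pending and clock() - start < timeout_s:
        # sorted() copies the set, so removing entries inside the loop is safe.
        for broker in sorted(pending):
            if broker_has_metadata(broker):
                seen_after[broker] = clock() - start
                pending.discard(broker)
        if pending:
            sleep(poll_interval_s)
    return seen_after
```

Under these assumptions, the largest value in the result approximates the time to reach every broker, and the fraction of brokers present in the result gives a rough availability figure for the timeout window.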
It is useful for users to be able to view all Kafka-related metrics from one web service in their organization. Graphite is one of the most popular open source solutions for storing metrics and viewing them as time-series graphs. We plan to improve the existing DefaultMetricsReporterService so that users can export Kafka Monitor metrics to Graphite and other metrics storage services of their choice.
This involves third-party libraries and services that LinkedIn does not use or is not closely involved with. If users in the open source community want to maintain this feature with sound documentation and tests, that is okay.
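For context on what a Graphite export involves: Graphite's Carbon daemon accepts samples over a plaintext protocol, one line per sample, in the form `<metric path> <value> <timestamp>`. The sketch below formats and sends such a line. It is an illustration only, not the planned DefaultMetricsReporterService change. The metric path in the usage note is a hypothetical name, and the host and port defaults assume a Carbon plaintext listener on its conventional port 2003.

```python
import socket
import time
from typing import Optional


def format_graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one sample as a newline-terminated Graphite plaintext line."""
    if timestamp is None:
        timestamp = int(time.time())  # Graphite expects Unix epoch seconds
    return f"{path} {value} {timestamp}\n"


def send_to_graphite(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send one formatted line to a Carbon plaintext listener (assumed host and port)."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))
```

For example, `format_graphite_line("xinfra.example.metric", 0.99, 1600000000)` yields `xinfra.example.metric 0.99 1600000000` followed by a newline, which `send_to_graphite` can then deliver.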
Users should be able to schedule custom actions (e.g. broker bounce, broker hard kill) to be executed at regular intervals. This can be used together with other services to make assertions (e.g. no message loss, no message reordering) about Kafka's performance under a variety of scenarios. It can be deployed on your private Kafka cluster to test Kafka's performance and fault tolerance.
This could be implemented by other services (or by Xinfra Monitor). Perhaps it is more applicable to Cruise Control: it is hard to know when it is safe to perform these actions without all the data that Cruise Control has.
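As a rough illustration of the scheduling idea, the sketch below runs a list of caller-supplied actions in round-robin order, one per interval. It is not an existing Xinfra Monitor or Cruise Control feature. The actions are hypothetical hooks (for example, a function that bounces a broker), and the sketch deliberately contains no safety check for whether the cluster can tolerate the action.

```python
import time
from typing import Callable, Sequence


def run_on_schedule(
    actions: Sequence[Callable[[], None]],  # hypothetical hooks, e.g. bounce or hard-kill a broker
    interval_s: float,
    rounds: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run the actions round-robin, one per interval, and return how many ran."""
    if not actions:
        raise ValueError("at least one action is required")
    executed = 0
    for i in range(rounds):
        actions[i % len(actions)]()
        executed += 1
        if i < rounds - 1:  # no need to wait after the final action
            sleep(interval_s)
    return executed
```

A separate service would run its assertions (such as checking for message loss) between rounds. Passing a custom `sleep` makes the loop easy to exercise in tests without waiting.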
Another item of future work is to provide the capability to deploy a Kafka cluster built from Apache Kafka at a user-specified git hash. This allows us to automatically test a range of Kafka commits to catch bugs that may be missed by Apache Kafka's unit tests or system tests.
This could be implemented in external services other than Xinfra Monitor, including Cruise Control. It is difficult to know when it is safe to do these things without all the data that Cruise Control has.
Xinfra Monitor (KMF): https://github.com/linkedin/kafka-monitor/
For inquiries or issues: https://github.com/linkedin/kafka-monitor/issues/new