Open
Labels: discuss (Issue needs discussion)
Description
Summary
In large clusters with many metric/log data streams, queries can become slow because they often scan indices that are unrelated to the active integration. Integrations should be able to ship managed Data Views (formerly index patterns) scoped to their datasets so visualizations and searches use those by default, reducing the set of queried indices and improving performance.
Motivation
- Many integrations (e.g. Kubernetes) ship dashboards and visualizations that currently query broad index patterns such as `metrics-*`.
- In a cluster with dozens of other integrations, queries against `metrics-*` will cause Elasticsearch to consider many unrelated indices. Even when using `data_stream.dataset` filtering (a `constant_keyword`), the engine must still visit shards to rule out matches, so a high index/shard count leads to expensive queries.
- If an integration can provide a managed Data View scoped to its own dataset names (for example `metrics-kubernetes*`), visualizations and saved searches could target that narrower Data View by default. This will limit queries to only the relevant data streams/indices and reduce query cost and latency.
Proposal
- Allow Fleet/Integration packages to include managed Data Views under the package's `kibana/saved_objects` assets (or a dedicated `data_views/` package location).
- When a package is installed, Kibana should register those Data Views as package-managed and make them available to the package's dashboards, visualizations and saved searches.
- Visualizations and dashboards shipped with the package should reference the package-provided Data View by default.
- Provide clear semantics for package-managed Data Views:
  - They should be identifiable as package-managed so users understand edit restrictions.
  - Packages should be able to upgrade/replace these Data Views during package upgrades.
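To make the proposal a bit more concrete, here is a rough sketch of what a package-provided Data View could look like as a saved object. Data Views are stored under the saved object type `index-pattern`, with the pattern in `attributes.title`. Everything else here is an assumption for illustration: the helper name `build_managed_data_view`, the id scheme, and using a top-level `managed` flag to mark the object as package-managed.

```python
import json


def build_managed_data_view(view_id, pattern, time_field="@timestamp"):
    """Build a Data View saved object scoped to one integration (sketch only)."""
    return {
        "id": view_id,            # hypothetical stable id chosen by the package
        "type": "index-pattern",  # saved object type used for Data Views
        "managed": True,          # assumption: flag marking it as package-managed
        "attributes": {
            "title": pattern,     # the index pattern the Data View resolves to
            "timeFieldName": time_field,
        },
        "references": [],
    }


kubernetes_view = build_managed_data_view(
    "kubernetes-metrics-data-view",  # hypothetical id
    "metrics-kubernetes*",
)
print(json.dumps(kubernetes_view, indent=2))
```

A stable id matters more than the exact shape: if the id stays the same across package versions, an upgrade can replace the object without breaking the dashboards that reference it.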
Example
- Kubernetes integration ships dashboards that currently query `metrics-*`.
- The package provides a managed Data View `metrics-kubernetes*` that only matches Kubernetes metric data streams.
- After installation, package visualizations use `metrics-kubernetes*` so searches only touch indices belonging to Kubernetes metrics.
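The difference in fan-out can be illustrated with a small sketch. The data stream names below are made up, and Python's `fnmatch` is only a stand-in for how Elasticsearch resolves wildcard patterns, so treat this as an approximation of the idea rather than of the actual resolution logic.

```python
from fnmatch import fnmatchcase

# Hypothetical data streams in a cluster running several integrations.
DATA_STREAMS = [
    "metrics-kubernetes.pod-default",
    "metrics-kubernetes.node-default",
    "metrics-system.cpu-default",
    "metrics-nginx.stubstatus-default",
    "metrics-mysql.status-default",
    "logs-kubernetes.container_logs-default",
]


def matching_streams(pattern, names):
    """Return the names selected by a wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


broad = matching_streams("metrics-*", DATA_STREAMS)
narrow = matching_streams("metrics-kubernetes*", DATA_STREAMS)

print(f"metrics-* matches {len(broad)} data streams")
print(f"metrics-kubernetes* matches {len(narrow)} data streams")
```

With the broad pattern, every metrics data stream in the cluster is a candidate for each query. With the scoped pattern, only the Kubernetes ones are.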
Benefits
- Reduced query fan-out across unrelated indices and shards.
- Lower query latency and resource usage in clusters with high index counts.
Open questions / discussion points
- Best package location/format for bundling Data Views (existing `kibana/saved_objects` vs. new package folder).
- Migration/upgrade semantics when a package changes its Data View (aliases vs replacing objects).
- How to present managed Data Views in the UI.
- Edge cases: multi-dataset packages, packages that must span multiple index name patterns, and cross-package references. Need for multiple Data Views per package?
Implementation notes (suggested)
- Reuse the existing saved object format for Data Views and register them as package-managed on install.
- Ensure saved visualizations/dashboards reference the Data View saved object by id, not by pattern text.
- Add validation tooling in elastic-package to help authors create correct Data Views and references.
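As a sketch of the kind of check such validation tooling could run: saved objects point at Data Views through their `references` array (entries with `type: "index-pattern"` and an `id`), so a validator can flag any dashboard or visualization whose Data View reference is not one the package ships. This is not elastic-package's actual API; the function name and the sample objects are hypothetical.

```python
def find_foreign_data_view_refs(saved_objects, package_data_view_ids):
    """Return (object id, data view id) pairs that point outside the package."""
    problems = []
    for obj in saved_objects:
        if obj.get("type") == "index-pattern":
            continue  # Data Views themselves are not checked here
        for ref in obj.get("references", []):
            if ref.get("type") != "index-pattern":
                continue
            if ref.get("id") not in package_data_view_ids:
                problems.append((obj.get("id"), ref.get("id")))
    return problems


# Hypothetical package assets.
assets = [
    {
        "id": "kubernetes-overview",
        "type": "dashboard",
        "references": [
            {"name": "dv", "type": "index-pattern", "id": "kubernetes-metrics-data-view"}
        ],
    },
    {
        "id": "kubernetes-pod-cpu",
        "type": "visualization",
        "references": [{"name": "dv", "type": "index-pattern", "id": "metrics-*"}],
    },
]

for object_id, data_view_id in find_foreign_data_view_refs(
    assets, {"kubernetes-metrics-data-view"}
):
    print(f"{object_id} references non-package Data View {data_view_id}")
```

Here the second visualization would be reported, because it still points at the broad `metrics-*` Data View instead of the one shipped with the package.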
Please feel free to discuss, and correct me if I got anything wrong.