[Change Proposal] Allow for distributing data views via integration packages

**Summary**
In large clusters with many metric/log data streams, queries can become slow because they often scan indices that are unrelated to the active integration. Integrations should be able to ship managed Data Views (formerly index patterns) scoped to their datasets so visualizations and searches use those by default, reducing the set of queried indices and improving performance.

**Motivation**
- Many integrations (e.g. Kubernetes) ship dashboards and visualizations that currently query broad index patterns such as `metrics-*`.
- In a cluster with dozens of other integrations, a query against `metrics-*` makes Elasticsearch consider many unrelated indices. Even with a filter on `data_stream.dataset` (a `constant_keyword` field), the engine must still visit each matching shard to rule it out, so a high index/shard count makes queries expensive.
- If an integration can provide a managed Data View scoped to its own dataset names (for example `metrics-kubernetes*`), visualizations and saved searches could target that narrower Data View by default. This will limit queries to only the relevant data streams/indices and reduce query cost and latency.

**Proposal**
- Allow Fleet/Integration packages to include managed Data Views under the package's `kibana/saved_objects` assets (or a dedicated `data_views/` package location).
- When a package is installed, Kibana should register those Data Views as package-managed and make them available to the package’s dashboards, visualizations and saved searches.
- Visualizations and dashboards shipped with the package should reference the package-provided Data View by default.
- Provide clear semantics for package-managed Data Views:
  - They should be identifiable as package-managed so users understand edit restrictions.
  - Packages should be able to upgrade/replace these Data Views during package upgrades.
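
To make the proposal concrete, here is a minimal sketch of what a package-provided Data View asset could look like, assuming the existing `index-pattern` saved object format is reused. The helper name `build_managed_data_view`, the id scheme and the use of a top-level `managed` flag to mark the object as package-managed are assumptions for illustration, not an agreed format.

```python
import json


def build_managed_data_view(package: str, data_type: str = "metrics") -> dict:
    """Build a saved object dict for a Data View scoped to one package.

    Assumptions (not confirmed by the proposal): the id scheme and the
    top-level `managed` flag are illustrative only.
    """
    pattern = f"{data_type}-{package}*"
    return {
        "id": f"{package}-{data_type}",  # hypothetical id scheme
        "type": "index-pattern",         # saved object type used for Data Views
        "managed": True,                 # assumed marker for package-managed objects
        "attributes": {
            "title": pattern,            # the index pattern the Data View resolves
            "timeFieldName": "@timestamp",
        },
        "references": [],
    }


data_view = build_managed_data_view("kubernetes")
print(json.dumps(data_view, indent=2))
```

Keeping the id stable across package versions would let an upgrade replace the object in place without breaking references from dashboards.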

**Example**
- Kubernetes integration ships dashboards that currently query `metrics-*`.
- The package provides a managed Data View `metrics-kubernetes*` that only matches Kubernetes metric data streams.
- After installation, package visualizations use `metrics-kubernetes*` so searches only touch indices belonging to Kubernetes metrics.
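
The effect of the narrower pattern can be illustrated with a small simulation. This is only a sketch: the data stream names are made up for the example, and `fnmatchcase` approximates how Elasticsearch resolves wildcard patterns rather than reproducing it.

```python
from fnmatch import fnmatchcase

# Hypothetical data streams in a cluster with several integrations installed.
DATA_STREAMS = [
    "metrics-kubernetes.pod-default",
    "metrics-kubernetes.node-default",
    "metrics-system.cpu-default",
    "metrics-nginx.stubstatus-default",
    "metrics-mysql.status-default",
    "logs-kubernetes.container_logs-default",
]


def matching_streams(pattern: str, streams: list[str]) -> list[str]:
    """Return the data streams whose name matches a wildcard pattern."""
    return [name for name in streams if fnmatchcase(name, pattern)]


broad = matching_streams("metrics-*", DATA_STREAMS)
narrow = matching_streams("metrics-kubernetes*", DATA_STREAMS)

print(f"metrics-* matches {len(broad)} data streams")
print(f"metrics-kubernetes* matches {len(narrow)} data streams")
```

In this toy cluster the broad pattern resolves to five data streams and the scoped one to two. In a real cluster each data stream also has multiple backing indices and shards, so the gap grows accordingly.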

**Benefits**
- Reduced query fan-out across unrelated indices and shards.
- Lower query latency and resource usage in clusters with high index counts.

**Open questions / discussion points**
- Best package location/format for bundling Data Views (existing `kibana/saved_objects` vs. new package folder).
- Migration/upgrade semantics when a package changes its Data View (aliases vs replacing objects).
- How to present managed Data Views in the UI.
- Edge cases: multi-dataset packages, packages that must span multiple index name patterns, and cross-package references. Is there a need for multiple Data Views per package?

**Implementation notes (suggested)**
- Reuse the existing saved object format for Data Views and register them as package-managed on install.
- Ensure saved visualizations/dashboards reference the Data View saved object by id, not by pattern text.
- Add validation tooling in elastic-package to help authors create correct Data Views and references.
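
As a rough idea of what such validation could check, the sketch below scans a package's saved objects and flags any that reference a Data View other than the ones the package ships. It assumes saved objects carry a `references` list with `type` and `id` entries, as exported Kibana saved objects do. The function name and the sample objects are hypothetical, and this is not how elastic-package works today.

```python
def find_foreign_data_view_refs(saved_objects: list[dict], allowed_ids: set[str]) -> list[str]:
    """Return ids of saved objects that reference a Data View not shipped by the package."""
    offenders = []
    for obj in saved_objects:
        for ref in obj.get("references", []):
            # Data Views are referenced with the saved object type `index-pattern`.
            if ref.get("type") == "index-pattern" and ref.get("id") not in allowed_ids:
                offenders.append(obj["id"])
                break  # one bad reference is enough to flag the object
    return offenders


# Hypothetical package assets for illustration.
assets = [
    {
        "id": "k8s-pod-cpu",
        "type": "visualization",
        "references": [{"name": "idx", "type": "index-pattern", "id": "kubernetes-metrics"}],
    },
    {
        "id": "k8s-node-memory",
        "type": "visualization",
        "references": [{"name": "idx", "type": "index-pattern", "id": "metrics-*"}],
    },
    {"id": "k8s-overview", "type": "dashboard", "references": []},
]

offenders = find_foreign_data_view_refs(assets, allowed_ids={"kubernetes-metrics"})
print(offenders)
```

Here only `k8s-node-memory` is flagged, because it still points at the broad `metrics-*` Data View instead of the package-provided one.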

Please feel free to discuss and correct me if I got anything wrong.


[Change Proposal] Allow for distributing data views via integration packages #998
