feat(slo): Delete stale summary data #244430

kdelemme · 2025-11-26T22:06:53Z

Summary

This PR introduces a new route for purging stale summary documents.
The API can be used as a bulk action when a list of SLO ids is specified, or as a global action when no list is provided. When no stale threshold is specified, we default to the one configured in the SLO settings. When one is specified, we enforce a value greater than the configured one, except when force is used.
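
The threshold rule described above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: `PurgeRequest`, `resolveStaleThreshold`, the field names, and the use of hours as the unit are all hypothetical, and the sketch assumes a value equal to the configured threshold is accepted.

```typescript
// Hypothetical request shape; names and units (hours) are assumptions for illustration.
interface PurgeRequest {
  staleThresholdInHours?: number; // omitted => fall back to the configured default
  force?: boolean; // true => skip the lower-bound check
}

// Returns the threshold to apply, or throws when a lower value is requested without force.
function resolveStaleThreshold(request: PurgeRequest, configuredInHours: number): number {
  const requested = request.staleThresholdInHours;
  if (requested === undefined) {
    return configuredInHours;
  }
  if (requested < configuredInHours && request.force !== true) {
    throw new Error(
      `Stale threshold must be at least ${configuredInHours}h unless force is set`
    );
  }
  return requested;
}
```

With a configured threshold of 48 hours, an empty request resolves to 48, a request for 72 resolves to 72, and a request for 24 is rejected unless `force` is set.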

The modal loads with the default stale threshold settings applied, and the user must confirm the override when entering a lower value.

The status API returns some information about the task: the number of deleted documents, its completion state, and timing information such as when it started and how long it took. It is currently not in use...
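
As a rough sketch of what such a status payload could look like, the snippet below maps a subset of an Elasticsearch Tasks API response for a delete-by-query task onto a smaller status object. `PurgeStatus` and `toPurgeStatus` are hypothetical names, and whether the route in this PR builds its response this way is an assumption.

```typescript
// Subset of an Elasticsearch Tasks API response for a delete-by-query task.
interface EsTaskResponse {
  completed: boolean;
  task: {
    start_time_in_millis: number;
    running_time_in_nanos: number;
    status: { deleted: number };
  };
}

// Hypothetical shape of what a status route could return.
interface PurgeStatus {
  completed: boolean;
  deleted: number;
  startedAt: string; // ISO 8601 timestamp
  tookMs: number;
}

function toPurgeStatus(res: EsTaskResponse): PurgeStatus {
  return {
    completed: res.completed,
    deleted: res.task.status.deleted,
    startedAt: new Date(res.task.start_time_in_millis).toISOString(),
    // Convert nanoseconds to milliseconds.
    tookMs: Math.round(res.task.running_time_in_nanos / 1e6),
  };
}
```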

I've also cleaned up the purge rollup data actions to remove some duplication, and added some unit and integration tests for the purge instances flow.

elasticmachine · 2025-12-01T18:31:47Z

Pinging @elastic/actionable-obs-team (Team:actionable-obs)

elasticmachine · 2025-12-01T18:32:07Z

Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)

mgiota · 2025-12-02T15:02:53Z

x-pack/solutions/observability/plugins/slo/server/services/purge_instances.ts

+    refresh: false,
+    wait_for_completion: false,
+    conflicts: 'proceed',
+    slices: 'auto',


If there are millions of documents to delete, slices: auto could overwhelm ES. I was looking into throttling and slicing in the official documentation. Have you considered using these optimization techniques?

If there are millions of documents to delete, slices: auto could overwhelm ES

I don't think so. Where do you get this from?
slices: auto will use the best settings based on the number of shards available. We should use it, since trying to set the value ourselves without knowing the customer's shard settings would be more complex.

slices: auto caught my attention. I searched the official documentation to check what it means and what it does, and here's what I found: "Delete by query supports sliced scroll to parallelize the delete process. This can improve efficiency and provide a convenient way to break the request down into smaller parts." I also searched for scroll_size in the codebase and found a few references, for example here, used together with max_docs.

I am fine using it as is until we notice any performance issues. I just wanted to investigate whether there are more performant ways to run the deleteByQuery.
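
For reference, the throttling option raised in this thread could be layered on top of the options from the diff above along these lines. This is only a sketch: `buildPurgeOptions` and `DeleteByQueryOptions` are hypothetical names, while `requests_per_second` is the throttle parameter of the Elasticsearch delete-by-query API. The PR as merged does not set it.

```typescript
// Local type covering only the delete-by-query options discussed in this thread.
interface DeleteByQueryOptions {
  refresh: boolean;
  wait_for_completion: boolean;
  conflicts: 'abort' | 'proceed';
  slices: 'auto' | number;
  requests_per_second?: number; // throttle; left unset means no explicit throttling
}

// Hypothetical helper: same options as the diff, plus an optional throttle.
function buildPurgeOptions(requestsPerSecond?: number): DeleteByQueryOptions {
  const options: DeleteByQueryOptions = {
    refresh: false,
    wait_for_completion: false, // run as a background task
    conflicts: 'proceed', // skip version conflicts instead of aborting
    slices: 'auto', // let Elasticsearch choose the number of slices
  };
  if (requestsPerSecond === undefined) {
    return options;
  }
  if (requestsPerSecond <= 0) {
    throw new Error('requestsPerSecond must be a positive number');
  }
  return { ...options, requests_per_second: requestsPerSecond };
}
```

Keeping `slices: 'auto'` and adding a throttle only if performance issues show up matches the conclusion of the thread.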

dmlemeshko

x-pack/solutions/observability/test/api_integration_deployment_agnostic/services/slo_api.ts changes LGTM

mgiota · 2025-12-02T17:17:14Z

I tried to delete one SLO and here's what I got. When I checked the Ignore purge policy restrictions checkbox or I selected 30 days, I didn't get any error.

After I successfully purged the data (I selected 30 days), I expected the SLO to no longer appear as stale, but it still does. Is that expected? Can we have a tooltip that explains what purge does behind the scenes?

kdelemme · 2025-12-02T18:22:18Z

@mgiota You've tested the "purge rollup data" action, not the "delete stale instance" one. The latter is in the "Actions" menu (big blue button on the top right) in the header of the SLO Management page. I can't take a screenshot right now since I'm in the middle of something in another branch.

mgiota · 2025-12-03T10:04:58Z

@kdelemme Yep, thanks! I figured it out: I clicked the Purge stale instances action and it worked as expected. The stale SLO didn't appear as stale anymore in the SLO overview page, but it is still present in the SLO Management page. Is it expected?

kdelemme · 2025-12-03T14:49:29Z

@mgiota

The stale SLO didn't appear as stale anymore in the SLO overview page, but it is still present in the SLO Management page. Is it expected?

Yes, since the SLO Management page shows only the SLO definitions, regardless of the number of instances and their state (stale or not). When we purge stale instances, we delete all SLO instances that have not been updated for X time, but their SLO definition stays.
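
To make the distinction concrete, here is a small in-memory sketch. The types, field names, and `purgeStaleInstances` are hypothetical and only illustrate the behaviour described above: instances older than the threshold are dropped, while definitions are never touched.

```typescript
// Hypothetical, simplified models of a definition and one of its instances.
interface SloDefinition {
  id: string;
  name: string;
}

interface SloInstance {
  sloId: string;
  instanceId: string;
  summaryUpdatedAt: Date;
}

// Keeps only the instances updated within the threshold; definitions are not an input at all.
function purgeStaleInstances(
  instances: SloInstance[],
  staleThresholdInHours: number,
  now: Date
): SloInstance[] {
  const cutoff = now.getTime() - staleThresholdInHours * 60 * 60 * 1000;
  return instances.filter((instance) => instance.summaryUpdatedAt.getTime() >= cutoff);
}
```

A definition whose instances were all purged still exists, which is why it keeps showing up on the SLO Management page.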

kdelemme added 2 commits November 26, 2025 17:00

rename purge rollup route file

19cc182

Add bulk purge summary route and services

6a1873c

github-actions bot added the author:actionable-obs label Nov 26, 2025

kdelemme added 10 commits November 27, 2025 08:55

change completed when not found

e58b690

Rename confirmation modal bulk purge rollup

9676d75

Add bulk purge summary hook

b7d77f8

Remove duplication of hooks

a382972

refactor purge rollup

cf430d9

Rename to purge instances

1b7e096

Refactor Action types

f32d591

Add basic action menu with purge instances action

764f9d1

Add purge instances modal

c185e52

Add form validation

e98e7ac

kdelemme force-pushed the feat/delete-stale-instances branch from 295175b to e98e7ac Compare December 1, 2025 15:28

kdelemme added 2 commits December 1, 2025 10:40

Add test (cursor)

c3d7c37

Add integration test (cursor)

e612b6c

kdelemme added the release_note:skip, Team:actionable-obs, v9.3.0, and backport:skip labels Dec 1, 2025

kdelemme self-assigned this Dec 1, 2025

kdelemme added the ci:beta-faster-pr-build label Dec 1, 2025

kdelemme marked this pull request as ready for review December 1, 2025 18:31

kdelemme requested review from a team as code owners December 1, 2025 18:31

Merge branch 'main' into feat/delete-stale-instances

113be88

botelastic bot added the Team:obs-ux-management label Dec 1, 2025

kdelemme changed the title ~~feat(slo): Bulk delete stale summary data~~ feat(slo): Delete stale summary data Dec 1, 2025

mgiota self-requested a review December 2, 2025 09:14

mgiota reviewed Dec 2, 2025


dmlemeshko approved these changes Dec 2, 2025


mgiota approved these changes Dec 3, 2025


kdelemme merged commit 26e14c0 into elastic:main Dec 3, 2025
14 checks passed
