feat(slo): Delete stale summary data #244430
295175b to
e98e7ac
Compare
Pinging @elastic/actionable-obs-team (Team:actionable-obs)
Pinging @elastic/obs-ux-management-team (Team:obs-ux-management)
refresh: false,
wait_for_completion: false,
conflicts: 'proceed',
slices: 'auto',
If there are millions of documents to delete, slices: auto could overwhelm ES. I was looking into throttling and slicing in the official documentation. Have you considered using these optimization techniques?
> If there are millions of documents to delete, slices: auto could overwhelm ES
I don't think so. Where do you get this from?
slices: auto chooses the number of slices based on the number of shards available. We should use it, since setting the value ourselves without knowing the customer's shard settings would be more complex.
slices: auto caught my attention. I searched the official documentation to check what it means and what it does, and here is what I found: "Delete by query supports sliced scroll to parallelize the delete process. This can improve efficiency and provide a convenient way to break the request down into smaller parts." I also searched for scroll_size in the codebase and found a few references, for example here, used together with max_docs.
I am fine using it as is until we notice any performance issues. I just wanted to investigate whether there are more performant ways to run the deleteByQuery.
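For reference, the throttling and slicing options mentioned in this thread could be combined roughly as in the sketch below. It only builds a plain options object and does not call Elasticsearch. The helper name `buildDeleteByQueryOptions` and the `ThrottleSettings` type are hypothetical and not part of this PR; `requests_per_second`, `scroll_size` and `max_docs` are delete by query parameters, and the four base values are the ones from the diff above.

```typescript
// Hypothetical helper, not the PR's code: builds delete-by-query options,
// optionally adding throttling on top of the values used in the diff.
interface DeleteByQueryOptions {
  refresh: boolean;
  wait_for_completion: boolean;
  conflicts: 'abort' | 'proceed';
  slices: 'auto' | number;
  requests_per_second?: number; // throttle; left out means no explicit throttle is sent
  scroll_size?: number; // number of documents per scroll batch
  max_docs?: number; // upper bound on documents processed
}

// Hypothetical input type for the optional tuning knobs.
interface ThrottleSettings {
  requestsPerSecond?: number;
  scrollSize?: number;
  maxDocs?: number;
}

function buildDeleteByQueryOptions(throttle: ThrottleSettings = {}): DeleteByQueryOptions {
  // Base values copied from the diff under review.
  const options: DeleteByQueryOptions = {
    refresh: false,
    wait_for_completion: false,
    conflicts: 'proceed',
    slices: 'auto',
  };
  // Only set the tuning parameters the caller explicitly asked for.
  if (throttle.requestsPerSecond !== undefined) {
    options.requests_per_second = throttle.requestsPerSecond;
  }
  if (throttle.scrollSize !== undefined) {
    options.scroll_size = throttle.scrollSize;
  }
  if (throttle.maxDocs !== undefined) {
    options.max_docs = throttle.maxDocs;
  }
  return options;
}
```

Calling `buildDeleteByQueryOptions()` with no argument yields exactly the four options from the diff, so throttling stays opt-in if performance issues show up later.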
dmlemeshko
left a comment
x-pack/solutions/observability/test/api_integration_deployment_agnostic/services/slo_api.ts changes LGTM
@mgiota You've tested the "purge rollup data" action, not the "delete stale instance" one. The latter is in the "Actions" menu (big blue button on the top right) in the header of the SLO Management page. I can't take a screenshot right now since I'm in the middle of something on another branch.
@kdelemme Yep thanks! I figured it out, I clicked the
Yes, since the SLO Management page shows only the SLO definitions, regardless of the number of instances and their state (stale or not). When we purge stale instances, we delete all SLO instances that have not been updated for X time, but their SLO definition stays.


Fix #210049
Summary
This PR introduces a new route for purging stale summary documents.
The API is flexible: it can be used as a bulk action when a list of SLO ids is specified, or as a global action when no list is provided. When no stale threshold is specified, we default to the one configured in the SLO settings; when one is specified, we enforce a value greater than the configured one, except when force is used.
The modal loads with the default stale threshold settings applied, and the user must confirm the override when entering a lower value.
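The threshold rules described above can be sketched as follows. This is a minimal illustration, not the route's actual implementation: the function name `resolveStaleThreshold`, its parameter names and the result type are hypothetical, the unit of the threshold is left unspecified, and treating a value equal to the configured threshold as valid is an assumption.

```typescript
// Hypothetical sketch of the threshold rules; names and result shape are assumptions.
interface ThresholdInput {
  configured: number; // stale threshold from the SLO settings
  requested?: number; // threshold passed by the caller, if any
  force?: boolean; // explicit override confirmed by the user
}

type ThresholdResult = { ok: true; threshold: number } | { ok: false; reason: string };

function resolveStaleThreshold({
  configured,
  requested,
  force = false,
}: ThresholdInput): ThresholdResult {
  // No threshold specified: fall back to the configured setting.
  if (requested === undefined) {
    return { ok: true, threshold: configured };
  }
  // A lower threshold is rejected unless the caller forces the override.
  // Assumption: a value equal to the configured threshold is accepted.
  if (requested < configured && !force) {
    return { ok: false, reason: 'requested threshold is lower than the configured one' };
  }
  return { ok: true, threshold: requested };
}
```

In the modal flow, confirming the override would correspond to sending `force: true` along with the lower value.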
The status API returns some information about the task, such as the number of deleted documents, its completion state, and timing information (how long it took and when it started). It is currently not in use...
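As an illustration of the kind of information listed above, the sketch below maps a subset of an Elasticsearch task response to a small status object. The `PurgeStatus` shape and the `toPurgeStatus` name are hypothetical and may not match what the route returns; the input fields (`completed`, `task.status.deleted`, `task.start_time_in_millis`, `task.running_time_in_nanos`) are assumed to follow the Elasticsearch task API response for a delete by query task.

```typescript
// Subset of the Elasticsearch task API response that this sketch relies on (assumed).
interface TaskResponse {
  completed: boolean;
  task: {
    start_time_in_millis: number;
    running_time_in_nanos: number;
    status?: { deleted?: number };
  };
}

// Hypothetical status shape; the actual route may expose different fields.
interface PurgeStatus {
  completed: boolean;
  deleted: number;
  startedAt: string; // ISO 8601 timestamp
  tookMs: number; // running time in milliseconds
}

function toPurgeStatus(res: TaskResponse): PurgeStatus {
  return {
    completed: res.completed,
    // The deleted count may be missing before the task reports progress.
    deleted: res.task.status?.deleted ?? 0,
    startedAt: new Date(res.task.start_time_in_millis).toISOString(),
    // Convert nanoseconds to milliseconds.
    tookMs: Math.round(res.task.running_time_in_nanos / 1e6),
  };
}
```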
I've also cleaned up the purge rollup data actions to remove some duplication, and added unit and integration tests for the purge instances flow.