
Support keeping only jobs in database (while removing all logs and test results sooner)#6603

Merged
mergify[bot] merged 20 commits into os-autoinst:master from Martchus:jobs-db-only
Jul 29, 2025

Conversation

@Martchus (Contributor)

See https://progress.opensuse.org/issues/179263 and the individual commit messages.

@okurz (Member) left a comment


I consider the most important change actually to be in documentation. Please see http://open.qa/docs/#basic_cleanup

@okurz (Member) left a comment


I wonder if we actually need to distinguish "storing jobs" and "storing results" on so many levels. I know that even the existing distinction between "logs" and "results" is not well understood by many users. Instead, can we just review the cleanup algorithm and only clean actual job entries as a last resort, after all other relevant job results have been cleaned in a job group? Like first delete all results, and only if still more space is needed remove the job entry (in the database) itself?

@Martchus (Author)

I actually did this "documentation-driven", implementing the help popover changes (which ended up in "Extend UI for configuring job storage duration") as one of the first changes.

I will of course also extend the real documentation. I just have to think about how to handle existing instances where e.g. result_storage_duration is set in the config file to something higher than the new default for job_storage_duration. Hence I also added the "not-ready" label.

You can still review the other parts of the PR. Considering your 2nd comment I will not put any more effort into this until we have clarified what we actually want. (Although I probably won't have enough time within the spike solution for a 2nd attempt, so I'd just close this PR if this approach is not wanted.)


I wonder if we actually need to distinguish "storing jobs" and "storing results" on so many levels. I know that already the distinction between "logs" and "results" is not well understood for many users.

I guess users are only exposed to this at the UI-level (and perhaps API-level but this is really just the same again from a user perspective). So I don't see how dealing with this on many levels internally will confuse users.

Note that I have already added help for users in the UI in form of changes to the help popover. Users can also still keep the defaults for this new setting if they don't want to be bothered with it. The new setting is also inline with the existing ones and doesn't bring much more to the table in terms of complexity.

If you think our users cannot take another form field I can remove the UI changes so this becomes a hidden feature. I wouldn't do that because I think our users can take another form field and we should be as transparent about the cleanup as possible to help users understand better how it actually works. (Maybe that's just my preference as I don't like dumbed down UIs.)

Instead can we just review the cleanup algorithm and only clean actual job entries as last resort after all other relevant job results have been cleaned in a job group? Like first delete all results and if still more space is needed only then remove the job entry (in the database) itself?

This makes no sense. The results are usually not on the same storage as the database. So if more storage is needed on the disk where results are stored, deleting the job entry from the database has no effect. Even if both were on the same disk, deleting rows from the database would most likely not have an immediate effect. And even if it had an effect at some point, it would probably be very little compared to what we gain by deleting results.

I think this feature was intended with the opposite thought in mind: The database entry of a job doesn't need much disk space but results potentially do. So we can store many more database entries than results. It seems unnecessary to delete jobs from the database (and thus no longer have them on dashboard pages) when we really only run out of result storage. This change helps preserve jobs in the database longer. It is not meant to delete data more aggressively.

@Martchus (Author)

Oh, and I would really like to avoid repeating the approach we took for archiving. There we archive jobs which are "preserved because they are important". While it works for archiving, I must say that adding this kind of "condition" does add more complexity and therefore makes things harder to understand. Adding just another configurable duration, on the other hand, is just more of the same and doesn't add complexity, and thus doesn't make things harder to understand.

@Martchus Martchus mentioned this pull request Jul 23, 2025
@okurz (Member)

okurz commented Jul 23, 2025

I think this feature was intended with the opposite thought in mind: The database entry of a job doesn't need much disk space but results potentially do. So we can store many more database entries than results. It seems unnecessary to delete jobs from the database (and thus no longer have them on dashboard pages) when we really only run out of result storage. This change helps preserve jobs in the database longer. It is not meant to delete data more aggressively.

Exactly. But I think a "job group owner" neither understands this distinction nor would be able to derive what it could mean for their job group. Obviously, with the configurations for "keep results" and "keep jobs" next to each other, I would assume that users compare those values and try to select something sensible, but they can't know what effect on available database storage that actually has.

Oh, and I would really like to avoid repeating the approach we took for archiving. There we archive jobs which are "preserved because they are important". While it works for archiving, I must say that adding this kind of "condition" does add more complexity and therefore makes things harder to understand. Adding just another configurable duration, on the other hand, is just more of the same and doesn't add complexity, and thus doesn't make things harder to understand.

I think "archiving" is a good example of leaving the decision to the administrators of an instance, who are also the ones who need to ensure that storage space is not depleted. How about a simpler approach that still leaves the choice to the administrator: Similar to our implementation for "space aware cleanup", only clean jobs from the database when it's actually necessary. As you explained, it can be expected that the database resides on different storage than the results, so how about two config settings, for the number of jobs as well as a threshold of free storage to keep, configured in the ini file with sane defaults, and only remove jobs when either of those two limits is reached?

@mergify (bot)

mergify bot commented Jul 23, 2025

This pull request is now in conflicts. Could you fix it? 🙏

@Martchus (Author)

Martchus commented Jul 23, 2025

… but they can't know what effect on available database storage that actually has.

Users and admins would both need to play with these settings to know the effect on available database storage.

I think the current approach is the following (and that would not change with the existing retentions plus the new one): Users put in values that make sense for them without knowing whether it is too much. They might also just leave the defaults we as admins put in there if they don't really care about this. If we find that retentions are too long, we need to reduce them from our side in coordination with users. So these settings aren't just for users but also for us as admins. And with the newly introduced setting we would at least have the chance of preserving jobs longer in the database when we have to reduce the storage duration for results. I think the current approach isn't that bad and this change makes it better.

So ok, this means that this setting is likely more for us as admins to make it easier in case we need to reduce retentions (e.g. after receiving a filesystem alert). Users will probably just stick to the defaults for the newly introduced settings. I don't think our users will be overwhelmed by the new settings (as they would probably just ignore them), but I could hide/collapse these settings by default.

By the way, I think our documentation in the popover is very bad. So maybe that's also just the reason why users are confused. I improved it a little bit as part of this change. I hope it is now clear what the difference between "logs" and "results" (and now also "database entry") actually is. What's still missing is a clarification of what "important" actually means.

I think "archiving" is a good example of leaving the decision to the administrators of an instance, who are also the ones who need to ensure that storage space is not depleted.

This is not true. The archiving feature doesn't allow anybody to make a real decision on when to archive jobs, except for setting a "binary" yes/no flag on the admin side to enable/disable the feature at all. If anybody can still influence this in a more nuanced way, then it is the users and admins via the other retentions, but that is of course only very indirect and also a bit opaque (as not everyone reads the documentation and the behavior is generally more complex, as mentioned before).

Similar to our implementation for "space aware cleanup" only clean jobs from the database when it's actually necessary.

This makes the whole thing much more complicated because for this we would need to define "necessary". Note that we are not just talking about disk space being used here but also about operations taking longer in the presence of many jobs. And even the space aspect alone would not be super easy to implement, as we would probably have to ask PostgreSQL about the figures (a plain df is probably misleading). And then we would probably also have to trigger vacuuming for this to have an actual effect.

Additionally, the database stores much more than just jobs, so it would make much more sense to decide case by case what to store less of if we receive an alert; e.g. we might decide to store fewer audit events instead. (By the way, we completely lack cleanup of scheduled products, which I would find much more important than making cleanup of jobs "super intelligent/automatic". Limiting scheduled products would also help us filter/sort this table faster (and fast enough in the future) without any additional indexes. These kinds of cleanups are really not only about storage capacity.)

As you explained, it can be expected that the database resides on different storage than the results, so how about two config settings, for the number of jobs as well as a threshold of free storage to keep, configured in the ini file with sane defaults, and only remove jobs when either of those two limits is reached?

As mentioned, it won't be as straightforward for the database. In case of the space-aware cleanup we really just call df after deleting a few more files from disk and see whether we can abort early. This is not going to work for deleting database rows. Note that we could still look into a space-aware cleanup in the future if we really want to spend the effort to make this actually work despite the mentioned caveats. This can co-exist with configurable retentions.
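The df-based early-abort pattern described here can be sketched roughly as follows. This is a minimal illustration with hypothetical names, not openQA's actual (Perl) implementation; the `delete_batch` callback stands in for whatever actually removes result files:

```python
import shutil

def space_aware_cleanup(path, min_free_bytes, result_batches, delete_batch):
    """Delete batches of result files until enough space is free.

    After each deletion a cheap `df`-style check (shutil.disk_usage)
    decides whether the cleanup can abort early. `delete_batch` is
    passed in so the storage side can be faked in tests.
    """
    for batch in result_batches:
        if shutil.disk_usage(path).free >= min_free_bytes:
            return True  # enough free space already, abort early
        delete_batch(batch)
    return shutil.disk_usage(path).free >= min_free_bytes
```

As the thread points out, this only works for files because `df` reflects deletions more or less immediately; deleting database rows would not move the free-space figure in the same way.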


Considering I'm not really making things more complex here (by just adding more of the same), I wouldn't have expected this change to be so problematic. I guess if I changed it so these settings are collapsed in the UI by default, then this really shouldn't bother users but would still give admins more control. And not even the function to delete only results (while keeping the jobs in the database) is new, because this function is already used by our space-aware cleanup.

@okurz (Member)

okurz commented Jul 23, 2025

I think the current approach is the following (and that would not change with the existing retentions plus the new one): Users put in values that make sense for them without knowing whether it is too much. They might also just leave the defaults we as admins put in there if they don't really care about this. If we find that retentions are too long, we need to reduce them from our side in coordination with users. So these settings aren't just for users but also for us as admins.

Yes, that is the current approach and I find it suboptimal as it is a constant back and forth between users and admins. I would like to find a way where users and admins don't need to argue over the same values but all use different parameters that they care about. Think of a scenario where we would have a ten times bigger openQA instance, or compare with GitHub. There are no settings in GitHub projects that you and I set where GitHub admins would sometimes contact us and say "hey, we need to lower those values you set".

I don't think our users will be overwhelmed by the new settings (as they would probably just ignore them) but I could hide/collapse these settings by default.

Please take a look into recent problems we had with overuse of results and logs and related discussions with job group owners in the following SUSE internal message threads for context:

What those threads should show is that job group owners in multiple cases did not understand the meaning of results, logs, important or not important. And they chose values like 180/180/180/180, so all the same.

By the way, I think our documentation in the popover is very bad.

That might be true and I like your improvements.

What's still missing is a clarification on what "important" actually means.

Ok. So as this PR is for a spike solution, how about doing the "important" clarification first and then thinking about how to do "keeping only jobs" in a better way?

And not even the function to delete only results (but keeping the jobs in the database) is new because this function is already used by our space aware cleanup.

Yes. I am aware. That's why I created the ticket assuming that we have many building blocks in place already. Your changes to the help text are also good and I like those. But I am still concerned about asking users to make choices which would be more suitable for admins.

Considering I'm not making things really more complex here (by just adding more of the same) I wouldn't have expected that this change is so problematic.

We could simply leave out all job group attributes and UI changes for now and only keep the database rows and default values as they can still be set over the config and should have a good default. Then we can collect experience, improve the definition of "important" and then come back to deciding about which UI integration is useful. If after reading all of the above you still think that openQA would scale well - "think ten times bigger" - with this current implementation including the UI integration then I will approve.

@Martchus (Author)

Yes, that is the current approach and I find it suboptimal as it is a constant back and forth between users and admins. I would like to find a way where users and admins don't need to argue over the same values but all use different parameters that they care about.

Ok - although I don't see how adding another parameter makes things worse now.

Think of a scenario where we would have a ten times bigger openQA instance, or compare with GitHub. There are no settings in GitHub projects that you and I set where GitHub admins would sometimes contact us and say "hey, we need to lower those values you set".

For 10x bigger we would probably need a different database setup and probably a different setup overall. Having everything served by a single host probably won't cut it anymore. I think GitHub and other services can grow in such an unlimited way because they can just add more servers. I don't think we should make very advanced/big/expensive setups like these the next step for openQA, though. We can still help users/admins with simple setups and more limited resources.

By the way, I was recently curious how much disk space one has on GitHub as I upload assets for my releases there. It turns out GitHub actually does ask their users to stay within limits to avoid performance problems.

And another by the way: many other big platforms directly confront users with limits, simply to ask them to pay at some point. I'm not sure whether big platforms are generally a good comparison, but this just shows that users often are confronted with limits. So it is not like there's "a perfect world out there" and we need to reach the same level - as you seem to want me to believe.

Please take a look into recent problems we had with overuse …

Ok, I'm not reading these different threads right now. However, I think I know some of them anyway. And that is also why I propose to improve the help popover here. (And because I simply couldn't add anything to the existing popover without breaking its currently terrible structure.)

Ok. So as this PR is for a spike solution, how about doing the "important" clarification first and then thinking about how to do "keeping only jobs" in a better way?

Ok, I can create a separate PR for the UI improvements first. It would be easier and more straightforward if I were able to base further changes on this PR, though.

Yes. I am aware. That's why I created the ticket assuming that we have many building blocks in place already.

But considering we already delete only results first as part of the space-aware cleanup, and you are against the additions I am proposing here, I'm not really sure what you are asking for. Some "magic solution" to make things better, probably. Sorry, but all your suggestions so far seem impractical and lead much further than anyone reading the ticket you've created could have guessed. (At least we didn't guess that when moving the ticket into the backlog. At that point @kalikiana and @perlpunk and I actually briefly talked about this.)

But I am still concerned about asking users to make choices which would be more suitable for admins.

I think collapsing these settings into an advanced/admin section will be good enough. Of course it isn't ideal that all these settings are mixed on a single page even though different user groups probably care about them to varying degrees. There is probably no sharp line that can be drawn between admins and users in our case, so it also kind of makes sense. (And it also makes sense because distributing these settings would make it much harder to find them in general.)

If after reading all of the above you still think that openQA would scale well - "think ten times bigger" - with this current implementation including the UI integration then I will approve.

I think openQA will not scale 10x bigger regardless of this particular change. However, I don't think this change will get in the way of scalability. As I mentioned before, this is really not adding anything new. Just another setting on a level where we already have very similar settings.

We could simply leave out all job group attributes and UI changes for now and only keep the database rows and default values as they can still be set over the config and should have a good default. Then we can collect experience, improve the definition of "important" and then come back to deciding about which UI integration is useful.

Having to change this in the database manually (and having to use the database just to check the current values) seems very cumbersome. Would it be ok for you if I collapse these two new settings as mentioned before? If not, would it be ok for you if I hide these settings by default and show them only when a certain query parameter is specified? If the answer to either of those questions is yes, then I would implement that and also additional improvements to the help popover on top of that (because it is easier).

@okurz (Member)

okurz commented Jul 24, 2025

Would it be ok for you if I collapse these two new settings as mentioned before?

Yes, ok. After you provided your counterpoints on all previous statements and still think it's good to bring in the UI changes, I can also accept the current state of this PR as-is without additional folding, considering that by default all the job group properties are anyway already "folded away".

If not, would it be ok for you if I hide these settings by default and show them only when a certain query parameter is specified?

Not necessary, see above

If the answer to either of those questions is yes, then I would implement that and also additional improvements to the help popover on top of that (because it is easier).

Fine with the order of implementing that. As soon as conflicts and CI checks are resolved, I will approve the PR without needing further changes to the UI.

@Martchus (Author)

Yes, ok. After you provided your counterpoints on all previous statements and still think it's good to bring in the UI changes, I can also accept the current state of this PR as-is without additional folding, considering that by default all the job group properties are anyway already "folded away".

That sounds good, but maybe I will look into additional folding because that would still make sense and I still have some time in the timebox for it. (If the UI just becomes more bulky/complicated due to the folding I will leave it out, though.)

Fine with the order of implementing that. As soon as conflicts and CI checks are resolved, I will approve the PR without needing further changes to the UI.

Thanks. And if we can come up with more concrete ideas (that seem doable with reasonable effort/complexity) I think we can still follow-up on them.

I will still have to add one more change. So far, if one puts in a result_storage_duration that is longer than the default for the new job_storage_duration, results will now be stored for a shorter time. So I have to make it so that job_storage_duration defaults to at least result_storage_duration (probably a small change).
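The defaulting rule described here could be sketched like this. This is a hedged illustration, not the actual openQA code: the function name and the 365-day default are assumptions, and the 0-means-forever handling follows the retention semantics discussed elsewhere in this thread:

```python
def default_job_storage_duration(result_storage_duration, default_job_days=365):
    """Pick a default for job_storage_duration that never undercuts an
    explicitly configured result_storage_duration.

    A duration of 0 means "keep forever" and therefore must win over
    any finite default.
    """
    if result_storage_duration == 0:
        return 0  # results kept forever, so keep job entries forever too
    return max(default_job_days, result_storage_duration)
```

So a configured `result_storage_duration = 400` would yield a job storage duration of 400 days rather than silently falling back to 365.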

@github-actions (bot)

Great PR! Please pay attention to the following items before merging:

Files matching docs/*.asciidoc:

This is an automatically generated QA checklist based on modified files.

@Martchus (Author)

I pushed a new version. This should have everything except the folding on the UI as I haven't had the time to look into it yet. (Improving the documentation took a little bit longer than expected and we also have these npm issues to deal with.)

@Martchus (Author)

I restructured the config page a little bit:

[screenshot_20250725_120102: screenshot of the restructured config page]

Most importantly, the description is now the 2nd field instead of sitting at the end where it was easy to overlook. The new settings are collapsed by default. I replaced the verbose sentence containing the link to the documentation with a button, so no additional space is used. I also refactored the code so group_property_editor.html.ep doesn't become too big.

By the way, the lengthy note "Results and logs might be deleted earlier than configured …" at the end was there before and is about the space aware cleanup we already have. (It is only shown if the space aware cleanup is enabled.)

@codecov (bot)

codecov bot commented Jul 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.12%. Comparing base (8437ee2) to head (3842f76).
⚠️ Report is 23 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6603   +/-   ##
=======================================
  Coverage   99.12%   99.12%           
=======================================
  Files         400      400           
  Lines       40885    40940   +55     
=======================================
+ Hits        40526    40581   +55     
  Misses        359      359           


@okurz (Member)

okurz commented Jul 25, 2025

I restructured the config page a little bit

Very nice. The only suggestion I have is to change "Toggle advanced cleanup settings" to "Show advanced cleanup settings". "Toggle" sounds like pressing it would actually change settings, which it doesn't, right? It just shows the settings.

KEEP_IMPORTANT_LOGS_IN_DAYS => 120,
KEEP_RESULTS_IN_DAYS => 365,
KEEP_IMPORTANT_RESULTS_IN_DAYS => 0,
KEEP_JOBS_IN_DAYS => 365,
(Member review comment)

I assume you selected 365 so that the behaviour (by default) is effectively the same as before, right?
I think that is a good idea.

So for later, how about making use of the new feature by default by selecting a correspondingly higher number than "KEEP_RESULTS_IN_DAYS"?

Suggested change
KEEP_JOBS_IN_DAYS => 365,
KEEP_JOBS_IN_DAYS => 730,

If you think this is too much then I suggest lowering KEEP_RESULTS_IN_DAYS to something like 200 and setting KEEP_JOBS_IN_DAYS to 365 instead.

@Martchus (Author)

I changed the name of the button. I also made the advanced fields expand automatically if an error that relates to those fields occurs when saving. When doing further testing I also noticed a few other details (with the error handling in general, not with the new fields) which needed fixing, so I added two more commits for this as well.

I assume you selected 365 so that the behaviour (by default) is effectively the same as before, right?
I think that is a good idea.

Yes, that was the idea.

So for later, how about making use of the new feature by default by selecting a correspondingly higher number than "KEEP_RESULTS_IN_DAYS"?

I think putting in a higher number as the default would make sense. I assume with "later" you mean in a follow-up PR?

Comment on lines +208 to +209
push @errors, "'$log_key' must be <= '$result_key'" if $result_value != 0 && $log_value > $result_value;
push @errors, "'$result_key' must be <= '$job_key'" if $job_value != 0 && $result_value > $job_value;
@Martchus (Author) commented on Jul 28, 2025


By the way, depending on user feedback we could also simply extend too-low values here automatically. That way users would be confronted with fewer errors and openQA would just do what they probably had in mind anyway. We could still show a reworded version of these messages as information to make it clear what happened.
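That alternative could look roughly like this. This is a hypothetical Python rendering of the idea (the actual validation above is Perl); as in the existing checks, a value of 0 means "keep forever" and is exempt from clamping:

```python
def clamp_retentions(log_days, result_days, job_days):
    """Instead of rejecting inconsistent retentions with errors, silently
    raise the shorter ones so that log <= result <= job holds (0 = forever
    is skipped, mirroring the validation's own exemptions)."""
    if result_days != 0 and log_days > result_days:
        result_days = log_days  # results must outlive logs
    if job_days != 0 and result_days > job_days:
        job_days = result_days  # job entries must outlive results
    return log_days, result_days, job_days
```

A reworded informational message could then report any values that were raised.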

@okurz (Member)

okurz commented Jul 28, 2025

So for later, how about making use of the new feature by default by selecting a correspondingly higher number than "KEEP_RESULTS_IN_DAYS"?

I think putting in a higher number as the default would make sense. I assume with "later" you mean in a follow-up PR?

Yes, please.

@Martchus (Author)

Ok, I'll create a follow-up once this PR has been merged.

@d3flex (Contributor)

d3flex commented Jul 28, 2025

I restructured the config page a little bit

Very nice. The only suggestion I have is to change "Toggle advanced cleanup settings" to "Show advanced cleanup settings". "Toggle" sounds like pressing it would actually change settings, which it doesn't, right? It just shows the settings.

If you go with this I would also say that templates/webapi/admin/group/toggle_advanced.html.ep should be adjusted.

Martchus added 16 commits July 28, 2025 20:45
This will allow configuring the storage duration of the job itself in the
database, see https://progress.opensuse.org/issues/179263.
This will allow configuring the storage duration of the job itself in the
database, see https://progress.opensuse.org/issues/179263.
This will allow configuring the storage duration of the job itself in the
database, see https://progress.opensuse.org/issues/179263.
This keeps jobs in the database as long as configured specifically via
`keep_jobs_in_days`/`keep_important_jobs_in_days`. The configurations
`keep_results_in_days`/`keep_important_results_in_days` are now really only
used for deleting results.

Related ticket: https://progress.opensuse.org/issues/179263.
It is possible to configure only one of the limits/durations explicitly,
e.g. `log_storage_duration = 400`. So far the entire results would still be
deleted after 365 days as this is our default for `result_storage_duration`.
So the configuration has no effect. This is probably not what an admin
configuring this had in mind.

With the introduction of `job_storage_duration` this problem is more
apparent as one might have put e.g. `result_storage_duration = 400` or
even `result_storage_duration = 0` into their config file but now we would
still delete the whole job after 365 days as this is the default limit for
`job_storage_duration`.

This change extends the default durations so these settings are still
effective. So if someone put `result_storage_duration = 400` or
`result_storage_duration = 0` in their config file the results would be
kept for 400 days or forever - just as before the introduction of
`job_storage_duration`.

Note that we introduced a validation for this kind of ordering between the
retentions in 026d72a. This change did not
cover the configuration. We should perhaps print a warning in case the
values are invalid. This could be done in the function
`_set_default_storage_durations` introduced by this commit if that is
wanted.

Related ticket: https://progress.opensuse.org/issues/179263
* Improve formatting and wrapping
* Mention the differences between the different settings
* Mention that also a link label can be used to mark a job as important
* Update the contents after introducing "Keep (important) jobs for"
These descriptions were too long for the getting started section - also
before introducing "Keep (important) jobs for". Ideally new users don't
need this detailed explanation as there is now also a help popover on the
relevant page with a more compact explanation.

This way there's also less cross-referencing necessary.

Related ticket: https://progress.opensuse.org/issues/179263
* Show the description directly after the name instead of at the end
* Place all cleanup related properties at the end
* See https://progress.opensuse.org/issues/179263
* Mention the different settings in the same order as they appear in the
  form
* Move code into a separate file for easier editing
* See https://progress.opensuse.org/issues/179263
This looks nicer, saves space and avoids a separate paragraph/sentence just
for a link to the documentation.

Related ticket: https://progress.opensuse.org/issues/179263
* Add missing colon
* Use an icon that makes sense
Otherwise, when an error message that relates to advanced fields is shown,
the advanced field itself might be hidden which is probably confusing.

Related ticket: https://progress.opensuse.org/issues/179263
This is a follow-up of 026d72a which introduced this kind of validation
but overlooked fields for important jobs.

Related ticket: https://progress.opensuse.org/issues/64830
@Martchus force-pushed the jobs-db-only branch 2 times, most recently from 35709c9 to 6b61de8, on July 28, 2025 18:50
@Martchus (Author)

Martchus commented Jul 29, 2025

Actually, with the database migration in place that assigns the current value of keep_results_in_days to keep_jobs_in_days, we will not see any difference for existing groups if we later increase the constant KEEP_JOBS_IN_DAYS. Only new groups relying on defaults would be affected.

I'm not sure whether that's what we want. I could change the assignment in that migration to happen only if keep_results_in_days is already bigger than KEEP_JOBS_IN_DAYS. Then we would change the cleanup behavior for existing groups to store jobs in the database longer (365 days) if keep_results_in_days was reduced (to be below 365 days).

Note that if we merge this as-is, we can still easily keep jobs longer in the database for all groups afterwards. Just bumping KEEP_JOBS_IN_DAYS won't be enough, but with a simple database query we can make these groups use the default again:

UPDATE job_groups SET keep_jobs_in_days = null WHERE keep_jobs_in_days <= 720 and keep_jobs_in_days != 0;
UPDATE job_group_parents SET default_keep_jobs_in_days = null WHERE default_keep_jobs_in_days <= 720 and default_keep_jobs_in_days != 0;
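The alternative migration described above (only carry keep_results_in_days over when it exceeds the new default) could be sketched like this. This is a hedged Python sketch, not the actual SQL migration; treating 0 as "keep forever" is my assumption based on the retention semantics discussed earlier in the thread:

```python
KEEP_JOBS_IN_DAYS = 365  # assumed new default constant

def migrated_keep_jobs_in_days(keep_results_in_days):
    """Only pin keep_jobs_in_days for groups whose results retention
    exceeds the new default (or is 0, i.e. forever); all other groups
    get NULL and keep following the current or future default."""
    if keep_results_in_days == 0:
        return 0  # "keep forever" must be preserved explicitly
    if keep_results_in_days > KEEP_JOBS_IN_DAYS:
        return keep_results_in_days
    return None  # NULL: fall back to the default
```

With this rule, bumping KEEP_JOBS_IN_DAYS later would immediately affect all groups that were left at NULL.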

@d3flex (Contributor)

d3flex commented Jul 29, 2025

Actually, with the database migration in place that assigns the current value of keep_results_in_days to keep_jobs_in_days, we will not see any difference for existing groups if we later increase the constant KEEP_JOBS_IN_DAYS. Only new groups relying on defaults would be affected.

I'm not sure whether that's what we want. I could change the assignment in that migration to happen only if keep_results_in_days is already bigger than KEEP_JOBS_IN_DAYS. Then we would change the cleanup behavior for existing groups to store jobs in the database longer (365 days) if keep_results_in_days was reduced (to be below 365 days).

Note that if we merge this as-is, we can still easily keep jobs longer in the database for all groups afterwards. Just bumping KEEP_JOBS_IN_DAYS won't be enough, but with a simple database query we can make these groups use the default again:

UPDATE job_groups SET keep_jobs_in_days = null WHERE keep_jobs_in_days <= 720 and keep_jobs_in_days != 0;
UPDATE job_group_parents SET default_keep_jobs_in_days = null WHERE default_keep_jobs_in_days <= 720 and default_keep_jobs_in_days != 0;

Sounds like a plan. I will approve

mergify[bot] merged commit c8a5c81 into os-autoinst:master on Jul 29, 2025
51 checks passed
@Martchus deleted the jobs-db-only branch on July 29, 2025 10:00


4 participants