Running argocd app sync --dry-run commands affect metrics
#21899
Comments
I am trying to get more exposure to the argocd codebase and develop some understanding of it. It looks like an easy fix if it is just a matter of adding a label. |
Thanks mate, I've just created this MR to fix it. Let me know what you think. |
Have you investigated why the sync status changes for dry runs? This seems more like it should be fixed by not modifying the persisted sync status on dry runs rather than adding it to the metrics. |
@agaudreault - That is something that can be investigated in parallel but should not affect this PR. Metrics should have an indicator if they were generated via dry run or a normal operation. |
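A minimal sketch of the labeling idea from the comment above, assuming a plain in-memory counter rather than the Prometheus client that Argo CD actually uses. The metric name `example_sync_total`, the `syncKey` type, and the `dry_run` label are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// syncKey is a hypothetical label set for a sync counter. The real
// Argo CD metrics use the Prometheus client and a different label set.
type syncKey struct {
	App    string
	Phase  string
	DryRun bool
}

// SyncCounter counts sync operations per label set.
type SyncCounter struct {
	counts map[syncKey]int
}

// NewSyncCounter returns an empty counter.
func NewSyncCounter() *SyncCounter {
	return &SyncCounter{counts: make(map[syncKey]int)}
}

// Inc records one sync operation, keeping dry runs in their own series.
func (c *SyncCounter) Inc(app, phase string, dryRun bool) {
	c.counts[syncKey{App: app, Phase: phase, DryRun: dryRun}]++
}

// Get returns the count for a single label set.
func (c *SyncCounter) Get(app, phase string, dryRun bool) int {
	return c.counts[syncKey{App: app, Phase: phase, DryRun: dryRun}]
}

// Lines renders every series in a Prometheus-like text form, sorted so
// the output is stable.
func (c *SyncCounter) Lines() []string {
	lines := make([]string, 0, len(c.counts))
	for k, v := range c.counts {
		lines = append(lines, fmt.Sprintf(
			"example_sync_total{name=%q,phase=%q,dry_run=%q} %d",
			k.App, k.Phase, strconv.FormatBool(k.DryRun), v))
	}
	sort.Strings(lines)
	return lines
}

func main() {
	c := NewSyncCounter()
	c.Inc("guestbook", "Succeeded", false) // real sync
	c.Inc("guestbook", "Error", true)      // failed dry run
	for _, l := range c.Lines() {
		fmt.Println(l)
	}
}
```

With a label like this, an alert could filter on the non-dry-run series only, while dry-run activity would still be visible for performance analysis.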
Taking this comment into account, I am not sure if this is the way to go.
The idea is to split this issue in two different ones:
Wdyt? Want to know more about how repo server persists app status, any idea where to look? |
I have been looking into this today. Shouldn't this be somewhere in controller/sync.go? I am happy to add more stuff to it if I am even remotely on the right track here. Also, just to clarify: what happens here is that, during a dry-run sync, the application sync state is updated incorrectly, right? |
I agree that it's weird to update the sync state for a dry run... but if we don't do that, does the controller have a way to communicate the success/failure of the dry run up to the UI/CLI? If not, maybe instead of not persisting the state, we just need to persist it to a different field so that we preserve both the real sync state and the dry run sync state? |
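One way to picture the "persist it to a different field" idea is the sketch below. It is only an illustration: `AppStatus`, `LastSync`, `LastDryRun`, and `OperationResult` are hypothetical names and simplified types, not fields of the Argo CD Application CRD.

```go
package main

import "fmt"

// OperationResult is a simplified, hypothetical stand-in for the outcome
// of a sync operation.
type OperationResult struct {
	Phase   string
	Message string
	DryRun  bool
}

// AppStatus keeps real and dry-run outcomes apart. Both field names are
// assumptions made for this sketch.
type AppStatus struct {
	LastSync   *OperationResult
	LastDryRun *OperationResult
}

// Record stores the result in the field matching its kind, so a failed
// dry run never overwrites the outcome of the last real sync.
func (s *AppStatus) Record(r OperationResult) {
	if r.DryRun {
		s.LastDryRun = &r
		return
	}
	s.LastSync = &r
}

func main() {
	var st AppStatus
	st.Record(OperationResult{Phase: "Succeeded"})
	st.Record(OperationResult{Phase: "Error", Message: "policy violation", DryRun: true})
	fmt.Println("last sync:", st.LastSync.Phase)
	fmt.Println("last dry run:", st.LastDryRun.Phase)
}
```

Under this layout the dry-run result is still available to the UI and CLI, while anything derived from the real sync state would keep reading the untouched field.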
I was just trying to reproduce this locally with argo-cd version 2.14.2. The full behaviour that I can see is - I kind of agree with @crenshaw-dev here. Looking at the code, if we don't update anything, we won't be able to return the result back correctly. So we need to handle the results of a dry-run sync in a different manner. |
I think @jsolana's PR probably solves the metrics problem. But yeah, a more general solution would probably involve CRD, controller, API, CLI, and UI work. |
@crenshaw-dev - Any idea on where to start? |
I think we need to start by enumerating the user experiences that would be impacted by not updating the sync status field, whether those experiences are via the UI, CLI, or API. Then we need to figure out how to rebuild those using a new dry run sync status field. |
I'm wondering if this is more of a UI feedback issue, where "Last Sync" isn't clearly communicated, rather than a fundamental change in how Sync and DryRun are handled internally. Perhaps it would be useful if the UI included information on whether the "Last Sync" operation was a DryRun or not. Related to #22059
For us this is the "most urgent" issue because it impacts the developer experience with false positives. |
@jsolana agreed, adding clarifying information about whether it was a dry run might be a sufficient fix with relatively light CR/code changes. |
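As a rough illustration of that lighter fix, the sketch below annotates the displayed phase when the last operation was a dry run. The `LastSyncLabel` function and its wording are hypothetical, and it is written in Go to match the rest of the thread even though the real UI is a separate frontend.

```go
package main

import "fmt"

// LastSyncLabel builds the text shown next to "Last Sync". The function
// name and the wording are assumptions made for this sketch.
func LastSyncLabel(phase string, dryRun bool) string {
	if dryRun {
		return fmt.Sprintf("%s (dry run, nothing applied)", phase)
	}
	return phase
}

func main() {
	fmt.Println(LastSyncLabel("Error", true))
	fmt.Println(LastSyncLabel("Succeeded", false))
}
```

This only changes how the result is presented; it does not by itself stop the persisted state or the metrics from changing.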
Thanks to @agaudreault , I now have a much clearer idea of the path to follow (in theory 😅 ):
Make sense? |
Hi! Do you think // SyncOperationResult represent result of sync operation
type SyncOperationResult struct {
// Resources contains a list of sync result items for each individual resource in a sync operation
Resources ResourceResults `json:"resources,omitempty" protobuf:"bytes,1,opt,name=resources"`
// Revision holds the revision this sync operation was performed to
Revision string `json:"revision" protobuf:"bytes,2,opt,name=revision"`
// Source records the application source information of the sync, used for comparing auto-sync
Source ApplicationSource `json:"source,omitempty" protobuf:"bytes,3,opt,name=source"`
// Sources records the application sources information of the sync, used for comparing auto-sync
Sources ApplicationSources `json:"sources,omitempty" protobuf:"bytes,4,opt,name=sources"`
// Revisions holds the revision this sync operation was performed for respective indexed source in sources field
Revisions []string `json:"revisions,omitempty" protobuf:"bytes,5,opt,name=revisions"`
// ManagedNamespaceMetadata contains the current sync state of managed namespace metadata
ManagedNamespaceMetadata *ManagedNamespaceMetadata `json:"managedNamespaceMetadata,omitempty" protobuf:"bytes,6,opt,name=managedNamespaceMetadata"`
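// DryRun indicates that the sync was performed as a dry-run without actually applying any changes.
// The protobuf field number must be unique within the struct: 3 is already used by Source, so the next free number (7) is used.
DryRun bool `json:"dryRun,omitempty" protobuf:"bytes,7,opt,name=dryRun"`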
} Thanks! |
My recommendation would be to persist any dry run result in a completely different field (e.g. |
Currently, the SyncResult is persisted in the OperationState. Knowing that there can only be one operation in progress at the same time, there should only be one OperationState. The problem is that the
The way I see to solve this is to add one new field
Most code stays unaffected because the behavior of
We could also persist the last dry-run operation in the status, but I see little value in doing that. |
I think the prohibition of concurrent dry-run and non-dry-run operations is a matter of implementation, not API. I could definitely imagine breaking the dry-run operations into their own work queue in the future. The UI/CLI currently just pull the latest operation state. Having to toggle between lastCompletedNonDryRunOperation and operationState complicates the client code. Seems like it would be nicer to just fetch the state they want: dry run or non-dry run. |
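To make the concern about client-side toggling concrete, here is a sketch of what a client might have to do if only a lastCompletedNonDryRunOperation field were added next to operationState. This is one reading of the proposal, not a confirmed design; `StatusView`, `OperationState`, and `LatestRealOperation` are simplified, hypothetical names.

```go
package main

import "fmt"

// OperationState is a simplified, hypothetical view of an operation.
type OperationState struct {
	Phase  string
	DryRun bool
}

// StatusView mirrors the two fields named in the discussion. The exact
// shape is an assumption made for this sketch.
type StatusView struct {
	// OperationState holds the latest operation, dry run or not.
	OperationState *OperationState
	// LastCompletedNonDryRunOperation holds the last real operation.
	LastCompletedNonDryRunOperation *OperationState
}

// LatestRealOperation shows the toggle a client would need: use the
// latest operation unless it was a dry run, otherwise fall back to the
// last completed non-dry-run operation.
func LatestRealOperation(s StatusView) *OperationState {
	if s.OperationState != nil && !s.OperationState.DryRun {
		return s.OperationState
	}
	return s.LastCompletedNonDryRunOperation
}

func main() {
	s := StatusView{
		OperationState:                  &OperationState{Phase: "Error", DryRun: true},
		LastCompletedNonDryRunOperation: &OperationState{Phase: "Succeeded"},
	}
	fmt.Println(LatestRealOperation(s).Phase)
}
```

Having one field per kind of operation would let clients skip this branch and read the state they want directly.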
Hi @agaudreault @crenshaw-dev, could you take a look at this PR to confirm that I have understood the proposal properly? Thanks |
Describe the bug
Related to #21661
Currently, executing the command argocd app sync --dry-run affects both the application's state and the internal metrics exposed by ArgoCD (e.g. argocd_app_info).
The main issue is that if there are alerts based on these metrics, and the dry-run execution identifies an error (e.g., a change that violates a Kyverno policy or an invalid CRD schema), the application state changes to SyncErr. This also updates the metrics, which can trigger those alerts.
For example:
Since the definition of a dry-run is to execute requests without persistence, I wonder if it makes sense to handle it in a way that ensures no changes are made.
Alternatively, adding a dry_run label to differentiate operational requests from dry-run requests could also be an option (a similar change has been proposed for Kyverno).
To Reproduce
Run the argocd app sync --dry-run command.
Expected behavior
There are different proposals:
1. Add a dryrun label to distinguish dryrun activity from real activity.
2. Ignore dryrun executions (not affecting metrics).
Initially I am fond of option 1, because a dryrun has a cost in terms of resource consumption and performance. Ignoring dryrun activity would make it extremely hard to identify the cause of performance issues.
Version
It affects every version, because dryrun executions are currently not distinguished from real ones.