Ensure added taints are present on the node in the snapshot #7820
What type of PR is this?
/kind bug
What this PR does / why we need it:
There's an inconsistency in AddTaints behavior. AddTaints makes a copy of the node and adds the taints to the spec of the copy, but not to the original spec. Meanwhile, HasTaints checks whether the taint is present in the spec of the original node. As a result, changes made by AddTaints are not visible in the snapshot until the next autoscaler loop.

This PR ensures that the new taints are also added to the spec of the node in the snapshot.

I also had to fix the static autoscaler test, as it was configured improperly. Because the k8s API client was mocked, marking the node as deleted was a no-op in this test. As a result, the unschedulable pod was scheduled onto the removed node, which would not happen in a real scenario, since the node would refresh its taints at the beginning of the autoscaler loop. Now that AddTaints also adds taints to the node spec, that unschedulable pod triggered a scale-up that the mock did not expect. Since the scale-up logic is already covered by a previous run, the fix was simply to remove that pod from the unschedulable pods list. A couple of other static autoscaler test cases also failed because they depended on this bug.

Does this PR introduce a user-facing change?
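The inconsistency and the fix can be sketched with simplified stand-in types; the Taint, NodeSpec, and Node types and the function bodies below are illustrative assumptions, not the actual cluster-autoscaler code, which operates on *apiv1.Node:

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes node API types.
type Taint struct {
	Key    string
	Effect string
}

type NodeSpec struct {
	Taints []Taint
}

type Node struct {
	Name string
	Spec NodeSpec
}

// AddTaints sketches the fixed behavior: taints are appended both to the
// returned copy and to the original node's spec, so a later HasTaint check
// against the snapshot node sees them within the same autoscaler loop.
func AddTaints(node *Node, taints []Taint) *Node {
	nodeCopy := &Node{Name: node.Name}
	nodeCopy.Spec.Taints = append(append([]Taint{}, node.Spec.Taints...), taints...)
	// The fix: also update the original spec, keeping the snapshot consistent.
	node.Spec.Taints = append(node.Spec.Taints, taints...)
	return nodeCopy
}

// HasTaint checks the node's own spec, mirroring the check described above.
func HasTaint(node *Node, key string) bool {
	for _, t := range node.Spec.Taints {
		if t.Key == key {
			return true
		}
	}
	return false
}

func main() {
	node := &Node{Name: "n1"}
	nodeCopy := AddTaints(node, []Taint{{Key: "deletion-candidate", Effect: "PreferNoSchedule"}})
	fmt.Println(HasTaint(nodeCopy, "deletion-candidate")) // the copy always had the taint
	fmt.Println(HasTaint(node, "deletion-candidate"))     // true only with the fix applied
}
```

Without the highlighted line updating node.Spec.Taints, the second check would return false, which is the snapshot inconsistency this PR addresses.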
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: