enable mutation detection in e2e #6541
base: master
Signed-off-by: changzhen <[email protected]>
Summary of Changes
Hello @XiShanYongYe-Chang, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request focuses on enhancing the stability and integrity of the Karmada control plane by enabling the MutationDetector within the karmada-controller-manager for e2e tests. This is a crucial step to prevent unintended modifications to cached objects, especially following a previous change that disabled deepcopy operations during list operations, thereby safeguarding against future cache mutation issues.
Highlights
- Enable Cache Mutation Detection: I've updated the karmada-controller-manager deployment manifest (artifacts/deploy/karmada-controller-manager.yaml) to include a new environment variable, KUBE_CACHE_MUTATION_DETECTOR. This variable controls whether the cache mutation detector is active.
- Automate Detector Activation in E2E: I've modified the hack/deploy-karmada.sh script to set the KUBE_CACHE_MUTATION_DETECTOR environment variable to true by default. The script now also injects this value into karmada-controller-manager.yaml during deployment, ensuring the mutation detector is enabled for e2e tests.
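The defaulting and injection described in the highlights can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper name `render_detector_env` and the shape of the manifest snippet are assumptions; only the variable name, the `{{KUBE_CACHE_MUTATION_DETECTOR}}` placeholder, and the default of `true` come from this pull request.

```shell
# Hypothetical helper: render the placeholder in a manifest snippet passed as $1.
render_detector_env() {
  # Fall back to "true" when the caller has not set the variable.
  detector="${KUBE_CACHE_MUTATION_DETECTOR:-true}"
  printf '%s\n' "$1" | sed -e "s/{{KUBE_CACHE_MUTATION_DETECTOR}}/${detector}/g"
}

# Assumed shape of the env entry in the manifest; not copied from the PR diff.
snippet='- name: KUBE_CACHE_MUTATION_DETECTOR
  value: "{{KUBE_CACHE_MUTATION_DETECTOR}}"'

# Prints the snippet with the placeholder replaced by the effective value.
render_detector_env "${snippet}"
```

Exporting KUBE_CACHE_MUTATION_DETECTOR=false before calling the helper would render "false" instead, which is how a non-e2e deployment could opt out under this sketch.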
Code Review
This pull request enables the client-go cache mutation detector for the karmada-controller-manager in E2E tests. The implementation correctly uses the KUBE_CACHE_MUTATION_DETECTOR environment variable. A suggestion has been made to improve the deployment script by avoiding in-place modification of source files.
@@ -449,6 +453,9 @@ sed -i'' -e "s/{{ca_crt}}/${karmada_ca}/g" "${TEMP_PATH_BOOTSTRAP}"/bootstrap-to
sed -i'' -e "s|{{apiserver_address}}|${karmada_apiserver_address}|g" "${TEMP_PATH_BOOTSTRAP}"/bootstrap-token-configuration-tmp.yaml
kubectl --context="karmada-apiserver" apply -f "${TEMP_PATH_BOOTSTRAP}"/bootstrap-token-configuration-tmp.yaml

# render the KUBE_CACHE_MUTATION_DETECTOR in the karmada-controller-manager
sed -i'' -e "s/{{KUBE_CACHE_MUTATION_DETECTOR}}/${KUBE_CACHE_MUTATION_DETECTOR}/g" "${REPO_ROOT}/artifacts/deploy/karmada-controller-manager.yaml"
Modifying the file in-place can lead to a dirty git working tree and potential issues if the script is interrupted. Copy the YAML file to a temporary location, modify the copy, and then apply it to ensure the source files remain untouched.
- sed -i'' -e "s/{{KUBE_CACHE_MUTATION_DETECTOR}}/${KUBE_CACHE_MUTATION_DETECTOR}/g" "${REPO_ROOT}/artifacts/deploy/karmada-controller-manager.yaml"
+ CONTROLLER_MANAGER_YAML_TMP="${TEMP_PATH_BOOTSTRAP}/karmada-controller-manager.yaml"
+ cp "${REPO_ROOT}/artifacts/deploy/karmada-controller-manager.yaml" "${CONTROLLER_MANAGER_YAML_TMP}"
+ sed -i'' -e "s/{{KUBE_CACHE_MUTATION_DETECTOR}}/${KUBE_CACHE_MUTATION_DETECTOR}/g" "${CONTROLLER_MANAGER_YAML_TMP}"
+ kubectl --context="${HOST_CLUSTER_NAME}" apply -f "${CONTROLLER_MANAGER_YAML_TMP}"
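The point of the suggestion, that rendering into a copy leaves the tracked file untouched, can be checked with a small stand-alone sketch. The directory and file names below are stand-ins created with mktemp, not the real ${REPO_ROOT} or ${TEMP_PATH_BOOTSTRAP} paths, and the one-line manifest content is an assumption. The sketch writes sed output to a new file rather than using sed -i, which is a substitution for the suggested cp-then-edit sequence with the same effect on the source file.

```shell
set -eu

# Stand-in directory so the sketch runs without a Karmada checkout.
workdir="$(mktemp -d)"
src="${workdir}/karmada-controller-manager.yaml"      # plays the role of the tracked manifest
tmp="${workdir}/karmada-controller-manager-tmp.yaml"  # plays the role of the temporary copy
KUBE_CACHE_MUTATION_DETECTOR="${KUBE_CACHE_MUTATION_DETECTOR:-true}"

# Assumed manifest content containing the placeholder.
printf '%s\n' 'value: "{{KUBE_CACHE_MUTATION_DETECTOR}}"' > "${src}"

# Render into the copy; the source file is only read, never written.
sed -e "s/{{KUBE_CACHE_MUTATION_DETECTOR}}/${KUBE_CACHE_MUTATION_DETECTOR}/g" "${src}" > "${tmp}"

# Capture both files before cleaning up the stand-in directory.
source_after="$(cat "${src}")"
rendered="$(cat "${tmp}")"
rm -rf "${workdir}"

echo "source:   ${source_after}"
echo "rendered: ${rendered}"
```

In the real script, the rendered copy would then be passed to kubectl apply, as in the suggestion above.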
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## master #6541 +/- ##
=======================================
Coverage 45.43% 45.43%
=======================================
Files 687 687
Lines 56318 56334 +16
=======================================
+ Hits 25587 25598 +11
- Misses 29132 29138 +6
+ Partials 1599 1598 -1
Any documentation for the Mutation Detector feature?

I have not yet found any documentation introducing this feature.

During the execution of this test, it was discovered that the controller-manager would panic. Here is an example of an error log:

Logs from the

Yes, it is. Unfortunately, the panic log does not directly indicate which specific line of code caused the issue.

@XiShanYongYe-Chang While resolving #6513, I discovered that the detector performs an operation that modifies the cache. I will submit a PR to address this issue, then check whether it passes mutation detection after the fix.

@XiShanYongYe-Chang In #6544, I mitigated the behavior that mutates the informer cache, and it appears from the logs that this issue has been resolved. However, I have some doubts about this:

Your question is excellent, and it's something I've been pondering as well. I have two thoughts:
For the first point, perhaps we could introduce a check for component restarts, but as for the second point, I don't have any ideas at the moment.

When a pod restarts, there are some indicators that can be used to identify the reason for the restart. For example, the Last State of the container. When a pod restarts due to panic, it has the following characteristics:
Although these characteristics are not unique to a Go panic, they can distinguish between OOM and other types of restarts, reducing the risk of false positives.
$ kubectl describe pods --namespace karmada-system karmada-controller-manager-7b74766c6f-qlw72
Containers:
  karmada-controller-manager:
    Last State: Terminated
      Reason: Error
      Exit Code: 2
      Started: Thu, 17 Jul 2025 16:36:54 +0800
      Finished: Thu, 17 Jul 2025 16:38:15 +0800
    Ready: True
    Restart Count: 1
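A restart check along these lines could be sketched as a small classifier over the Last State fields shown above. This is only a heuristic under stated assumptions: the function name `classify_restart` and its output labels are hypothetical, and the rule relies on the Go runtime exiting with code 2 on an unrecovered panic and on Kubernetes reporting the reason OOMKilled for memory kills. As noted above, Reason: Error with Exit Code: 2 is not unique to a panic.

```shell
# Hypothetical classifier: $1 is the terminated reason, $2 is the exit code.
classify_restart() {
  reason="$1"
  exit_code="$2"
  case "${reason}:${exit_code}" in
    OOMKilled:*) echo "oom" ;;                # killed for exceeding its memory limit
    Error:2)     echo "possible-go-panic" ;;  # matches the describe output above
    *)           echo "other" ;;
  esac
}

# In an e2e check the inputs could come from the pod status, for example:
#   kubectl get pod "${pod}" --namespace karmada-system \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
#   kubectl get pod "${pod}" --namespace karmada-system \
#     -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
# (${pod} is a placeholder for the controller-manager pod name.)

# Values taken from the describe output above.
classify_restart Error 2
```

An e2e teardown step could fail the run when any component reports possible-go-panic, while treating OOM restarts separately.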
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Since we disabled deepcopy during list operations in #5813, we should enable the MutationDetector in our e2e tests to prevent future code from mutating the cache.
Which issue(s) this PR fixes:
Part of #6516
Special notes for your reviewer:
Does this PR introduce a user-facing change?: