-
Notifications
You must be signed in to change notification settings - Fork 774
Remove docs and unit tests surrounding running agent as non-root #8931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Michael Montgomery <[email protected]>
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
Signed-off-by: Michael Montgomery <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request removes obsolete documentation and testing infrastructure for running Elastic Agent as non-root, as this functionality is now natively supported in Elastic Agent v8.16.0 and later without requiring special DaemonSet configuration.
Key Changes:
- Removed the dedicated non-root recipe file and its associated end-to-end test
- Simplified test helper code by removing non-root specific DaemonSet mutation logic
- Updated main recipe file to document that
runAsUser: 0is no longer required since v8.16.0
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| test/e2e/test/helper/yaml.go | Removed maybeMutateForAgentNonRootTests function and fleet outputs configuration logic that was only needed for non-root testing; simplified DaemonSet handling |
| test/e2e/agent/recipes_test.go | Removed TestFleetKubernetesNonRootIntegrationRecipe test function that validated the non-root configuration |
| config/recipes/elastic-agent/fleet-kubernetes-integration.yaml | Added explanatory comments indicating that runAsUser: 0 is no longer needed since v8.16.0 |
| config/recipes/elastic-agent/fleet-kubernetes-integration-nonroot.yaml | Deleted entire recipe file that demonstrated non-root setup with DaemonSet for permission management |
| config/recipes/elastic-agent/README.asciidoc | Removed documentation section about the non-root integration recipe |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review for a chance to win a $100 gift card. Take the survey.
Signed-off-by: Michael Montgomery <[email protected]>
Signed-off-by: Michael Montgomery <[email protected]>
pkoutsovasilis
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this cleanup @naemono!
Just want to confirm my understanding; elastic-agent >= 8.16 does have the capability to run as non-root, but for this to be effective there are some requirements:
- The agent container needs
CAP_SETPCAPandCAP_CHOWNcapabilities - these are enabled by default in all major container runtimes but a user can/may drop them. securityContext.allowPrivilegeEscalationshould betrue- IIRC this is also the default for most container runtimes but still a user can/may set this asfalse.
This raises a few questions in my head:
- Should the operator explicitly set these capabilities? Even though they're defaults in most runtimes, should we be explicit about requiring them rather than relying on runtime defaults? This would make the requirements clearer and prevent issues if someone is running with a more restrictive runtime configuration.
- Init container pattern vs continuous elevated capabilities: I'm trying to think through the security implications. Even though we can run as non-root with >= 8.16, we still need to run the agent container continuously with elevated capabilities. I'm wondering if there's still value in the init container approach for some users - where only the init container runs with elevated capabilities to set up permissions, and then everything is dropped from the main agent container.
- How does this work with OpenShift? OCP has stricter default Security Context Constraints. Will the agent work out of the box on OCP with these capability requirements, or will users need to create custom SCCs?
Do you think there are security-conscious users who would prefer the init container pattern (short-lived elevated privileges) over a continuously running agent container with CAP_NET_RAW and CAP_CHOWN? Or am I overthinking this? WDYT @naemono?
I think this is a valid point and I think the init container pattern has indeed advantages over running the main container with elevated privileges. You mentioned that |
I don't disagree with what is being said here. I struggled with how to present this such that is was "clear" to the user the approaches one could take with respect to Agent without making it a convoluted mess. Unfortunately with the following points
Things become even more difficult to document for Agent. I see some options:
Any preferences for an approach to try and bring this documentation up to current @pebrc @pkoutsovasilis ? |
Since v8.16 Elastic Agent doesn't need the daemonset to adjust directory permissions when running as non-root. This removes/adjusts the documentation to make this more clear. It also removes the e2e test.
Testing
fleet-kubernetes-integration.yamlin thedefaultnamespace and everything came up without issues (see screenshot)This will be paired with a documentation PR as well, fyi.