| 1 | +# Functional E2E Tests for Global Value Inheritance |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This directory contains functional end-to-end (E2E) tests that validate global value inheritance for the newrelic-logging chart in a real Kubernetes environment. Unlike helm-unittest tests, which only validate template rendering, these tests deploy the chart to a Kubernetes cluster and verify that configuration values actually reach the objects the cluster creates (pod specifications and the ServiceAccount). |
| 6 | + |
| 7 | +**Important**: These tests do NOT validate telemetry (whether data reaches New Relic). They only validate functional configuration propagation. For telemetry validation, see the main repository E2E test suite using `newrelic-integration-e2e-action`. |
| 8 | + |
| 9 | +## What These Tests Validate |
| 10 | + |
| 11 | +The test suite validates that the following global values propagate correctly to the newrelic-logging DaemonSet: |
| 12 | + |
| 13 | +1. **Proxy Configuration** (`global.proxy`) |
| 14 | + - HTTP_PROXY and HTTPS_PROXY environment variables are set in Fluent Bit containers |
| 15 | + - Local `proxy` value overrides `global.proxy` (precedence validation) |
| 16 | + |
| 17 | +2. **NodeSelector** (`global.nodeSelector`) |
| 18 | + - Global node selection constraints apply to DaemonSet pods |
| 19 | + - Pods schedule only on nodes matching the selector |
| 20 | + |
| 21 | +3. **Tolerations** (`global.tolerations`) |
| 22 | + - Global tolerations allow pods to tolerate node taints |
| 23 | + - Pods can schedule on tainted nodes |
| 24 | + |
| 25 | +4. **Custom Registry** (`global.images.registry`) |
| 26 | + - Container images pull from custom registry |
| 27 | + - Image paths include the custom registry domain |
| 28 | + |
| 29 | +5. **ServiceAccount Annotations** (`global.serviceAccount.annotations`) |
| 30 | + - Annotations (e.g., for IRSA, Workload Identity) propagate to ServiceAccount |
| 31 | + - Critical for cloud IAM role bindings |
| 32 | + |
| 33 | +6. **Verbose Logging** (`global.verboseLog`) |
| 34 | + - `global.verboseLog: true` maps to `LOG_LEVEL=debug` in containers |
| 35 | + - Enables debug logging across all components |
| 36 | + |
| 37 | +7. **Host Network** (`global.hostNetwork`) |
| 38 | + - `global.hostNetwork: true` enables host network mode for DaemonSet pods |
| 39 | + - Required for certain networking configurations |
| 40 | + |
| 41 | +## Test Approach and Limitations |
| 42 | + |
| 43 | +### What These Tests Validate |
| 44 | + |
| 45 | +These tests validate **configuration propagation** - they verify that Helm values reach the Kubernetes pod specifications correctly. The tests: |
| 46 | + |
| 47 | +- ✅ Deploy the chart using Helm with specific global values |
| 48 | +- ✅ Wait for pod objects to be created in Kubernetes |
| 49 | +- ✅ Inspect pod specifications to verify configuration is present |
| 50 | +- ✅ Validate environment variables, nodeSelectors, tolerations, etc. |
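The "wait for pod objects to be created" step can be sketched as a small polling loop. This is a minimal illustration, not the code in `test-global-values.sh`: the function name `wait_for_pods`, the label selector, and the timeout are assumptions chosen for the example.

```bash
# Poll until at least one pod matching a label selector exists, or time out.
# Arguments: <label-selector> [timeout-seconds]
# The pod only has to exist as an API object; it does not have to be Ready.
wait_for_pods() {
  local selector="$1" timeout="${2:-60}" elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # `-o name` prints one "pod/<name>" line per pod; empty output means none yet.
    if [ -n "$(kubectl get pods -l "$selector" -o name 2>/dev/null)" ]; then
      return 0
    fi
    sleep 2
    elapsed=$((elapsed + 2))
  done
  echo "no pods matched '$selector' within ${timeout}s" >&2
  return 1
}
```

A call might look like `wait_for_pods 'app.kubernetes.io/instance=nr-logging-e2e' 60`, assuming the chart labels its pods with the release name; check the actual labels with `kubectl get pods --show-labels`.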
| 51 | + |
| 52 | +### What These Tests Do NOT Validate |
| 53 | + |
| 54 | +These tests do NOT validate runtime behavior: |
| 55 | + |
| 56 | +- ❌ Pods do NOT need to become "Ready" (tests don't wait for readiness) |
| 57 | +- ❌ Fluent Bit does NOT need to actually run |
| 58 | +- ❌ No telemetry validation (data reaching New Relic) |
| 59 | +- ❌ No functional testing of log collection |
| 60 | + |
| 61 | +### Rationale |
| 62 | + |
| 63 | +This approach allows fast, lightweight testing of configuration inheritance without requiring: |
| 64 | +- Working container images in the test environment |
| 65 | +- New Relic account credentials |
| 66 | +- Long test execution times (tests complete in ~2-3 minutes vs ~10+ minutes) |
| 67 | + |
| 68 | +The tests use lightweight test images that are loaded into the cluster ahead of time and referenced with image pull policy `Never`, which speeds up execution and reduces dependencies. Because the tests only inspect pod specifications, which are available as soon as the pod objects exist, the containers do not need to start successfully. |
| 69 | + |
| 70 | +**For runtime and telemetry validation**, see the main repository E2E test suite using `newrelic-integration-e2e-action`. |
| 71 | + |
| 72 | +## Prerequisites |
| 73 | + |
| 74 | +### Required Tools |
| 75 | + |
| 76 | +- **kubectl**: Kubernetes command-line tool |
| 77 | +- **helm**: Helm 3.x |
| 78 | +- **Kubernetes cluster**: Minikube, KIND, k3d, or any other Kubernetes cluster |
| 79 | + |
| 80 | +### Cluster Requirements |
| 81 | + |
| 82 | +- Cluster must have at least 1 node |
| 83 | +- User must have permissions to: |
| 84 | + - Deploy DaemonSets and ServiceAccounts |
| 85 | + - Label nodes |
| 86 | + - Taint nodes |
| 87 | + - Create/delete Helm releases |
| 88 | + |
| 89 | +### Test Environment Setup |
| 90 | + |
| 91 | +The tests are designed to run on local Kubernetes clusters (Minikube, KIND, k3d). Example setup with k3d: |
| 92 | + |
| 93 | +```bash |
| 94 | +# Install k3d (lightweight Kubernetes) |
| 95 | +brew install k3d # macOS |
| 96 | +# OR |
| 97 | +# curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash |
| 98 | + |
| 99 | +# Create a local cluster |
| 100 | +k3d cluster create demo |
| 101 | + |
| 102 | +# Verify cluster is running |
| 103 | +kubectl get nodes |
| 104 | + |
| 105 | +# Build lightweight test image |
| 106 | +docker build -t e2e/newrelic-fluentbit-output:test -f Dockerfile.test . |
| 107 | + |
| 108 | +# Load image into k3d cluster |
| 109 | +k3d image import e2e/newrelic-fluentbit-output:test -c demo |
| 110 | +``` |
| 111 | + |
| 112 | +**Note**: The test image can be any lightweight image (busybox, alpine) since tests only validate pod specifications, not runtime behavior. The chart is configured with `image.pullPolicy=Never` to use locally loaded images. |
| 113 | + |
| 114 | +## Running the Tests |
| 115 | + |
| 116 | +### Run All Tests |
| 117 | + |
| 118 | +```bash |
| 119 | +cd charts/newrelic-logging/tests/functional-e2e |
| 120 | +./test-global-values.sh |
| 121 | +``` |
| 122 | + |
| 123 | +### Test Output |
| 124 | + |
| 125 | +The test script provides colored output: |
| 126 | +- 🟢 **[INFO]**: General information |
| 127 | +- 🟡 **[WARN]**: Warnings (non-fatal) |
| 128 | +- 🔴 **[ERROR]**: Errors (fatal) |
| 129 | +- ✓ **PASS**: Test passed |
| 130 | +- ✗ **FAIL**: Test failed |
| 131 | + |
| 132 | +Example output: |
| 133 | + |
| 134 | +``` |
| 135 | +[INFO] Starting functional E2E tests for global value inheritance |
| 136 | +[INFO] Using kubectl context: minikube |
| 137 | +[INFO] Test 1: Proxy configuration propagation |
| 138 | +[INFO] Installing chart with global.proxy... |
| 139 | +[INFO] Waiting for DaemonSet to be ready (timeout: 120s)... |
| 140 | +[INFO] DaemonSet ready: 1/1 pods |
| 141 | +[INFO] ✓ PASS: HTTP_PROXY environment variable set correctly |
| 142 | +[INFO] ✓ PASS: HTTPS_PROXY environment variable set correctly |
| 143 | +... |
| 144 | +======================================== |
| 145 | +Test Summary |
| 146 | +======================================== |
| 147 | +Tests run: 8 |
| 148 | +Tests passed: 8 |
| 149 | +Tests failed: 0 |
| 150 | +======================================== |
| 151 | +[INFO] All tests passed! |
| 152 | +``` |
| 153 | + |
| 154 | +### Test Duration |
| 155 | + |
| 156 | +- **Per test**: ~30-60 seconds (Helm install + DaemonSet ready wait) |
| 157 | +- **Total suite**: ~5-8 minutes |
| 158 | + |
| 159 | +## Test Details |
| 160 | + |
| 161 | +### Test 1: Proxy Configuration Propagation |
| 162 | + |
| 163 | +**What it tests**: `global.proxy` → `HTTP_PROXY`/`HTTPS_PROXY` environment variables |
| 164 | + |
| 165 | +**Steps**: |
| 166 | +1. Install chart with `global.proxy: "http://test-proxy.example.com:3128"` |
| 167 | +2. Wait for DaemonSet to be ready |
| 168 | +3. Inspect pod spec for HTTP_PROXY and HTTPS_PROXY env vars |
| 169 | +4. Validate values match expected proxy URL |
| 170 | + |
| 171 | +**Why it matters**: Corporate environments require proxy configuration for outbound connections. |
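The inspection step above can be approximated with a JSONPath query against the DaemonSet's pod template. The helper names `ds_env_value` and `assert_ds_env` are hypothetical, and the sketch assumes the variable is set on the first container; the real script may inspect the pod object instead.

```bash
# Print the value of an env var from the first container of a DaemonSet's pod template.
# Arguments: <daemonset-name> <env-var-name> [namespace]
ds_env_value() {
  local ds="$1" var="$2" ns="${3:-default}"
  kubectl get daemonset "$ds" -n "$ns" \
    -o jsonpath="{.spec.template.spec.containers[0].env[?(@.name==\"$var\")].value}"
}

# Compare the actual value with the expected one and report PASS/FAIL.
# Arguments: <daemonset-name> <env-var-name> <expected-value>
assert_ds_env() {
  local ds="$1" var="$2" expected="$3" actual
  actual="$(ds_env_value "$ds" "$var")"
  if [ "$actual" = "$expected" ]; then
    echo "✓ PASS: $var set correctly"
  else
    echo "✗ FAIL: $var is '$actual', expected '$expected'"
    return 1
  fi
}
```

For this test the call would be `assert_ds_env <daemonset-name> HTTP_PROXY http://test-proxy.example.com:3128`, repeated for `HTTPS_PROXY`.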
| 172 | + |
| 173 | +--- |
| 174 | + |
| 175 | +### Test 2: Proxy Override |
| 176 | + |
| 177 | +**What it tests**: Local `proxy` value takes precedence over `global.proxy` |
| 178 | + |
| 179 | +**Steps**: |
| 180 | +1. Install chart with both `global.proxy` and `proxy` set |
| 181 | +2. Verify HTTP_PROXY uses local `proxy` value (not global) |
| 182 | + |
| 183 | +**Why it matters**: Validates precedence model (local > global > default). |
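The precedence model can be illustrated with a few lines of shell. This is only a model of the rule "local > global > default" for computing the value a test should expect; it is not the chart's template logic, and `resolve_value` is a hypothetical name.

```bash
# Return the first non-empty value among local, global, and default.
# Arguments: <local-value> <global-value> <default-value>
resolve_value() {
  local local_value="$1" global_value="$2" default_value="$3"
  if [ -n "$local_value" ]; then
    printf '%s\n' "$local_value"
  elif [ -n "$global_value" ]; then
    printf '%s\n' "$global_value"
  else
    printf '%s\n' "$default_value"
  fi
}
```

With both values set, `resolve_value "http://local:3128" "http://global:3128" ""` prints the local proxy, which is the value Test 2 expects to find in `HTTP_PROXY`.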
| 184 | + |
| 185 | +--- |
| 186 | + |
| 187 | +### Test 3: NodeSelector Propagation |
| 188 | + |
| 189 | +**What it tests**: `global.nodeSelector` applies to DaemonSet pods |
| 190 | + |
| 191 | +**Steps**: |
| 192 | +1. Label cluster node with `test-label=true` |
| 193 | +2. Install chart with `global.nodeSelector.test-label: "true"` |
| 194 | +3. Verify pod has nodeSelector field with correct label |
| 195 | +4. Cleanup: Remove node label |
| 196 | + |
| 197 | +**Why it matters**: Enables node targeting for dedicated monitoring nodes. |
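Because this test modifies a node, the label has to be removed even when the check fails. One way to sketch that, assuming a hypothetical wrapper named `with_node_label`, is to apply the label, run the check, and always remove the label afterwards:

```bash
# Run a command while a node carries a temporary label, then remove the label.
# Arguments: <node-name> <key=value> <command> [args...]
with_node_label() {
  local node="$1" label="$2" rc=0
  shift 2
  kubectl label node "$node" "$label" --overwrite
  # Remember the command's exit status instead of aborting, so cleanup still runs.
  "$@" || rc=$?
  # A trailing "-" after the label key removes the label.
  kubectl label node "$node" "${label%%=*}-"
  return "$rc"
}
```

For example, `with_node_label <node-name> test-label=true ./run-nodeselector-check.sh` would label the node, run the check, and clean up; `run-nodeselector-check.sh` is a placeholder for whatever performs steps 2 and 3.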
| 198 | + |
| 199 | +--- |
| 200 | + |
| 201 | +### Test 4: Tolerations Propagation |
| 202 | + |
| 203 | +**What it tests**: `global.tolerations` allows pods to tolerate node taints |
| 204 | + |
| 205 | +**Steps**: |
| 206 | +1. Taint cluster node with `test-taint=true:NoSchedule` |
| 207 | +2. Install chart with `global.tolerations` matching taint |
| 208 | +3. Verify pod schedules despite taint (has toleration in spec) |
| 209 | +4. Cleanup: Remove node taint |
| 210 | + |
| 211 | +**Why it matters**: Enables deployment on tainted nodes (e.g., monitoring-dedicated nodes). |
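A toleration that matches the taint in step 1 can be passed to Helm with indexed `--set-string` flags. The sketch below derives those flags from the taint string; `toleration_set_flags` is a hypothetical helper, and it assumes `global.tolerations` accepts the standard Kubernetes toleration fields.

```bash
# Turn a taint of the form key=value:effect into Helm flags for a matching toleration.
# Prints one "--set-string path=value" pair per line.
toleration_set_flags() {
  local taint="$1"
  local key="${taint%%=*}"    # text before the first "="
  local rest="${taint#*=}"    # text after the first "="
  local value="${rest%%:*}"   # text before the ":"
  local effect="${rest##*:}"  # text after the ":"
  local pair
  # --set-string keeps "true" a string; toleration values must be strings, not booleans.
  for pair in "key=$key" "operator=Equal" "value=$value" "effect=$effect"; do
    printf '%s\n' "--set-string global.tolerations[0].$pair"
  done
}
```

Calling `toleration_set_flags 'test-taint=true:NoSchedule'` prints four flag pairs. When passing them to `helm`, quote each `global.tolerations[0]...` argument so the shell does not treat `[0]` as a glob pattern.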
| 212 | + |
| 213 | +--- |
| 214 | + |
| 215 | +### Test 5: Custom Registry Propagation |
| 216 | + |
| 217 | +**What it tests**: `global.images.registry` changes container image paths |
| 218 | + |
| 219 | +**Steps**: |
| 220 | +1. Install chart with `global.images.registry: "custom-registry.example.com"` |
| 221 | +2. Inspect pod spec for container image path |
| 222 | +3. Verify image path includes custom registry |
| 223 | + |
| 224 | +**Why it matters**: Critical for air-gapped environments and private registries. |
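The final verification step amounts to a prefix check on the image string. A minimal sketch, with `image_uses_registry` as a hypothetical helper name and an illustrative image path:

```bash
# Succeed if the image reference starts with "<registry>/".
# Arguments: <image-reference> <registry>
image_uses_registry() {
  case "$1" in
    "$2"/*) return 0 ;;  # the quoted registry is matched literally, not as a pattern
    *)      return 1 ;;
  esac
}
```

The image reference itself can be read with `kubectl get daemonset <daemonset-name> -o jsonpath='{.spec.template.spec.containers[0].image}'` and then passed to the helper together with `custom-registry.example.com`.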
| 225 | + |
| 226 | +--- |
| 227 | + |
| 228 | +### Test 6: ServiceAccount Annotations Propagation |
| 229 | + |
| 230 | +**What it tests**: `global.serviceAccount.annotations` propagate to ServiceAccount |
| 231 | + |
| 232 | +**Steps**: |
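| 233 | +1. Install chart with the `eks.amazonaws.com/role-arn` annotation set under `global.serviceAccount.annotations` (when using `--set`, escape the dots in the annotation key: `eks\.amazonaws\.com/role-arn`) |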
| 234 | +2. Inspect ServiceAccount for annotation |
| 235 | +3. Verify annotation value matches expected |
| 236 | + |
| 237 | +**Why it matters**: Required for IAM roles (AWS IRSA, GCP Workload Identity, Azure Workload Identity). |
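Annotation keys such as `eks.amazonaws.com/role-arn` contain dots, which Helm's `--set` syntax would otherwise interpret as path separators. The sketch below escapes them; `escape_helm_key` is a hypothetical helper, and the role ARN in the usage comment is a placeholder.

```bash
# Escape dots in a key so Helm's --set treats it as a single path segment.
# Argument: <key>
escape_helm_key() {
  printf '%s\n' "$1" | sed 's/\./\\./g'
}

# Usage (placeholder ARN, chart path assumed to be the current directory):
#   helm upgrade --install nr-logging-e2e . \
#     --set-string "global.serviceAccount.annotations.$(escape_helm_key 'eks.amazonaws.com/role-arn')=arn:aws:iam::123456789012:role/example"
```

After installation, `kubectl get serviceaccount <name> -o yaml` shows whether the annotation arrived with the expected value.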
| 238 | + |
| 239 | +--- |
| 240 | + |
| 241 | +### Test 7: VerboseLog Propagation |
| 242 | + |
| 243 | +**What it tests**: `global.verboseLog: true` → `LOG_LEVEL=debug` |
| 244 | + |
| 245 | +**Steps**: |
| 246 | +1. Install chart with `global.verboseLog: true` |
| 247 | +2. Inspect pod spec for LOG_LEVEL env var |
| 248 | +3. Verify value is "debug" (not default "info") |
| 249 | + |
| 250 | +**Why it matters**: Enables debug logging for troubleshooting. |
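The mapping this test expects can be written down as a tiny function, which is handy when computing the expected `LOG_LEVEL` for an assertion. It mirrors the behavior described above (debug when verbose, info otherwise) and is not taken from the chart templates; `expected_log_level` is a hypothetical name.

```bash
# Map the verboseLog setting to the LOG_LEVEL value the test expects.
# Argument: "true" or anything else
expected_log_level() {
  case "$1" in
    true) echo "debug" ;;
    *)    echo "info" ;;
  esac
}
```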
| 251 | + |
| 252 | +--- |
| 253 | + |
| 254 | +### Test 8: HostNetwork Propagation |
| 255 | + |
| 256 | +**What it tests**: `global.hostNetwork: true` enables host network mode |
| 257 | + |
| 258 | +**Steps**: |
| 259 | +1. Install chart with `global.hostNetwork: true` |
| 260 | +2. Inspect pod spec for hostNetwork field |
| 261 | +3. Verify field is true |
| 262 | + |
| 263 | +**Why it matters**: Required for certain networking configurations (e.g., UDP log forwarding). |
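Steps 2 and 3 can be sketched as a single check against the DaemonSet's pod template. The helper name `host_network_enabled` is hypothetical, and the real script may read the field from the pod object instead.

```bash
# Succeed if the DaemonSet's pod template sets hostNetwork to true.
# Arguments: <daemonset-name> [namespace]
host_network_enabled() {
  local ds="$1" ns="${2:-default}" value
  # When hostNetwork is not set, the field is omitted and the output is empty.
  value="$(kubectl get daemonset "$ds" -n "$ns" -o jsonpath='{.spec.template.spec.hostNetwork}')"
  [ "$value" = "true" ]
}
```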
| 264 | + |
| 265 | +--- |
| 266 | + |
| 267 | +## Troubleshooting |
| 268 | + |
| 269 | +### Test Failures |
| 270 | + |
| 271 | +#### "DaemonSet did not become ready within 120s" |
| 272 | + |
| 273 | +**Cause**: Image not available, insufficient resources, or scheduling constraints preventing pod from running. |
| 274 | + |
| 275 | +**Solutions**: |
| 276 | +1. Verify the image is loaded into the cluster. On Minikube: `minikube image ls | grep newrelic-fluentbit`. On k3d, re-run the `k3d image import` step from the setup above. |
| 277 | +2. Check pod events: `kubectl describe pod <pod-name>` |
| 278 | +3. Check pod logs: `kubectl logs <pod-name>` |
| 279 | +4. Verify the cluster has free resources: `kubectl top nodes` (requires metrics-server) |
| 280 | + |
| 281 | +#### "NodeSelector test failing" |
| 282 | + |
| 283 | +**Cause**: Node label not applied or multiple nodes present. |
| 284 | + |
| 285 | +**Solutions**: |
| 286 | +1. Verify node has label: `kubectl get nodes --show-labels` |
| 287 | +2. If multiple nodes, ensure label is on all nodes or adjust test |
| 288 | + |
| 289 | +#### "Tolerations test failing" |
| 290 | + |
| 291 | +**Cause**: Taint not applied or pod doesn't have toleration. |
| 292 | + |
| 293 | +**Solutions**: |
| 294 | +1. Verify node has taint: `kubectl describe node <node-name> | grep Taints` |
| 295 | +2. Check pod tolerations: `kubectl get pod <pod-name> -o yaml | grep -A 10 tolerations` |
| 296 | + |
| 297 | +### Cleanup Issues |
| 298 | + |
| 299 | +If tests fail midway, you may need to manually clean up: |
| 300 | + |
| 301 | +```bash |
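| 302 | +# Delete Helm release (`helm delete` is an alias for `helm uninstall` in Helm 3) |
| 303 | +helm uninstall nr-logging-e2e --namespace default |
| 304 | + |
| 305 | +# Remove node labels (the trailing "-" removes the label) |
| 306 | +kubectl label node --all test-label- |
| 307 | + |
| 308 | +# Remove node taints (the trailing "-" removes the taint) |
| 309 | +kubectl taint node --all test-taint- |
| 310 | + |
| 311 | +# Delete test pods, ignoring the case where the pod no longer exists |
| 312 | +kubectl delete pod test-pod-for-exec --namespace default --ignore-not-found |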
| 313 | +``` |
| 314 | + |
| 315 | +## Integration with CI/CD |
| 316 | + |
| 317 | +These tests are designed to run in GitHub Actions workflows. See `.github/workflows/functional-e2e.yaml` for the workflow configuration. |
| 318 | + |
| 319 | +### Workflow Triggers |
| 320 | + |
| 321 | +- **On PR**: Run against all supported Kubernetes versions |
| 322 | +- **On Push to main**: Run regression tests |
| 323 | +- **Manual**: Via workflow_dispatch |
| 324 | + |
| 325 | +### Kubernetes Version Matrix |
| 326 | + |
| 327 | +Tests run against: |
| 328 | +- v1.34.0 |
| 329 | +- v1.33.0 |
| 330 | +- v1.32.0 |
| 331 | +- v1.31.0 |
| 332 | +- v1.30.0 |
| 333 | + |
| 334 | +## Limitations |
| 335 | + |
| 336 | +### What These Tests Don't Cover |
| 337 | + |
| 338 | +1. **Telemetry Validation**: Tests don't verify that logs reach New Relic (use `newrelic-integration-e2e-action` for that) |
| 339 | +2. **Fluent Bit Functionality**: Tests don't verify log parsing, filtering, or forwarding logic |
| 340 | +3. **Multi-Node Scheduling**: Tests assume a single-node cluster (for example, Minikube or a default k3d cluster) |
| 341 | +4. **Windows DaemonSet**: Tests only cover Linux DaemonSet (Windows requires Windows node) |
| 342 | +5. **Performance**: No load testing or resource usage validation |
| 343 | + |
| 344 | +### Known Issues |
| 345 | + |
| 346 | +- **Image Loading**: Tests require a pre-built image loaded into the local cluster (for example with `minikube image load` or `k3d image import`) rather than pulled from a registry |
| 347 | +- **Single Node**: Tests may not catch multi-node scheduling issues |
| 348 | +- **No Proxy Server**: Tests validate proxy env vars but don't test actual proxy connectivity |
| 349 | + |
| 350 | +## Future Enhancements |
| 351 | + |
| 352 | +1. **Multi-Chart Tests**: Validate global values across multiple charts in nri-bundle |
| 353 | +2. **Real Proxy Test**: Deploy Squid proxy in Minikube and validate connectivity |
| 354 | +3. **Air-Gapped Registry Test**: Deploy local registry and validate image pull |
| 355 | +4. **Windows Support**: Add Windows node to Minikube and test Windows DaemonSet |
| 356 | +5. **Fargate Exclusion Test**: Validate that affinity rules exclude Fargate nodes |
| 357 | +6. **Priority Class Test**: Validate `global.priorityClassName` propagation |
| 358 | +7. **Affinity Test**: Validate `global.affinity` propagation |
| 359 | + |
| 360 | +## Related Documentation |
| 361 | + |
| 362 | +- [Helm Unit Tests](../README.md): Template-level tests using helm-unittest |
| 363 | +- [E2E Telemetry Tests](../../../../e2e/README.md): Telemetry validation tests (if available) |
| 364 | +- [Global Values Specification](../../../../CLAUDE.md): Complete list of 27 global values |
| 365 | + |
| 366 | +## Support |
| 367 | + |
| 368 | +For issues or questions: |
| 369 | +- **GitHub Issues**: [newrelic/helm-charts/issues](https://github.com/newrelic/helm-charts/issues) |
| 370 | +- **Pull Requests**: Contributions welcome! |
| 371 | +- **Documentation**: [New Relic Kubernetes Integration Docs](https://docs.newrelic.com/docs/kubernetes-pixie/kubernetes-integration/get-started/introduction-kubernetes-integration/) |