
Commit dea0fcb

fix(newrelic-logging): correct logLevel default to enable global.verboseLog inheritance
- Changed fluentBit.logLevel default from "info" to "" (empty)
- This allows global.verboseLog=true to correctly set LOG_LEVEL=debug
- Added clear precedence comments in values.yaml
- Restored functional E2E test files (test-global-values.sh, README.md)

Root cause: values.yaml had logLevel: "info" as default, causing the template's if $logLevel condition to always be true, preventing global.verboseLog from being evaluated.

Test Results: All 107 helm-unittest tests pass

Template validation:
- Default (no settings) → LOG_LEVEL="info"
- global.verboseLog=true → LOG_LEVEL="debug"
- Explicit fluentBit.logLevel → takes precedence
1 parent 175ddd8 commit dea0fcb

3 files changed: +827 -1 lines changed
Lines changed: 371 additions & 0 deletions
@@ -0,0 +1,371 @@
1+
# Functional E2E Tests for Global Value Inheritance
2+
3+
## Overview
4+
5+
This directory contains functional end-to-end (E2E) tests that validate global value inheritance for the newrelic-logging chart in a real Kubernetes environment. Unlike helm-unittest tests (which only validate template rendering), these tests deploy the chart to a Kubernetes cluster and verify that configuration values actually reach the running containers.
6+
7+
**Important**: These tests do NOT validate telemetry (whether data reaches New Relic). They only validate functional configuration propagation. For telemetry validation, see the main repository E2E test suite using `newrelic-integration-e2e-action`.
8+
9+
## What These Tests Validate
10+
11+
The test suite validates that the following global values propagate correctly to the newrelic-logging DaemonSet:
12+
13+
1. **Proxy Configuration** (`global.proxy`)
14+
- HTTP_PROXY and HTTPS_PROXY environment variables are set in Fluent Bit containers
15+
- Local `proxy` value overrides `global.proxy` (precedence validation)
16+
17+
2. **NodeSelector** (`global.nodeSelector`)
18+
- Global node selection constraints apply to DaemonSet pods
19+
- Pods schedule only on nodes matching the selector
20+
21+
3. **Tolerations** (`global.tolerations`)
22+
- Global tolerations allow pods to tolerate node taints
23+
- Pods can schedule on tainted nodes
24+
25+
4. **Custom Registry** (`global.images.registry`)
26+
- Container images pull from custom registry
27+
- Image paths include the custom registry domain
28+
29+
5. **ServiceAccount Annotations** (`global.serviceAccount.annotations`)
30+
- Annotations (e.g., for IRSA, Workload Identity) propagate to ServiceAccount
31+
- Critical for cloud IAM role bindings
32+
33+
6. **Verbose Logging** (`global.verboseLog`)
34+
- `global.verboseLog: true` maps to `LOG_LEVEL=debug` in containers
35+
- Enables debug logging across all components
36+
37+
7. **Host Network** (`global.hostNetwork`)
38+
- `global.hostNetwork: true` enables host network mode for DaemonSet pods
39+
- Required for certain networking configurations
40+
41+
## Test Approach and Limitations
42+
43+
### What These Tests Validate
44+
45+
These tests validate **configuration propagation** - they verify that Helm values reach the Kubernetes pod specifications correctly. The tests:
46+
47+
- ✅ Deploy the chart using Helm with specific global values
48+
- ✅ Wait for pod objects to be created in Kubernetes
49+
- ✅ Inspect pod specifications to verify configuration is present
50+
- ✅ Validate environment variables, nodeSelectors, tolerations, etc.
51+
52+
### What These Tests Do NOT Validate
53+
54+
These tests do NOT validate runtime behavior:
55+
56+
- ❌ Pods do NOT need to become "Ready" (tests don't wait for readiness)
57+
- ❌ Fluent Bit does NOT need to actually run
58+
- ❌ No telemetry validation (data reaching New Relic)
59+
- ❌ No functional testing of log collection
60+
61+
### Rationale
62+
63+
This approach allows fast, lightweight testing of configuration inheritance without requiring:
64+
- Working container images in the test environment
65+
- New Relic account credentials
66+
- Long test execution times (tests complete in ~2-3 minutes vs ~10+ minutes)
67+
68+
The tests use lightweight test images (or image pull policy `Never`) to speed up execution and reduce dependencies. Since we're only validating pod specifications (which Kubernetes populates immediately), we don't need the containers to successfully start.
69+
70+
**For runtime and telemetry validation**, see the main repository E2E test suite using `newrelic-integration-e2e-action`.
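
As a rough sketch of the "wait for pod objects, not readiness" approach described above, a polling helper could look like the following. The function name `wait_for_pod_objects`, the label selector argument, and the timeout values are illustrative assumptions; the actual logic in `test-global-values.sh` may differ.

```bash
# Hypothetical helper: poll until at least one pod object matching a label
# selector exists. Readiness is deliberately not checked.
# $1 = label selector, $2 = namespace (default "default"), $3 = timeout in seconds
wait_for_pod_objects() {
  wfp_selector="$1"
  wfp_namespace="${2:-default}"
  wfp_timeout="${3:-60}"
  wfp_elapsed=0
  while [ "$wfp_elapsed" -lt "$wfp_timeout" ]; do
    # "-o name" prints one line per matching pod, or nothing if none exist yet
    if [ -n "$(kubectl get pods -n "$wfp_namespace" -l "$wfp_selector" -o name 2>/dev/null)" ]; then
      return 0
    fi
    sleep 2
    wfp_elapsed=$((wfp_elapsed + 2))
  done
  return 1
}
```

A caller would invoke it as `wait_for_pod_objects "<label-selector>" default 60` and treat a non-zero return as a failed test.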
71+
72+
## Prerequisites
73+
74+
### Required Tools
75+
76+
- **kubectl**: Kubernetes command-line tool
77+
- **helm**: Helm 3.x
78+
- **Kubernetes cluster**: Minikube, KIND, or any Kubernetes cluster
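
A small pre-flight check can confirm the tools above are on `PATH` before any test runs. This is only a sketch; the helper name `missing_tools` is an assumption and is not taken from the test script.

```bash
# Hypothetical helper: print the name of every tool in the argument list
# that cannot be found on PATH (prints nothing if all are present).
missing_tools() {
  for mt_tool in "$@"; do
    command -v "$mt_tool" >/dev/null 2>&1 || printf '%s\n' "$mt_tool"
  done
}

# Usage (not executed here):
#   missing="$(missing_tools kubectl helm)"
#   [ -z "$missing" ] || { echo "Missing tools: $missing" >&2; exit 1; }
```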
79+
80+
### Cluster Requirements
81+
82+
- Cluster must have at least 1 node
83+
- User must have permissions to:
84+
- Deploy DaemonSets and ServiceAccounts
85+
- Label nodes
86+
- Taint nodes
87+
- Create/delete Helm releases
88+
89+
### Test Environment Setup
90+
91+
The tests are designed to run on local Kubernetes clusters (Minikube, KIND, k3d). Example setup with k3d:
92+
93+
```bash
# Install k3d (lightweight Kubernetes)
brew install k3d # macOS
# OR
# curl -s https://raw.githubusercontent.com/k3d-io/k3d/main/install.sh | bash

# Create a local cluster
k3d cluster create demo

# Verify cluster is running
kubectl get nodes

# Build lightweight test image
docker build -t e2e/newrelic-fluentbit-output:test -f Dockerfile.test .

# Load image into k3d cluster
k3d image import e2e/newrelic-fluentbit-output:test -c demo
```
111+
112+
**Note**: The test image can be any lightweight image (busybox, alpine) since tests only validate pod specifications, not runtime behavior. The chart is configured with `image.pullPolicy=Never` to use locally loaded images.
113+
114+
## Running the Tests
115+
116+
### Run All Tests
117+
118+
```bash
cd charts/newrelic-logging/tests/functional-e2e
./test-global-values.sh
```
122+
123+
### Test Output
124+
125+
The test script provides colored output:
126+
- 🟢 **[INFO]**: General information
127+
- 🟡 **[WARN]**: Warnings (non-fatal)
128+
- 🔴 **[ERROR]**: Errors (fatal)
129+
- **PASS**: Test passed
130+
- **FAIL**: Test failed
131+
132+
Example output:
133+
134+
```
[INFO] Starting functional E2E tests for global value inheritance
[INFO] Using kubectl context: minikube
[INFO] Test 1: Proxy configuration propagation
[INFO] Installing chart with global.proxy...
[INFO] Waiting for DaemonSet to be ready (timeout: 120s)...
[INFO] DaemonSet ready: 1/1 pods
[INFO] ✓ PASS: HTTP_PROXY environment variable set correctly
[INFO] ✓ PASS: HTTPS_PROXY environment variable set correctly
...
========================================
Test Summary
========================================
Tests run: 8
Tests passed: 8
Tests failed: 0
========================================
[INFO] All tests passed!
```
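
Output in this shape could be produced by a few logging and counting helpers. The sketch below is one possible arrangement, not the script's actual code: the function names (`log_info`, `log_error`, `assert_equals`, `print_summary`) and the exact failure message format are assumptions.

```bash
# Hypothetical counters and helpers that yield PASS/FAIL lines and a summary.
TESTS_RUN=0
TESTS_PASSED=0
TESTS_FAILED=0

log_info()  { printf '\033[0;32m[INFO]\033[0m %s\n' "$*"; }
log_error() { printf '\033[0;31m[ERROR]\033[0m %s\n' "$*"; }

# $1 = description, $2 = expected value, $3 = actual value
assert_equals() {
  TESTS_RUN=$((TESTS_RUN + 1))
  if [ "$2" = "$3" ]; then
    TESTS_PASSED=$((TESTS_PASSED + 1))
    log_info "✓ PASS: $1"
  else
    TESTS_FAILED=$((TESTS_FAILED + 1))
    log_error "✗ FAIL: $1 (expected '$2', got '$3')"
  fi
}

# Prints the summary and returns non-zero if any assertion failed.
print_summary() {
  echo "Tests run: $TESTS_RUN"
  echo "Tests passed: $TESTS_PASSED"
  echo "Tests failed: $TESTS_FAILED"
  [ "$TESTS_FAILED" -eq 0 ]
}
```

Returning the failure state from `print_summary` lets the script end with `print_summary || exit 1`, so CI sees a failing exit code.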
153+
154+
### Test Duration
155+
156+
- **Per test**: ~30-60 seconds (Helm install + DaemonSet ready wait)
157+
- **Total suite**: ~5-8 minutes
158+
159+
## Test Details
160+
161+
### Test 1: Proxy Configuration Propagation
162+
163+
**What it tests**: `global.proxy` → `HTTP_PROXY`/`HTTPS_PROXY` environment variables
164+
165+
**Steps**:
166+
1. Install chart with `global.proxy: "http://test-proxy.example.com:3128"`
167+
2. Wait for DaemonSet to be ready
168+
3. Inspect pod spec for HTTP_PROXY and HTTPS_PROXY env vars
169+
4. Validate values match expected proxy URL
170+
171+
**Why it matters**: Corporate environments require proxy configuration for outbound connections.
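
Steps 3 and 4 can be approximated with a `kubectl` JSONPath filter on the pod's environment list. In this sketch the helper name `get_pod_env` and the label selector passed to it are assumptions; only the first container of the first matching pod is inspected.

```bash
# Hypothetical helper: print the value of an env var from the first container
# of the first pod matching a label selector.
# $1 = label selector, $2 = env var name, $3 = namespace (default "default")
get_pod_env() {
  kubectl get pods -n "${3:-default}" -l "$1" \
    -o jsonpath="{.items[0].spec.containers[0].env[?(@.name==\"$2\")].value}"
}

# Usage (not executed here):
#   actual="$(get_pod_env "<label-selector>" HTTP_PROXY)"
#   [ "$actual" = "http://test-proxy.example.com:3128" ]
```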
172+
173+
---
174+
175+
### Test 2: Proxy Override
176+
177+
**What it tests**: Local `proxy` value takes precedence over `global.proxy`
178+
179+
**Steps**:
180+
1. Install chart with both `global.proxy` and `proxy` set
181+
2. Verify HTTP_PROXY uses local `proxy` value (not global)
182+
183+
**Why it matters**: Validates precedence model (local > global > default).
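
The precedence model can be illustrated with a small shell function. It only models the documented rule (local > global > default) and is not the chart's template logic; the name `resolve_proxy` and the assumption that the default is "no proxy" (empty) are illustrative.

```bash
# Hypothetical model of proxy precedence.
# $1 = local `proxy` value, $2 = `global.proxy` value
# Prints the effective proxy, or nothing when neither is set (assumed default).
resolve_proxy() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"
  elif [ -n "$2" ]; then
    printf '%s\n' "$2"
  fi
}
```

With both values set, `resolve_proxy http://local.example.com:3128 http://global.example.com:3128` prints the local URL, which is the outcome Test 2 expects to see in `HTTP_PROXY`.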
184+
185+
---
186+
187+
### Test 3: NodeSelector Propagation
188+
189+
**What it tests**: `global.nodeSelector` applies to DaemonSet pods
190+
191+
**Steps**:
192+
1. Label cluster node with `test-label=true`
193+
2. Install chart with `global.nodeSelector.test-label: "true"`
194+
3. Verify pod has nodeSelector field with correct label
195+
4. Cleanup: Remove node label
196+
197+
**Why it matters**: Enables node targeting for dedicated monitoring nodes.
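
Step 3 could be checked by reading a single key from the pod's `nodeSelector`. The helper below is a sketch with an assumed name and an assumed label selector argument. Note that Helm's `--set` would parse `true` as a boolean, so a string-typed flag such as `--set-string global.nodeSelector.test-label=true` is one way to pass the value.

```bash
# Hypothetical helper: print the nodeSelector value for one label key from
# the first pod matching a label selector.
# $1 = pod label selector, $2 = nodeSelector key, $3 = namespace (default "default")
# Keys containing dots would need each dot escaped with a backslash.
get_pod_node_selector_value() {
  kubectl get pods -n "${3:-default}" -l "$1" \
    -o jsonpath="{.items[0].spec.nodeSelector.$2}"
}

# Usage (not executed here):
#   [ "$(get_pod_node_selector_value "<label-selector>" test-label)" = "true" ]
```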
198+
199+
---
200+
201+
### Test 4: Tolerations Propagation
202+
203+
**What it tests**: `global.tolerations` allows pods to tolerate node taints
204+
205+
**Steps**:
206+
1. Taint cluster node with `test-taint=true:NoSchedule`
207+
2. Install chart with `global.tolerations` matching taint
208+
3. Verify pod schedules despite taint (has toleration in spec)
209+
4. Cleanup: Remove node taint
210+
211+
**Why it matters**: Enables deployment on tainted nodes (e.g., monitoring-dedicated nodes).
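
For step 3, filtering the pod's tolerations by key is more reliable than comparing the whole list, because the DaemonSet controller adds its own tolerations (for example `node.kubernetes.io/not-ready`) to every pod. The helper name and selector argument in this sketch are assumptions.

```bash
# Hypothetical helper: print the effect of the toleration with the given key
# on the first pod matching a label selector (empty if no such toleration).
# $1 = pod label selector, $2 = toleration key, $3 = namespace (default "default")
get_pod_toleration_effect() {
  kubectl get pods -n "${3:-default}" -l "$1" \
    -o jsonpath="{.items[0].spec.tolerations[?(@.key==\"$2\")].effect}"
}

# Usage (not executed here):
#   [ "$(get_pod_toleration_effect "<label-selector>" test-taint)" = "NoSchedule" ]
```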
212+
213+
---
214+
215+
### Test 5: Custom Registry Propagation
216+
217+
**What it tests**: `global.images.registry` changes container image paths
218+
219+
**Steps**:
220+
1. Install chart with `global.images.registry: "custom-registry.example.com"`
221+
2. Inspect pod spec for container image path
222+
3. Verify image path includes custom registry
223+
224+
**Why it matters**: Critical for air-gapped environments and private registries.
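
Step 3 reduces to a prefix check on the image string. A minimal sketch, assuming the registry appears as the leading path segment of the image reference (the function name and the sample image names in the usage note are illustrative):

```bash
# Hypothetical check: succeed if the image reference starts with "<registry>/".
# $1 = image reference from the pod spec, $2 = expected registry
image_uses_registry() {
  case "$1" in
    "$2"/*) return 0 ;;
    *)      return 1 ;;
  esac
}
```

For example, `image_uses_registry "custom-registry.example.com/e2e/newrelic-fluentbit-output:test" "custom-registry.example.com"` succeeds, while the same call with `e2e/newrelic-fluentbit-output:test` as the image fails.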
225+
226+
---
227+
228+
### Test 6: ServiceAccount Annotations Propagation
229+
230+
**What it tests**: `global.serviceAccount.annotations` propagate to ServiceAccount
231+
232+
**Steps**:
233+
1. Install chart with `global.serviceAccount.annotations.eks.amazonaws.com/role-arn`
234+
2. Inspect ServiceAccount for annotation
235+
3. Verify annotation value matches expected
236+
237+
**Why it matters**: Required for IAM roles (AWS IRSA, GCP Workload Identity, Azure Pod Identity).
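
Annotation keys such as `eks.amazonaws.com/role-arn` contain dots, which both Helm's `--set` syntax and `kubectl` JSONPath treat as path separators unless escaped with a backslash. The sketch below shows one way to handle that; the helper names and the ServiceAccount name passed in are assumptions.

```bash
# Hypothetical helper: escape dots so a key can be used as a single path
# segment in `helm --set` or kubectl JSONPath expressions.
escape_dots() {
  printf '%s\n' "$1" | sed 's/\./\\./g'
}

# Hypothetical helper: print one annotation value from a ServiceAccount.
# $1 = ServiceAccount name, $2 = annotation key, $3 = namespace (default "default")
get_sa_annotation() {
  kubectl get serviceaccount "$1" -n "${3:-default}" \
    -o jsonpath="{.metadata.annotations.$(escape_dots "$2")}"
}

# Usage (not executed here; the role ARN is a placeholder):
#   key="$(escape_dots 'eks.amazonaws.com/role-arn')"
#   helm upgrade --install nr-logging-e2e ../.. \
#     --set "global.serviceAccount.annotations.${key}=arn:aws:iam::123456789012:role/example"
```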
238+
239+
---
240+
241+
### Test 7: VerboseLog Propagation
242+
243+
**What it tests**: `global.verboseLog: true` → `LOG_LEVEL=debug`
244+
245+
**Steps**:
246+
1. Install chart with `global.verboseLog: true`
247+
2. Inspect pod spec for LOG_LEVEL env var
248+
3. Verify value is "debug" (not default "info")
249+
250+
**Why it matters**: Enables debug logging for troubleshooting.
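
The precedence described in the commit message (explicit `fluentBit.logLevel`, then `global.verboseLog`, then the `info` fallback) can be modeled in a few lines of shell. This is an illustration of the rule, not the chart's template; the function name is an assumption.

```bash
# Hypothetical model of LOG_LEVEL resolution.
# $1 = fluentBit.logLevel (may be empty), $2 = global.verboseLog ("true" or anything else)
resolve_log_level() {
  if [ -n "$1" ]; then
    printf '%s\n' "$1"      # explicit level always wins
  elif [ "$2" = "true" ]; then
    echo "debug"            # verboseLog applies only when no level is set
  else
    echo "info"             # fallback
  fi
}
```

The model also shows why the old default was a problem: with `logLevel` preset to `info`, `resolve_log_level info true` yields `info`, so `global.verboseLog` never takes effect.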
251+
252+
---
253+
254+
### Test 8: HostNetwork Propagation
255+
256+
**What it tests**: `global.hostNetwork: true` enables host network mode
257+
258+
**Steps**:
259+
1. Install chart with `global.hostNetwork: true`
260+
2. Inspect pod spec for hostNetwork field
261+
3. Verify field is true
262+
263+
**Why it matters**: Required for certain networking configurations (e.g., UDP log forwarding).
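
Steps 2 and 3 amount to reading `spec.hostNetwork` from the pod. When the field is unset, `kubectl` prints nothing for it, so comparing against the literal `true` covers both cases. The helper name and selector argument are assumptions.

```bash
# Hypothetical check: succeed if the first pod matching a label selector
# has hostNetwork set to true.
# $1 = pod label selector, $2 = namespace (default "default")
pod_uses_host_network() {
  [ "$(kubectl get pods -n "${2:-default}" -l "$1" \
        -o jsonpath='{.items[0].spec.hostNetwork}')" = "true" ]
}
```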
264+
265+
---
266+
267+
## Troubleshooting
268+
269+
### Test Failures
270+
271+
#### "DaemonSet did not become ready within 120s"
272+
273+
**Cause**: Image not available, insufficient resources, or scheduling constraints preventing pod from running.
274+
275+
**Solutions**:
276+
1. Verify the image is loaded into the cluster (on Minikube: `minikube image ls | grep newrelic-fluentbit`)
277+
2. Check pod events: `kubectl describe pod <pod-name>`
278+
3. Check pod logs: `kubectl logs <pod-name>`
279+
4. Verify cluster has resources: `kubectl top nodes`
280+
281+
#### "NodeSelector test failing"
282+
283+
**Cause**: Node label not applied or multiple nodes present.
284+
285+
**Solutions**:
286+
1. Verify node has label: `kubectl get nodes --show-labels`
287+
2. If multiple nodes, ensure label is on all nodes or adjust test
288+
289+
#### "Tolerations test failing"
290+
291+
**Cause**: Taint not applied or pod doesn't have toleration.
292+
293+
**Solutions**:
294+
1. Verify node has taint: `kubectl describe node <node-name> | grep Taints`
295+
2. Check pod tolerations: `kubectl get pod <pod-name> -o yaml | grep -A 10 tolerations`
296+
297+
### Cleanup Issues
298+
299+
If tests fail midway, you may need to manually clean up:
300+
301+
```bash
# Delete Helm release
helm delete nr-logging-e2e --namespace default

# Remove node labels
kubectl label node --all test-label-

# Remove node taints
kubectl taint node --all test-taint-

# Delete test pods
kubectl delete pod test-pod-for-exec --namespace default
```
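
To reduce the need for manual cleanup, the same commands could be wrapped in a function registered with `trap`, so they run even when a test aborts midway. This is a sketch under the assumption that the release name, label, and taint match the ones above; whether `test-global-values.sh` already does this is not stated here.

```bash
# Hypothetical cleanup hook: each step ignores errors so that a missing
# release, label, or taint does not stop the remaining steps.
cleanup() {
  helm uninstall nr-logging-e2e --namespace default >/dev/null 2>&1 || true
  kubectl label node --all test-label- >/dev/null 2>&1 || true
  kubectl taint node --all test-taint- >/dev/null 2>&1 || true
}

# Run cleanup whenever the script exits, whether it succeeded or failed.
trap cleanup EXIT
```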
314+
315+
## Integration with CI/CD
316+
317+
These tests are designed to run in GitHub Actions workflows. See `.github/workflows/functional-e2e.yaml` for the workflow configuration.
318+
319+
### Workflow Triggers
320+
321+
- **On PR**: Run against all supported Kubernetes versions
322+
- **On Push to main**: Run regression tests
323+
- **Manual**: Via workflow_dispatch
324+
325+
### Kubernetes Version Matrix
326+
327+
Tests run against:
328+
- v1.34.0
329+
- v1.33.0
330+
- v1.32.0
331+
- v1.31.0
332+
- v1.30.0
333+
334+
## Limitations
335+
336+
### What These Tests Don't Cover
337+
338+
1. **Telemetry Validation**: Tests don't verify that logs reach New Relic (use `newrelic-integration-e2e-action` for that)
339+
2. **Fluent Bit Functionality**: Tests don't verify log parsing, filtering, or forwarding logic
340+
3. **Multi-Node Scheduling**: Tests assume single-node cluster (Minikube)
341+
4. **Windows DaemonSet**: Tests only cover Linux DaemonSet (Windows requires Windows node)
342+
5. **Performance**: No load testing or resource usage validation
343+
344+
### Known Issues
345+
346+
- **Image Loading**: Tests require pre-built image loaded into Minikube (not pulled from registry)
347+
- **Single Node**: Tests may not catch multi-node scheduling issues
348+
- **No Proxy Server**: Tests validate proxy env vars but don't test actual proxy connectivity
349+
350+
## Future Enhancements
351+
352+
1. **Multi-Chart Tests**: Validate global values across multiple charts in nri-bundle
353+
2. **Real Proxy Test**: Deploy Squid proxy in Minikube and validate connectivity
354+
3. **Air-Gapped Registry Test**: Deploy local registry and validate image pull
355+
4. **Windows Support**: Add Windows node to Minikube and test Windows DaemonSet
356+
5. **Fargate Exclusion Test**: Validate that affinity rules exclude Fargate nodes
357+
6. **Priority Class Test**: Validate `global.priorityClassName` propagation
358+
7. **Affinity Test**: Validate `global.affinity` propagation
359+
360+
## Related Documentation
361+
362+
- [Helm Unit Tests](../README.md): Template-level tests using helm-unittest
363+
- [E2E Telemetry Tests](../../../../e2e/README.md): Telemetry validation tests (if available)
364+
- [Global Values Specification](../../../../CLAUDE.md): Complete list of 27 global values
365+
366+
## Support
367+
368+
For issues or questions:
369+
- **GitHub Issues**: [newrelic/helm-charts/issues](https://github.com/newrelic/helm-charts/issues)
370+
- **Pull Requests**: Contributions welcome!
371+
- **Documentation**: [New Relic Kubernetes Integration Docs](https://docs.newrelic.com/docs/kubernetes-pixie/kubernetes-integration/get-started/introduction-kubernetes-integration/)
