[8.18](backport #46145) [azure monitor] Address wildcard metrics names timegrain issue #46606
Proposed commit message
Main Bug Addressed: Wildcard Search Bug
Issue: #43885
Description: In the buggy scenario, a wildcard search for metrics is configured without a timegrain (a timegrain is the Azure term for the aggregation period). With no timegrain configured, we incorrectly reused the most recently used timegrain when pulling metric data again, so metrics were requested with an incompatible timegrain.
Fix: In this fix, we first grab the smallest available timegrain from the metric availabilities from the Azure API. These timegrains appear to be ordered, ascending, so we use the first one to assign the metric to a group. We then have groups of compatible metrics associated with this timegrain to prepare for the next step. This fix applies to both
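The grouping step described in the fix can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `metricDefinition` and `groupBySmallestTimegrain` are hypothetical names, the timegrains are modeled as plain ISO 8601 duration strings, and the sketch relies on the same assumption noted above, that the availabilities arrive in ascending order.

```go
package main

import "sort"

// metricDefinition is a hypothetical, simplified stand-in for an Azure
// metric definition. Timegrains mirrors the metric availabilities list and
// is assumed to be ordered ascending (e.g. "PT1M" before "PT1H").
type metricDefinition struct {
	Name       string
	Timegrains []string
}

// groupBySmallestTimegrain assigns each metric to the group keyed by its
// first (assumed smallest) timegrain, so that every group only contains
// metrics that can be requested together with that timegrain.
// Metrics without any availability are returned separately as skipped.
func groupBySmallestTimegrain(defs []metricDefinition) (map[string][]string, []string) {
	groups := make(map[string][]string)
	var skipped []string
	for _, d := range defs {
		if len(d.Timegrains) == 0 {
			skipped = append(skipped, d.Name)
			continue
		}
		tg := d.Timegrains[0]
		groups[tg] = append(groups[tg], d.Name)
	}
	// Sort names within each group so the output is deterministic.
	for _, names := range groups {
		sort.Strings(names)
	}
	return groups, skipped
}
```

Each resulting group can then be queried with its own timegrain instead of reusing whichever timegrain was used last.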
Minor Side Bug Addressed: Nil Pointer Dereference
Issue: #43725
Description: At line 322 of beats/x-pack/metricbeat/module/azure/monitor_service.go (as of commit 67e847c), we dereference the resp.Interval pointer to get the interval from the API response.
Fix: Check if the interval is not nil and not empty before continuing to process this data. If it is nil or empty, reject the data as we do when the API call errors. When this happens, we can assume the data is bad. This is because we have also handled the wildcard issue in this PR, so this API error edge case in code should not be hit unless the API is returning bad data.
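A minimal sketch of this kind of guard is shown below. The `response` type, `intervalOf`, and `errMissingInterval` are hypothetical names used only for illustration; the actual check lives in monitor_service.go and operates on the Azure SDK response type.

```go
package main

import (
	"errors"
	"fmt"
)

// response is a hypothetical stand-in for the metrics API response;
// only the Interval pointer matters for this sketch.
type response struct {
	Interval *string
}

// errMissingInterval marks responses that carry no usable interval.
var errMissingInterval = errors.New("response has no interval")

// intervalOf returns the interval only when the pointer is non-nil and the
// value is non-empty. Otherwise it returns an error, so the caller can
// reject the data the same way it does when the API call itself fails.
func intervalOf(resp *response) (string, error) {
	if resp == nil || resp.Interval == nil || *resp.Interval == "" {
		return "", fmt.Errorf("rejecting metric values: %w", errMissingInterval)
	}
	return *resp.Interval, nil
}
```

Checking the pointer before dereferencing turns a potential panic into an ordinary error path.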
Checklist
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc
Disruptive User Impact
None - this only fixes bugs
Author's Checklist
How to test this PR locally
Testing wildcard metrics bug
Unit tests have also been added by this PR, which are automatically run by CI. To test beyond this manually, see below:
Set up the scenario/infrastructure as described in the parent issue: #43885
To test/verify that the 400 error is gone:
Check out this branch and set a breakpoint here before running the scenario in the debugger. Observe that
To test/verify that the number of metric definitions is unaffected:
TestMapMetric, so the below helper log is just an extra piece of verification
main and see that the number of metric definitions is 73 in both cases. Therefore, the number of metric definitions is unaffected.
unique metric definition count
Testing minor nil pointer bug
To confirm that we handle the situation with a nil pointer, one can set up a debugger and set a breakpoint, then force resp.Interval to nil at the first breakpoint in the screenshot. However, as noted in the comments in this code, this should not happen because we have handled the wildcard timegrain config scenario.
Related issues
metrics.name: [*]
) issues #43885
This is an automatic backport of pull request #46145 done by [Mergify](https://mergify.com).