Collector memory increases by about ~20 MB after v0.125.0 release #13014
Comments
#13015 only fixes the case where OTLP is not used.
…13015)

#### Description

Avoid re-creating sampler counters every time we wrap with attributes.

#### Link to tracking issue

Updates #13014

---------

Signed-off-by: Bogdan Drutu <[email protected]>
Co-authored-by: Jade Guiton <[email protected]>
I've filed an issue in the Zap repo to ask if it would be possible to share a sampler core between multiple Zap pipelines (one for each OTel Logger we create), which would be one way to eliminate this issue: uber-go/zap#1498
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [go.opentelemetry.io/collector/component](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` |
| [go.opentelemetry.io/collector/component/componenttest](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v0.125.0` -> `v0.126.0` |
| [go.opentelemetry.io/collector/confmap](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` |
| [go.opentelemetry.io/collector/consumer](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` |
| [go.opentelemetry.io/collector/consumer/consumertest](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v0.125.0` -> `v0.126.0` |
| [go.opentelemetry.io/collector/pdata](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` |
| [go.opentelemetry.io/collector/processor](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v1.31.0` -> `v1.32.0` |
| [go.opentelemetry.io/collector/processor/processortest](https://github.com/open-telemetry/opentelemetry-collector) | require | minor | `v0.125.0` -> `v0.126.0` |

---

### Release Notes

<details>
<summary>open-telemetry/opentelemetry-collector (go.opentelemetry.io/collector/component)</summary>

### [`v1.32.0`](https://github.com/open-telemetry/opentelemetry-collector/blob/HEAD/CHANGELOG.md#v1320v01260)

##### 🛑 Breaking changes 🛑

- `configauth`: Removes deprecated `configauth.Authentication` and `extensionauthtest.NewErrorClient` ([#12992](open-telemetry/opentelemetry-collector#12992))
  The following have been removed:
  - `configauth.Authentication` use `configauth.Config` instead
  - `extensionauthtest.NewErrorClient` use `extensionauthtest.NewErr` instead

##### 💡 Enhancements 💡

- `service`: Replace `go.opentelemetry.io/collector/semconv` usage with `go.opentelemetry.io/otel/semconv` ([#12991](open-telemetry/opentelemetry-collector#12991))
- `confmap`: Update the behavior of the confmap.enableMergeAppendOption feature gate to merge only component lists. ([#12926](open-telemetry/opentelemetry-collector#12926))
- `service`: Add item count metrics defined in Pipeline Component Telemetry RFC ([#12812](open-telemetry/opentelemetry-collector#12812))
  See [Pipeline Component Telemetry RFC](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/rfcs/component-universal-telemetry.md) for more details:
  - `otelcol.receiver.produced.items`
  - `otelcol.processor.consumed.items`
  - `otelcol.processor.produced.items`
  - `otelcol.connector.consumed.items`
  - `otelcol.connector.produced.items`
  - `otelcol.exporter.consumed.items`
- `tls`: Add trusted platform module (TPM) support to TLS authentication. ([#12801](open-telemetry/opentelemetry-collector#12801))
  Now the TLS allows the use of TPM for loading private keys (e.g. in TSS2 format).

##### 🧰 Bug fixes 🧰

- `exporterhelper`: Add validation error for batch config if min_size is greater than queue_size. ([#12948](open-telemetry/opentelemetry-collector#12948))
- `telemetry`: Allocate less memory per component when OTLP exporting of logs is disabled ([#13014](open-telemetry/opentelemetry-collector#13014))
- `confmap`: Use reflect.DeepEqual to avoid panic when confmap.enableMergeAppendOption feature gate is enabled. ([#12932](open-telemetry/opentelemetry-collector#12932))
- `internal telemetry`: Add resource attributes from telemetry.resource to the logger ([#12582](open-telemetry/opentelemetry-collector#12582))
  Resource attributes from telemetry.resource were not added to the internal console logs. Now, they are added to the logger as part of the "resource" field.
- `confighttp and configcompression`: Fix handling of `snappy` content-encoding in a backwards-compatible way ([#10584](open-telemetry/opentelemetry-collector#10584), [#12825](open-telemetry/opentelemetry-collector#12825))
  The collector used the Snappy compression type of "framed" to handle the HTTP content-encoding "snappy". However, this encoding is typically used to indicate the "block" compression variant of "snappy". This change allows the collector to:
  - When receiving a request with encoding 'snappy', the server endpoints will peek at the first bytes of the payload to determine if it is "framed" or "block" snappy, and will decompress accordingly. This is a backwards-compatible change.

  If the feature-gate "confighttp.framedSnappy" is enabled, you'll see new behavior for both client and server:
  - Client compression type "snappy" will now compress to the "block" variant of snappy instead of "framed". Client compression type "x-snappy-framed" will now compress to the "framed" variant of snappy.
  - Servers will accept both "snappy" and "x-snappy-framed" as valid content-encodings.
- `tlsconfig`: Disable TPM tests on MacOS/Darwin ([#12964](open-telemetry/opentelemetry-collector#12964))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://github.com/renovatebot/renovate/discussions) if that's undesired.

---

- [ ] If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).

Reviewed-on: https://gitea.t000-n.de/t.behrendt/tracebasedlogsampler/pulls/13
Co-authored-by: Renovate Bot <[email protected]>
Co-committed-by: Renovate Bot <[email protected]>
There has been no response to the issue on the Zap repo so far, but I had an idea on how to share sampling counters using |
#### Context

PR #12617, which implemented the injection of component-identifying attributes into the `zap.Logger` provided to components, introduced significant additional memory use when the Collector's pipelines contain many components (#13014). This was because we would call `zapcore.NewSamplerWithOptions` to wrap the specialized logger core of each Collector component, which allocates half a megabyte's worth of sampling counters.

This problem was mitigated in #13015 by moving the sampling layer to a different location in the logger core hierarchy. This meant that Collector users who do not export their logs through OTLP and only use stdout-based logs no longer saw the memory increase.

#### Description

This PR aims to provide a better solution to this issue, by using the `reflect` library to clone zap's sampler core and set a new inner core, while reusing the counter allocation. (This may also be "more correct" from a sampling point of view, i.e. we only have one global instance of the counters instead of one for console logs and one for each component's OTLP-exported logs, but I'm not sure if anyone noticed the difference anyway.)

#### Link to tracking issue

Fixes #13014

#### Testing

A new test was added which checks that the log counters are shared between two sampler cores with different attributes.
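To illustrate the idea of reusing the counter allocation, here is a minimal sketch using only the standard library. It does not use zap or `reflect`, and it is not the code from the PR: `samplerCore`, `counterTable`, `newSamplerCore`, `withComponent` and `record` are hypothetical names, and the table dimensions (7 levels, 4096 counters per level) are an assumption about zap's internal layout.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Assumed dimensions of the sampling counter table (not taken from the PR).
const (
	numLevels        = 7
	countersPerLevel = 4096
)

// counter is a hypothetical stand-in for one sampling counter.
type counter struct {
	resetAt atomic.Int64
	count   atomic.Uint64
}

type counterTable [numLevels][countersPerLevel]counter

// samplerCore is a toy model of a sampling core: per-component
// attributes plus a pointer to the counter state.
type samplerCore struct {
	component string
	counts    *counterTable
}

// newSamplerCore allocates a fresh counter table, which is what happens
// when every component gets its own sampler.
func newSamplerCore(component string) *samplerCore {
	return &samplerCore{component: component, counts: new(counterTable)}
}

// withComponent clones the core for another component while reusing
// the existing counter table instead of allocating a new one.
func (s *samplerCore) withComponent(component string) *samplerCore {
	return &samplerCore{component: component, counts: s.counts}
}

// record increments the counter for a level and message bucket and
// returns the updated count.
func (s *samplerCore) record(level int, bucket uint32) uint64 {
	return s.counts[level][bucket%countersPerLevel].count.Add(1)
}

func main() {
	base := newSamplerCore("receiver/otlp")
	clone := base.withComponent("exporter/debug")

	base.record(2, 42)
	fmt.Println(clone.record(2, 42))         // 2: both cores hit the same counter
	fmt.Println(base.counts == clone.counts) // true: one shared allocation
}
```

In this model, adding a component costs one small struct rather than a full counter table, and all components are sampled against the same counts.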
Component(s)
service
What happened?
Describe the bug
We noticed that the collector's memory increased by about 20 MB after the latest release. Here's a graph (plotted using the `otelcol_process_memory_rss` metric) showing the memory of the collector running an older version (v0.120.0) and the newer version:
Steps to reproduce
Upgrade the collector to the latest version
What did you expect to see?
Very minimal increase in memory used by the collector.
What did you see instead?
The memory increased by about 20 MB.
Collector version
v0.125.0
Environment information
Environment
OS: CentOS 7
Compiler (if manually compiled): go 1.23.7
Additional context
We collected heap profiles from both the old and new versions of the collector. Here's the heap profile from the old version:
Here's the heap profile from the new version of the collector:
It looks like the memory increase is caused by a new use of zap somewhere (`go.uber.org/zap/zapcore.newCounters`). Drilling down using the flamegraph shows the call stack:
The call to `zapcore.NewSamplerWithOptions` seems to have been added here: #12617. The ~20 MB increase seems excessive just to maintain some logging-related counters.
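As a rough sanity check on the numbers, the sketch below estimates how large one set of sampling counters could be and how many such sets would add up to 20 MB. The layout is an assumption about zap's internals (7 log levels, 4096 counters per level, each counter holding two 64-bit atomics), and `counterTable`, `tableBytes` and `tablesIn` are hypothetical names used only for this estimate.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// Assumed dimensions of one sampling counter table.
const (
	numLevels        = 7
	countersPerLevel = 4096
)

// counter is a hypothetical stand-in for a single sampling counter.
type counter struct {
	resetAt atomic.Int64
	count   atomic.Uint64
}

type counterTable [numLevels][countersPerLevel]counter

// tableBytes returns the size of one counter table under the assumed layout.
func tableBytes() uintptr {
	return unsafe.Sizeof(counterTable{})
}

// tablesIn returns how many whole counter tables fit into total bytes.
func tablesIn(total uintptr) uintptr {
	return total / tableBytes()
}

func main() {
	fmt.Println(tableBytes())      // 458752 bytes, i.e. 448 KiB per table
	fmt.Println(tablesIn(20 << 20)) // 45 tables fit in 20 MiB
}
```

Under these assumptions, an increase of about 20 MB would correspond to a few dozen separately allocated counter tables, which would be consistent with one allocation per wrapped logger rather than a single shared one.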