OpenTelemetry Collector Memory Issues #7690
-
I have an OpenTelemetry Collector set up to scrape 4 endpoints.
I noticed that the data coming into SkyWalking has been really slow, so I started to investigate, and what I found is that the otel collector pod keeps crashing with an OOMKilled status, which is an out-of-memory issue. So I figured either there is too much for it to collect or it needs more memory. Looking at SkyWalking's k8s monitoring, I can't see a problem there, so I tried a couple of things.
Checking the node resources with `kubectl top nodes` at the time the pods are crashing shows roughly 60% usage, so it has not overshot the memory on the actual machine.

Question: Have you seen this issue before, and/or do you have any advice on how to handle it? Do I have the collector set up with too many jobs, and should I maybe separate them? Thanks
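To be concrete about the kind of setup I mean (a rough sketch with placeholder job names, endpoints and limits, not my exact config), the knobs I'm wondering about are the scrape jobs, the collector's `memory_limiter` processor, and the pod's memory limit:

```yaml
# otel-collector config sketch -- placeholder names and values
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: endpoint-1            # one of the 4 scrape jobs
          scrape_interval: 30s
          static_configs:
            - targets: ['service-1:9100']
        # ...three more jobs like this

processors:
  memory_limiter:                          # starts refusing data instead of OOMing when memory gets tight
    check_interval: 1s
    limit_mib: 400
    spike_limit_mib: 100
  batch: {}

exporters:
  otlp:                                    # placeholder; keep whatever exporter you already send to OAP with
    endpoint: skywalking-oap:11800
    tls:
      insecure: true

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [memory_limiter, batch]
      exporters: [otlp]
```

The pod itself has a memory limit in its Deployment (which is where the OOMKilled comes from), so the other obvious option is just raising `resources.limits.memory`.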
-
I think you should raise this question to the OpenTelemetry community. We don't use any OpenTelemetry-specific features or change any OpenTelemetry code, so I would guess this is a bug in the version you are using.
-
I think OAP doesn't use `OTEL` to receive the `envoy-stats`; it uses `gRPC` directly, and `envoy-stats` has too many metrics. You should filter them, otherwise the collector is very easy to crash. E.g. see the doc: https://skywalking.apache.org/docs/main/latest/en/setup/envoy/metrics_service_setting/#send-envoy-metrics-to-skywalking-with--without-istio