Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FEATURE] adding otlp endpoint #7996

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

nicolastakashi
Copy link
Contributor

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification


tenant, err := tenancy.GetTenantFromHTTP(r, h.options.TenantHeader, h.options.DefaultTenantID, h.options.TenantField)
if err != nil {
level.Error(h.logger).Log("msg", "error getting tenant from HTTP", "err", err)

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

Sensitive data returned by HTTP request headers
flows to a logging call.

defer writeGate.Done()
if err != nil {
level.Error(tLogger).Log("err", err, "msg", "internal server error")

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

Sensitive data returned by HTTP request headers
flows to a logging call.
case errBadReplica:
responseStatusCode = http.StatusBadRequest
default:
level.Error(tLogger).Log("err", err, "msg", "internal server error")

Check failure

Code scanning / CodeQL

Clear-text logging of sensitive information High

Sensitive data returned by HTTP request headers
flows to a logging call.
Copy link
Member

@saswatamcode saswatamcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the efforts here! Looks good already. 🙂

I did recommend copying the translator struct over to use thanos protos natively instead of doing a full conversion, but it could be a maintenance burden.

Let's wait for other maintainers to have a look as well.

Couple of comments

pkg/receive/handler_otlp.go Show resolved Hide resolved
pkg/receive/otlptranslator/metrics_to_prw.go Outdated Show resolved Hide resolved
Copy link
Collaborator

@matej-g matej-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for starting this work @nicolastakashi ❤️, I'm adding a couple more suggestions

pkg/receive/handler_otlp.go Outdated Show resolved Hide resolved
pkg/receive/handler_otlp.go Show resolved Hide resolved
pkg/receive/handler_otlp.go Show resolved Hide resolved
// Copyright (c) The Thanos Authors.
// Licensed under the Apache License 2.0.

// Provenance-includes-location: https://github.com/prometheus/prometheus/blob/318d6bc4bfe2d81af88697940740a378b503335a/storage/remote/otlptranslator/prometheusremotewrite/context.go#L18
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my understanding, where are these provenance directives from? What does it mean? That the code is copied from Prometheus repo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I copy the code from prometheus and prometheus copied from OTel 😅

test/e2e/e2ethanos/services.go Outdated Show resolved Hide resolved
Copy link
Member

@saswatamcode saswatamcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! ❤️
This already looks quite good. @nicolastakashi we can probably mark it ready for review, and get others to have a look as well?

cmd/thanos/receive.go Outdated Show resolved Hide resolved
pkg/receive/handler.go Outdated Show resolved Hide resolved
@@ -275,6 +277,18 @@ func NewHandler(logger log.Logger, o *Options) *Handler {
),
)

h.router.Post(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saswatamcode just FYI use this endpoint naming will require users to use the metrics_endpoint instead of endpoint property, maybe worth document it or writing a blog post later about Thanos and OTLP wdyt?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean on otel-collector? I think that would be fine, we can document

@nicolastakashi nicolastakashi marked this pull request as ready for review December 20, 2024 14:06
Copy link
Contributor

@pedro-stanaka pedro-stanaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Thanks for working on this, this makes the project much more relevant in the whole Observability space.

That said, I am not a huge fan of the whole copying files over, but I am okay with it in name of optimization. The code for the handler looks good, I just wanted to see more core reuse in general (but not a problem to solve on this PR). We already have duplication because of Capn Proto ingestion and now also because of OTLP.

Once again, amazing work Nicolas! 💪

Copy link
Collaborator

@matej-g matej-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again @nicolastakashi 🙌 , this is looking better with every pass.

I'm adding couple more suggestions to improve the parameters, I hope that's OK and doesn't add to much effort.

Two more things:

  1. I think it would be good to have a separate documentation section to explain the OTLP endpoint and mainly to explain the behavior of resource attributes and how the parameters affect the ingested metrics / labels.

  2. With regards to the translator code, I'm personally also leaning towards just importing instead of copying and all of the code, and then seeing if we need to copy the code over for further optimization. But it's not a strong position, if people see value in including this optimization from the get-go, I'm not opposed.

cmd/thanos/receive.go Outdated Show resolved Hide resolved
cmd/thanos/receive.go Outdated Show resolved Hide resolved
@nicolastakashi nicolastakashi requested a review from matej-g January 6, 2025 16:06
Copy link
Collaborator

@matej-g matej-g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple more additions to the previous changes related to the parameters, otherwise looks good on my side!

cmd/thanos/receive.go Outdated Show resolved Hide resolved
pkg/receive/handler_otlp.go Outdated Show resolved Hide resolved
saswatamcode
saswatamcode previously approved these changes Jan 8, 2025
Copy link
Member

@saswatamcode saswatamcode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from pending comments + conflict, I think it looks good now!

I think there few things to note/remember as TODOs,

  1. otlptranslator duplication to use zlabels. I think having it here works, but would cause some maintenance burden overall, we can choose to remove later if that were to be the case (or even upstream zlabels)
  2. We are also duplicating a lot of handler code. I had to kind of do the same initially in Support remote write 2.0 on receive #8033 for remote write 2.0. But I think once this is merged, I'll update my PR to make things around handling/replication more generic for rw1, 2.0, otlp and capnproto!
  3. Having a blog post + more docs about this feature once it lands, maybe even with some reference arch :)

}
}

req, err := remote.DecodeOTLPWriteRequest(r)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will be called lots of times - https://github.com/prometheus/prometheus/blob/1ea9b72997a116b4a7f7a22635c16d71dc5ab440/storage/remote/codec.go#L873 maybe this function could accept a []byte so that we could pool this buffer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, makes sense to me.
Do you think we should change upstream before we merge it?
Do you prefer copy this function and maintain it on thanos also?

Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
nicolastakashi and others added 21 commits January 9, 2025 21:02
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Co-authored-by: Saswata Mukherjee <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Co-authored-by: Matej Gera <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Co-authored-by: Matej Gera <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
Signed-off-by: Nicolas Takashi <[email protected]>
@nicolastakashi nicolastakashi force-pushed the chore/adding-otlp-endpoint branch from 14179f7 to eda86c3 Compare January 9, 2025 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants