Skip to content

[consumererror] Add OTLP-centric error type #13042

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 10 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 27 additions & 0 deletions .chloggen/consumererror-otlp-type.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. otlpreceiver)
component: consumererror

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Add `Error` type

# One or more tracking issues or pull requests related to the change
issues: [7047]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext: |
This type can contain information about errors that allow components (e.g. exporters)
to communicate error information back up the pipeline.

# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [api]
131 changes: 131 additions & 0 deletions consumer/consumererror/error.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,131 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

package consumererror // import "go.opentelemetry.io/collector/consumer/consumererror"

import (
"net/http"

"google.golang.org/grpc/codes"
"google.golang.org/grpc/status"

"go.opentelemetry.io/collector/consumer/consumererror/internal/statusconversion"
)

// Error is intended to be used to encapsulate various information that can add
// context to an error that occurred within a pipeline component. Error objects
// are constructed through calling `New` with the relevant options to capture
// data around the error that occurred.
//
// It may hold multiple errors from downstream components, and can be merged
// with other errors as it travels upstream using `Combine`. The `Error` should
// be obtained from a given `error` object using `errors.As`.
//
// Experimental: This API is at the early stage of development and may change
// without backward compatibility
type Error struct {
error
httpStatus int
grpcStatus *status.Status
retryable bool
}

var _ error = (*Error)(nil)

// NewOTLPHTTPError records an HTTP status code that was received from a server
// during data submission.
func NewOTLPHTTPError(origErr error, httpStatus int) error {
return &Error{error: origErr, httpStatus: httpStatus}
}

// NewOTLPGRPCError records a gRPC status code that was received from a server
// during data submission.
func NewOTLPGRPCError(origErr error, status *status.Status) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an "original" error in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is, you can see an example from the OTLP exporter here.

I'd like to continue to wrap downstream errors even if calls to NewOTLPGRPCError will generally be made in exporters. We could call status.FromError in this function, but then we end up in the awkward case where the code in the *status.Status object is okay and there's no easy way in the exporter to tell it wasn't an error. As a result I think the best we can do is just accept both, even if it makes the function signature slightly awkward.

There's some symmetry with the function signature for NewOTLPHTTPError, which I think makes up for it a bit.

return &Error{error: origErr, grpcStatus: status}
}

// NewRetryableError records that this error is retryable according to OTLP specification.
func NewRetryableError(origErr error) error {
return &Error{error: origErr, retryable: true}
}

// Error implements the error interface.
func (e *Error) Error() string {
return e.error.Error()
}

// Unwrap returns the wrapped error for use by `errors.Is` and `errors.As`.
func (e *Error) Unwrap() error {
return e.error
}

// OTLPHTTPStatus returns an HTTP status code either directly set by the source,
// derived from a gRPC status code set by the source, or derived from Retryable.
// When deriving the value, the OTLP specification is used to map to HTTP.
// See https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md for more details.
//
// If a http status code cannot be derived from these three sources then 500 is returned.
//
// Experimental: This API is at the early stage of development and may change
// without backward compatibility
func (e *Error) OTLPHTTPStatus() int {
if e.httpStatus != 0 {
return e.httpStatus
}
if e.grpcStatus != nil {
return statusconversion.GetHTTPStatusCodeFromStatus(e.grpcStatus)
}
if e.retryable {
return http.StatusServiceUnavailable
}
return http.StatusInternalServerError
}

// OTLPGRPCStatus returns an gRPC status code either directly set by the source,
// derived from an HTTP status code set by the source, or derived from Retryable.
// When deriving the value, the OTLP specification is used to map to GRPC.
// See https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md for more details.
//
// If a grpc code cannot be derived from these three sources then INTERNAL is returned.
//
// Experimental: This API is at the early stage of development and may change
// without backward compatibility
func (e *Error) OTLPGRPCStatus() *status.Status {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would actually change the signature to something like:

func ToGRPCStatus(e error) *status.Status;

This way:

  • Can handle the "error.As" part.
  • Can handle extra details like retry-after you proposed with the constructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense to me. Receivers will find a function like this very ergonomic if they're only concerned about transmitting the original status code, and we can always duplicate the functionality in a method later. Added.

if e.grpcStatus != nil {
return e.grpcStatus
}
if e.httpStatus != 0 {
return statusconversion.NewStatusFromMsgAndHTTPCode(e.Error(), e.httpStatus)
}
if e.retryable {
return status.New(codes.Unavailable, e.Error())
}
return status.New(codes.Unknown, e.Error())
}

// Retryable returns true if the error was created with the WithRetryable set to true,
// if the http status code is retryable according to OTLP,
// or if the grpc status is retryable according to OTLP.
// Otherwise, returns false.
//
// See https://github.com/open-telemetry/opentelemetry-proto/blob/main/docs/specification.md for retryable
// http and grpc codes.
//
// Experimental: This API is at the early stage of development and may change
// without backward compatibility
func (e *Error) Retryable() bool {
if e.retryable {
return true
}
switch e.httpStatus {
case http.StatusTooManyRequests, http.StatusBadGateway, http.StatusServiceUnavailable, http.StatusGatewayTimeout:
return true
}
if e.grpcStatus != nil {
switch e.grpcStatus.Code() {
case codes.Canceled, codes.DeadlineExceeded, codes.Aborted, codes.OutOfRange, codes.Unavailable, codes.DataLoss:
return true
}
}
return false
}
Loading
Loading