Proposal: allow instrumentations to record individual multimodal data #4097

@Cirilla-zmh

This proposal is inspired by the design of the completion_hook in the current opentelemetry-util-genai package. That hook allows the entire set of input/output messages to be uploaded to external storage when an LLM span ends, while a reference to that storage is saved in the span attributes. Effectively, this gives us three options for persisting messages: recording them as span attributes, recording them as span events, or packaging and uploading them to external storage. This is a highly flexible and valuable design.

While OpenTelemetry provides a broad range of implementation possibilities, I believe most vendors will only recommend one or two of these approaches—and we are no exception. Our users consistently prefer using span attributes to record all messages, as it is more intuitive and convenient for immediate viewing.

However, when multimodal data is included in these messages, its display and consumption become problematic. If the data is raw, it is typically encoded as Base64 strings embedded directly within the JSON of the messages. This bulky Base64 data is difficult to store and even harder to consume (for instance, for visualization or building vector indices).

Therefore, we propose adding an optional mode to opentelemetry-util-genai that allows for reporting multimodal data independently. In this mode, all raw multimodal data would be uploaded to a designated file system and referenced via a URI path, as illustrated in the bottom half of the image below:

Image
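
To make the proposed mode more concrete, here is a rough sketch of the transformation, not the actual opentelemetry-util-genai API or our prototype. The function name `externalize_blobs`, the message shape (a `parts` list whose inline binary parts have `type: "blob"` and Base64 `content`), and the use of a local directory as the "designated file system" are all assumptions made for illustration.

```python
import base64
import copy
import hashlib
from pathlib import Path


def externalize_blobs(messages, storage_dir):
    """Return a copy of `messages` with inline Base64 blobs replaced by URIs.

    Hypothetical message shape (an assumption, not a confirmed schema):
    each message has a "parts" list, and an inline binary part looks like
    {"type": "blob", "mime_type": "...", "content": "<base64 string>"}.
    """
    storage = Path(storage_dir)
    storage.mkdir(parents=True, exist_ok=True)

    # Work on a deep copy so the caller's messages are left untouched.
    result = copy.deepcopy(messages)
    for message in result:
        for part in message.get("parts", []):
            if part.get("type") != "blob":
                continue  # text and other parts stay inline

            raw = base64.b64decode(part.pop("content"))

            # Content-addressed file name: identical payloads are stored once.
            digest = hashlib.sha256(raw).hexdigest()
            target = storage / digest
            if not target.exists():
                target.write_bytes(raw)

            # Replace the bulky payload with a lightweight reference.
            part["type"] = "uri"
            part["uri"] = target.resolve().as_uri()
    return result
```

With something like this, the messages recorded as span attributes would carry only a short `uri` (plus the original `mime_type`), while the raw bytes live in storage where they can be rendered or indexed directly. A real implementation would presumably upload asynchronously and target object storage rather than a local path.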

There is a prototype implementation in our distro repo, which is already used by many customers.
