-
Notifications
You must be signed in to change notification settings - Fork 847
Description
This proposal is inspired by the design of the completion_hook in the current opentelemetry-util-genai package. It allows for the entire input/output messages to be uploaded to external storage when an LLM span ends, while saving a reference to that storage within the span attributes. Effectively, this gives us three options for persisting messages: recording them as span attributes, recording them as span events, or packaging and uploading them to external storage. This is a highly flexible and valuable design.
While OpenTelemetry provides a broad range of implementation possibilities, I believe most vendors will only recommend one or two of these approaches—and we are no exception. Our users consistently prefer using span attributes to record all messages, as it is more intuitive and convenient for immediate viewing.
However, when multimodal data is included in these messages, its display and consumption become problematic. If the data is raw, it is typically encoded as Base64 strings embedded directly within the json of messages. This bulky Base64 data is difficult to store and even harder to consume (for instance, for visualization or building vector indices).
Therefore, we propose adding an optional mode to opentelemetry-util-genai that allows for reporting multimodal data independently. In this mode, all raw multimodal data would be uploaded to a designated file system and referenced via a URI path, as illustrated in the bottom half of the image below:
There're a prototype implementation in our distro repo, which is used by many customers.