OpenTelemetry Support in the OpenSearch Ecosystem #223

@vamsimanohar

1. Background

OpenTelemetry (OTel) is an open standard for instrumenting, collecting, and exporting telemetry data (traces, metrics, and logs) from applications and systems. OTel provides vendor-neutral telemetry generation across languages and platforms, with standardized data delivery to backend systems. OTel's scope is limited to data creation and delivery; it does not manage observability backends, storage systems, or data analysis.

1.1 Key Components of the OTel Specification

  • APIs & SDKs: Language-specific implementations providing standardized ways to instrument code with runtime logic for collecting, processing, and exporting telemetry data.

  • Instrumentation Libraries: Pre-built, language-specific libraries that automatically add telemetry to popular frameworks and tools, reducing manual instrumentation requirements.

  • Semantic Conventions: Standard naming rules ensuring consistent terminology across telemetry data (e.g., "http.status_code" for web response codes), making traces, metrics, and logs uniform across applications and tools.

  • Collector: A configurable component that receives, processes, and exports telemetry data. Deployable as an agent or gateway, it supports customization through processors for filtering, transforming, or routing data.

  • OTLP (OpenTelemetry Protocol): A transport protocol defining the encoding and delivery mechanism for telemetry data between sources, intermediate nodes, and backends.

1.2 OpenTelemetry Schema

Telemetry sources and consumers often depend on specific data structures, which makes it hard to evolve telemetry data without breaking compatibility with existing consumers. OpenTelemetry addresses this through telemetry schemas, which define the transformations needed to map data between different versions of the semantic conventions. These schemas let producers and consumers evolve independently, because data can be transformed from one version to another, as shown in the example Telemetry Schema for v1.30.0. The transformation can run in either direction: from a lower version to a higher one, or from a higher version to a lower one.
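A minimal sketch of the bidirectional idea, assuming a schema change can be reduced to a plain old-name to new-name map. The rename used here is the one that appears later in this document; the helper names are hypothetical and not part of any OTel SDK.

```python
from typing import Dict

# Hypothetical excerpt of a schema rename change: old attribute name -> new name.
RENAMES_TO_NEWER: Dict[str, str] = {
    "messaging.consumer_id": "messaging.consumer.id",
}


def apply_renames(attributes: Dict[str, object], renames: Dict[str, str]) -> Dict[str, object]:
    """Return a copy of attributes with keys renamed according to renames."""
    return {renames.get(key, key): value for key, value in attributes.items()}


def invert(renames: Dict[str, str]) -> Dict[str, str]:
    """Reverse a rename map so it converts newer names back to older ones."""
    return {new: old for old, new in renames.items()}


older = {"messaging.consumer_id": "c-42", "http.method": "GET"}
newer = apply_renames(older, RENAMES_TO_NEWER)                # lower -> higher
restored = apply_renames(newer, invert(RENAMES_TO_NEWER))     # higher -> lower
```

Inverting the map only works for one-to-one renames; changes that merge or split attributes would need explicit rules for the downgrade direction.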

1.3 Simple Schema in OpenSearch

The OpenSearch ecosystem has introduced a simple schema for observability that standardizes the storage layer so that data can be analyzed, visualized, and correlated efficiently. This simple schema is currently based on the OTLP data model and OpenTelemetry semantic conventions. It includes:

The future of the simple schema is discussed in the sections below.

2. Goals and Requirements for OTel Integration

2.1 Primary Goals

  • Standards Alignment: Fully embrace OpenTelemetry as the industry standard for observability data, ensuring OpenSearch remains interoperable with the broader observability ecosystem.

  • Schema Evolution Support: Provide mechanisms to handle OpenTelemetry schema changes without breaking existing visualizations or the Dashboards plugin.

  • Semantic-Aware Visualization: Leverage OpenTelemetry's semantic conventions to build rich, meaningful dashboards that understand the context and relationships of telemetry data fields.

  • Seamless Upgrade Path: Enable smooth transitions between OpenSearch versions with minimal disruption to observability workflows during schema evolution.

2.2 Key Requirements

  • Version Support Documentation: Each OpenSearch plugin that integrates with OpenTelemetry must clearly document:

    • Supported OTel schema versions
    • Migration paths between versions
    • Breaking changes
  • Transformation Flexibility: Support in-flight transformation of telemetry data between schema versions.

  • Custom Attributes: Allow OpenSearch users to extend standard storage mappings with domain-specific attributes while maintaining compatibility.

  • Balanced Migration: Support existing workloads while enabling adoption of newer schema versions.

3. Integration Approaches

3.1 Proposed Approach: Version-Aware OTel Integration

This approach uses OpenTelemetry semantic conventions throughout the OpenSearch Observability ecosystem, with ingest-time transformation handling schema evolution.

Architecture Overview

The architecture uses Data Prepper to transform all data to a consistent target OpenTelemetry version before storage:

  1. Telemetry sources publish data via OTLP with potentially varying schema versions
  2. Data Prepper transforms the data to a target OTel version (e.g., 1.30.0)
  3. Transformed data is stored in OpenSearch with a consistent schema
  4. OpenSearch Dashboards queries the standardized data structure

For each OpenSearch Dashboards (OSD) version, the Observability plugin will publish a documented range of OTel versions it supports, based on the fields used in its visualizations and UI components. For example:

| OSD Version | Supported OTel Version Range |
| --- | --- |
| 2.8.0 | 1.15.0 - 1.18.0 |
| 2.9.0 | 1.17.0 - 1.21.0 |
| 3.0.0 | 1.20.0 - 1.25.0 |

When customers upgrade their OpenSearch version, they must also update their Data Prepper configuration to target an OTel version within the supported range for the new OpenSearch version. This maintains compatibility between ingested data and dashboard expectations.
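The pre-upgrade check described above can be sketched as a simple range lookup. This is an illustration only: the ranges are the example values from the table above, and the function names are hypothetical.

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]

# Example ranges copied from the table above; real values would come from
# the published compatibility matrix.
SUPPORTED_OTEL_RANGES: Dict[str, Tuple[str, str]] = {
    "2.8.0": ("1.15.0", "1.18.0"),
    "2.9.0": ("1.17.0", "1.21.0"),
    "3.0.0": ("1.20.0", "1.25.0"),
}


def parse_version(text: str) -> Version:
    """Turn '1.18.0' into (1, 18, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_supported(osd_version: str, target_otel_version: str) -> bool:
    """Check whether the Data Prepper target version is inside the OSD range."""
    low, high = SUPPORTED_OTEL_RANGES[osd_version]
    return parse_version(low) <= parse_version(target_otel_version) <= parse_version(high)
```

Comparing parsed tuples rather than raw strings matters here, because "1.9.0" would otherwise sort after "1.15.0".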

Version Publication and Compatibility Matrix:

  • For each OpenSearch release, the Observability plugin will publish:
    • A minimum supported OTel version
    • A maximum supported OTel version
  • Documentation will include a compatibility matrix showing which OTel versions work with the Observability plugin for each OpenSearch version
  • Migration guides will outline required changes when moving between versions

Ingest-Time Transformation:

  • A schema transformation processor in Data Prepper will:
    • Convert data from source OTel version to target version
    • Support configuration via target_otel_version parameter
    • Handle data with and without explicit schema_url
    • Apply transformations based on the published OpenTelemetry telemetry schemas
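A rough sketch of how such a processor might pick the transformations to apply, assuming the source version is read from the schema_url (OTel schema URLs end in the version, for example https://opentelemetry.io/schemas/1.21.0). The rename table, its version keys, and all function names are placeholders for illustration and do not reflect the real schema history or an existing Data Prepper API. Only the upgrade direction is shown.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, ...]
SCHEMA_URL_PREFIX = "https://opentelemetry.io/schemas/"

# Placeholder rename tables keyed by the version that introduces them.
# The version numbers are illustrative, not the real schema history.
RENAMES_BY_VERSION: Dict[str, Dict[str, str]] = {
    "1.17.0": {"messaging.consumer_id": "messaging.consumer.id"},
    "1.22.0": {"example.old_name": "example.new_name"},
}


def parse_version(text: str) -> Version:
    return tuple(int(part) for part in text.split("."))


def source_version(schema_url: Optional[str], assumed: str) -> str:
    """Read the version from schema_url, or fall back to an assumed version."""
    if schema_url and schema_url.startswith(SCHEMA_URL_PREFIX):
        return schema_url[len(SCHEMA_URL_PREFIX):]
    return assumed


def upgrade(attributes: Dict[str, object], schema_url: Optional[str],
            target_otel_version: str, assumed: str = "1.0.0") -> Dict[str, object]:
    """Apply, in version order, every rename newer than the source and up to the target."""
    current = parse_version(source_version(schema_url, assumed))
    target = parse_version(target_otel_version)
    result = dict(attributes)
    for version in sorted(RENAMES_BY_VERSION, key=parse_version):
        if current < parse_version(version) <= target:
            renames = RENAMES_BY_VERSION[version]
            result = {renames.get(key, key): value for key, value in result.items()}
    return result
```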

Schema Evolution Handling:

  • Field aliases will maintain backward compatibility
  • New indices can be created when major schema changes occur
  • Schema versioning metadata will be stored with each document

Dashboard Compatibility:

  • Dashboards will automatically adapt to schema versions within their supported range

3.2 Alternative Approaches

3.2.1 Canonical Internal Schema

Define a single, stable canonical schema in OpenSearch and map all incoming OTel data into it.

Advantages:

  • Provides consistent queries independent of OTel version changes
  • Simplifies dashboard development with a stable data model
  • Reduces need for field aliases or runtime mapping

Disadvantages:

  • Diverges from industry standard, creating a proprietary schema
  • Limits storage of raw OTel fields that may be valuable to users
  • Requires ongoing maintenance to incorporate new OTel conventions

This approach is not recommended because it creates a new standard with an internal schema, diverging from OTel standards.
In contrast, in the proposed approach the dashboards layer relies on stable fields from the OpenTelemetry conventions that rarely change, so dashboards remain compatible with new OTel versions.

3.2.2 Raw Storage with Version-Aware Queries

Store OTel data in its raw form with the schema_url, handling version differences at query time.

Advantages:

  • Preserves complete fidelity of original telemetry data
  • Supports advanced users who need access to raw attributes
  • Eliminates ingest-time transformation overhead

Disadvantages:

  • Significantly complicates dashboard development with version-aware queries
  • Creates performance overhead from runtime transformations during queries
  • Increases storage requirements with redundant or deprecated fields
  • Makes cross-version correlation more complex

4. OpenTelemetry Storage Strategy

4.1 Design Principles

The OpenSearch storage strategy for OpenTelemetry data follows these key principles:

  • OTLP Alignment: Our storage model aligns with the OpenTelemetry Protocol (OTLP) transport model, utilizing its JSON representation with lowerCamelCase field naming conventions.

  • Semantic Convention Adherence: Field names under attributes, resourceAttributes, and instrumentationScopeAttributes adhere to OpenTelemetry's semantic conventions, following dot notation to maintain a clear namespace hierarchy.

  • Naming Standardization: Custom keys under attributes follow the naming conventions specified by OTel to ensure consistency.

  • Index Organization: We provide recommended index naming conventions for different telemetry signals to simplify data organization and lifecycle management.

This alignment ensures consistency between the OpenSearch storage model and the OTLP transport model, facilitating easy data interchange and reducing the need for complex transformations during data ingestion and retrieval.

4.2 Component Templates for Telemetry Indices

To provide flexibility while maintaining standardization, we suggest using composable index templates. In the example below, otel-core-fields holds the fields required for visualization, otel-http-attributes and otel-k8s-attributes hold the optional HTTP and Kubernetes semantic conventions, and customer-extensions holds customer-specific extensions:

PUT _index_template/otel-traces-template
{
  "index_patterns": ["ss4o_traces-*-*-*"],
  "composed_of": [
    "otel-core-fields",
    "otel-http-attributes",
    "otel-k8s-attributes",
    "customer-extensions"
  ],
  "priority": 100
}

This approach allows:

  • Core fields to be strictly enforced
  • Optional semantic convention categories to be selectively included
  • Custom attributes to be added according to the user's requirements
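To illustrate how the pieces combine, here is a simplified sketch of the merge behavior: component templates are merged in the order listed in composed_of, with later entries overriding earlier ones. The component bodies below are hypothetical, and the merge function only approximates what the cluster does.

```python
from typing import Any, Dict, List


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge override into a copy of base; override wins on conflicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical, heavily shortened component template mappings.
COMPONENTS: Dict[str, Dict[str, Any]] = {
    "otel-core-fields": {
        "properties": {"traceId": {"type": "keyword"}, "spanId": {"type": "keyword"}}
    },
    "otel-http-attributes": {
        "properties": {"attributes": {"properties": {
            "http": {"properties": {"status_code": {"type": "integer"}}}}}}
    },
    "customer-extensions": {
        "properties": {"attributes": {"properties": {"team": {"type": "keyword"}}}}
    },
}


def compose(composed_of: List[str]) -> Dict[str, Any]:
    """Merge the named component mappings in the order given."""
    mappings: Dict[str, Any] = {}
    for name in composed_of:
        mappings = deep_merge(mappings, COMPONENTS[name])
    return mappings


result = compose(["otel-core-fields", "otel-http-attributes", "customer-extensions"])
```

Because later components win, placing customer-extensions last lets users add fields, but it also means a careless extension could override a core mapping; keeping core field names reserved avoids that.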

4.3 Tracing Core Fields Mapping

| Field Path | Type | Properties | Comments |
| --- | --- | --- | --- |
| traceId | keyword | | |
| spanId | keyword | | |
| parentSpanId | keyword | | |
| traceState | keyword | | |
| traceFlags | integer | | |
| kind | keyword | | |
| startTime | date_nanos | | |
| endTime | date_nanos | | |
| durationInNanos | long | | |
| status.code | integer | | |
| status.message | text | | |
| events | nested | timestamp: date_nanos; name: text; attributes: object (dynamic); droppedAttributesCount: integer | |
| droppedEventsCount | integer | | |
| resource.schemaUrl | keyword | | |
| resource.attributes | object | dynamic: true | |
| resource.droppedAttributesCount | integer | | |
| instrumentationScope.name | keyword | | |
| instrumentationScope.version | keyword | | |
| instrumentationScope.schemaUrl | keyword | | |
| instrumentationScope.attributes | object | dynamic: true | |
| links | nested | traceId: keyword; spanId: keyword; traceState: keyword; attributes: object (dynamic); droppedAttributesCount: integer | |
| droppedLinksCount | integer | | |
| attributes | object | dynamic: true | |
| attributes.data_stream.dataset | keyword | | |
| attributes.data_stream.namespace | keyword | | |
| attributes.data_stream.type | keyword | | |
| droppedAttributesCount | integer | | |
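As an illustration, a table like the one above could be turned into a mappings body by expanding dotted field paths into nested properties objects. This sketch covers only a subset of the fields and uses a hypothetical helper; nested and dynamic object fields would need extra handling.

```python
from typing import Any, Dict, List, Tuple

# A subset of the core trace fields from the table above: (field path, type).
CORE_TRACE_FIELDS: List[Tuple[str, str]] = [
    ("traceId", "keyword"),
    ("spanId", "keyword"),
    ("startTime", "date_nanos"),
    ("durationInNanos", "long"),
    ("status.code", "integer"),
    ("status.message", "text"),
    ("resource.schemaUrl", "keyword"),
]


def build_properties(fields: List[Tuple[str, str]]) -> Dict[str, Any]:
    """Expand dotted field paths into nested 'properties' objects."""
    root: Dict[str, Any] = {}
    for path, field_type in fields:
        node = root
        parts = path.split(".")
        # Walk or create the intermediate object levels.
        for part in parts[:-1]:
            node = node.setdefault(part, {"properties": {}})["properties"]
        node[parts[-1]] = {"type": field_type}
    return root


mappings = {"properties": build_properties(CORE_TRACE_FIELDS)}
```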

4.4 Logging Core Fields Mapping

| Field Path | Type | Properties | Comments |
| --- | --- | --- | --- |
| @timestamp | date_nanos | | |
| observedTimestamp | date_nanos | | |
| traceId | keyword | | |
| spanId | keyword | | |
| severity.text | keyword | | |
| severity.number | integer | | |
| body | text | | |
| droppedAttributesCount | integer | | |
| eventName | text | | |
| resource.schemaUrl | keyword | | |
| resource.attributes | object | dynamic: true | |
| instrumentationScope.name | keyword | | |
| instrumentationScope.version | keyword | | |
| instrumentationScope.schemaUrl | keyword | | |
| instrumentationScope.attributes | object | dynamic: true | |
| attributes | object | dynamic: true | |
| attributes.data_stream.dataset | keyword | | |
| attributes.data_stream.namespace | keyword | | |
| attributes.data_stream.type | keyword | | |

4.5 Handling Mapping Explosion

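To prevent mapping explosion, we implement a multi-tier attribute approach. Core attributes get explicit mappings under attributes (two are shown here; further mapped fields follow the same pattern), custom or high-cardinality attributes go to attributes_flat, which is indexed as a single flattened field, and attributes_raw is stored without indexing or doc values:

{
  "mappings": {
    "properties": {
      "attributes": {
        "type": "object",
        "dynamic": true,
        "properties": {
          "http.method": { "type": "keyword" },
          "http.status_code": { "type": "integer" }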
        }
      },
      "attributes_flat": {
        "type": "flat_object"
      },
      "attributes_raw": {
        "type": "keyword",
        "index": false,
        "doc_values": false
      }
    }
  }
}

The Data Prepper configuration routes attributes to the appropriate tier:

processor:
  - otel_attribute_router:
      default_target: "attributes"
      routing_rules:
        - pattern: "custom.*"
          target: "attributes_flat"
        - pattern: "experimental.*"
          target: "attributes_flat"
        - pattern: "high.cardinality.*"
          target: "attributes_flat"

Fields used by the Dashboards plugin should always follow the mappings specified in the otel-core-fields template.
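The routing rules above could behave roughly as follows. This sketch assumes glob-style pattern matching and first-match-wins ordering, neither of which is specified in this proposal; the function names are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Any, Dict, List, Tuple

DEFAULT_TARGET = "attributes"

# (pattern, target) pairs mirroring the routing_rules in the configuration above.
ROUTING_RULES: List[Tuple[str, str]] = [
    ("custom.*", "attributes_flat"),
    ("experimental.*", "attributes_flat"),
    ("high.cardinality.*", "attributes_flat"),
]


def route(key: str) -> str:
    """Return the target field for an attribute key; the first matching rule wins."""
    for pattern, target in ROUTING_RULES:
        if fnmatchcase(key, pattern):
            return target
    return DEFAULT_TARGET


def split_attributes(attributes: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Group attributes by the target field they are routed to."""
    routed: Dict[str, Dict[str, Any]] = {}
    for key, value in attributes.items():
        routed.setdefault(route(key), {})[key] = value
    return routed
```

Keeping the default target as attributes means that well-known semantic convention fields stay in the explicitly mapped tier without needing a rule of their own.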

4.6 Custom Semantic Convention Support

Organizations often extend OpenTelemetry with custom attributes specific to their domain. Our approach supports these custom semantic conventions through:

  1. Index Template Customization:
    • Index templates include extension points for custom attributes
    • Organizations can add their own mapping templates that extend the base OTel templates
  2. Registry Integration:
    • Explore this while integrating opentelemetry-weaver to generate component templates.

4.7 OTel Schema Evolution Reference

The table below summarizes common schema changes in OpenTelemetry and how they're handled:

| Change Type | Description | Handling Strategy | Example |
| --- | --- | --- | --- |
| Field Renames | Attribute names change between versions | Transform at ingest time; use field aliases | messaging.consumer_id → messaging.consumer.id |
| Added Fields | New attributes introduced | Allow through dynamic mapping | New http.request.method_original field |
| Type Changes | Field type modifications | Reindex, because the data type changes | String → integer conversion |
| Deprecations | Fields marked as obsolete | Maintain backward compatibility with aliases | Older net.peer.ip still accessible |
| Namespace Changes | Fields moved to different contexts | Transform at ingest time; use field aliases | DB-specific fields moved to common namespace |

5. Telemetry Schema Version and OpenSearch Upgrade Strategy

5.1 Index Naming Convention

For all OpenTelemetry data, we recommend the following standardized index naming pattern:

ss4o_{type}-{dataset}-{namespace}-{otel_version}

Where:

  • type: Signal type (traces, logs, metrics)
  • dataset: Application or service domain (e.g., web, database, backend)
  • namespace: Environment or tenant (e.g., prod, dev, customer1)
  • otel_version: The target OpenTelemetry version (e.g., v1_18, v1_30)

Examples:

  • ss4o_traces-payment-prod-v1_30
  • ss4o_logs-webserver-staging-v1_18
  • ss4o_metrics-database-dev-v1_25

This naming strategy enables:

  • Clear separation between different telemetry types
  • Isolation of data from different OTel versions
  • Easy implementation of lifecycle policies by signal type
  • Simplified migration during upgrades
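A small sketch of building and parsing names that follow this pattern. It assumes dataset and namespace contain no hyphens and that the version suffix keeps only major and minor, as in the examples above; the helper names are hypothetical.

```python
import re
from typing import NamedTuple


class IndexName(NamedTuple):
    signal_type: str
    dataset: str
    namespace: str
    otel_version: str


# Assumes dataset and namespace contain no '-' characters.
INDEX_PATTERN = re.compile(r"^ss4o_(traces|logs|metrics)-([^-]+)-([^-]+)-(v\d+_\d+)$")


def build_index_name(signal_type: str, dataset: str, namespace: str,
                     target_otel_version: str) -> str:
    """Build an index name from a version such as '1.30.0' (patch is dropped)."""
    major, minor = target_otel_version.split(".")[:2]
    return f"ss4o_{signal_type}-{dataset}-{namespace}-v{major}_{minor}"


def parse_index_name(name: str) -> IndexName:
    """Split an index name back into its parts, or raise ValueError."""
    match = INDEX_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not an ss4o index name: {name!r}")
    return IndexName(*match.groups())
```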

5.2 Data Prepper Schema Version Strategy

  • Maintain a consistent target_otel_version in Data Prepper to ensure reliable data correlation
  • Only update target_otel_version when upgrading OpenSearch to a version requiring newer OTel schema
  • Each OpenSearch release supports specific OTel version ranges:
    • Newer releases support newer OTel versions
    • Backward compatibility is maintained
    • Version requirements and compatibility are documented

Example:

processor:
  - otel_schema_transformer:
      target_otel_version: "1.30.0"  # Keep stable unless upgrade requires change
      store_metadata: true

5.3 OpenSearch Upgrade Strategy

When upgrading OpenSearch with OTel integration, follow this strategy:

  1. Pre-Upgrade Assessment:

    • Check if current target_otel_version is supported in the new OpenSearch version
    • Review compatibility matrix to identify OTel version changes required
  2. If Target OTel Version Change Is Required:

    • Create new indices with the updated naming convention reflecting the new target OTel version
    • Example: ss4o_traces-payment-prod-v1_30 replaces ss4o_traces-payment-prod-v1_18
  3. Field Alias Configuration:

    • Apply field aliases to older indices to maintain compatibility with the new OTel version
    • Create an alias index pattern that spans both old and new indices
    • Example:
      PUT ss4o_traces-payment-prod-v1_18/_mapping
      {
        "properties": {
          "attributes.messaging.consumer.id": {
            "type": "alias",
            "path": "attributes.messaging.consumer_id"
          }
        }
      }
  4. Data Prepper Configuration Update:

    • Update Data Prepper to target the new OTel version
    • Direct new data to the new indices
    • Example:
      processor:
        - otel_schema_transformer:
            target_otel_version: "1.30.0"
            store_metadata: true

      sink:
        - opensearch:
            index: "ss4o_traces-payment-prod-v1_30"
  5. Dashboard Transition:

    • Create index patterns that span both old and new indices
    • Validate dashboards against the new indices
    • Update visualization references as needed
  6. Legacy Data Management:

    • Implement appropriate retention policies for older indices
    • Consider reindexing critical historical data to the new schema if needed
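For step 3, the alias mapping body could be generated from the list of renames between the old and new target versions instead of being written by hand. The sketch below is illustrative: the helper is hypothetical, and it assumes every renamed field lives under the attributes object.

```python
import json
from typing import Dict


def alias_mapping(renames: Dict[str, str], prefix: str = "attributes") -> Dict[str, dict]:
    """Build a _mapping body that exposes each new field name as an alias of the old one."""
    return {
        "properties": {
            f"{prefix}.{new}": {"type": "alias", "path": f"{prefix}.{old}"}
            for old, new in renames.items()
        }
    }


# Old name -> new name, as in the field alias example in step 3.
body = alias_mapping({"messaging.consumer_id": "messaging.consumer.id"})
print(json.dumps(body, indent=2))
```

The printed body matches the request shown in step 3 and would be sent to the older index, for example ss4o_traces-payment-prod-v1_18.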

5.4 Instrumentation Library Upgrades

When telemetry sources (instrumentation libraries) upgrade their OTel version:

  1. Assess Compatibility:

    • Review changes between source OTel version and target OTel version
    • Check if Data Prepper's schema transformation can handle the differences
  2. Gradual Rollout:

    • Roll out instrumentation upgrades incrementally
    • Monitor for transformation errors or missing data

6. Next Steps

6.1 Open Questions

  1. Missing Schema URL Strategy

    For data without a schema URL, we recommend:

    • Default to the highest supported OTel version for the target OpenSearch version
    • Allow explicit configuration in Data Prepper to specify assumed version
    • Store the assumed version in the document metadata for future reference
  2. Performance Optimization

    • To minimize transformation overhead, use the OpenTelemetry Collector to pre-transform high-volume telemetry
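The missing schema URL recommendation could look roughly like this. The assumedOtelVersion field name and the function are hypothetical; the proposal only states that the assumed version should be stored with the document.

```python
from typing import Any, Dict, Optional


def annotate_assumed_version(doc: Dict[str, Any], max_supported: str,
                             configured_default: Optional[str] = None) -> Dict[str, Any]:
    """Return a copy of doc, recording an assumed OTel version when no schema URL exists.

    An explicitly configured default takes priority; otherwise the highest
    supported version for the target OpenSearch release is used.
    """
    annotated = dict(doc)
    schema_url = doc.get("resource", {}).get("schemaUrl")
    if not schema_url:
        # Hypothetical metadata field name.
        annotated["assumedOtelVersion"] = configured_default or max_supported
    return annotated
```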

6.2 Implementation Tasks

| Component | Task | Description |
| --- | --- | --- |
| Data Prepper | Schema transformation processor | Implement converter supporting all OTel versions in supported range |
| Data Prepper | Configuration options | Add target_otel_version parameter and validation logic |
| Data Prepper | Validation mechanisms | Add validation for schema URL and version compatibility |
| OpenSearch Catalog | Version compatibility matrix | Create documentation of supported OTel versions per OpenSearch release |
| OpenSearch Catalog | Index templates | Update index templates to support ss4o naming convention with OTel version |
| Observability Plugin | UI compatibility | Update visualizations to handle version-specific fields |
| Documentation | Migration guides | Create guides for upgrading between versions |
