
[FEATURE]: Enable trace context propagation for async background tasks in LLM observability #13795

Open
@amybachir

Description

Package Name

No response

Package Version(s)

3.9.1

Describe the goal of the feature

Enable trace context propagation for async background tasks in LLM observability. When using asyncio.create_task() to run LLM evaluations as background tasks, the evaluation spans should remain linked to the parent trace for proper observability correlation.

Is your feature request related to a problem?

Yes. When creating async background tasks for LLM-as-a-judge evaluations, the Datadog trace context is not automatically propagated to the background task. This prevents proper trace linking between the main LLM call and the evaluation LLM call, even when using LLMObs.export_span() and passing the exported span to LLMObs.submit_evaluation_for().

Current behavior:

  • Background evaluation tasks create new, unlinked traces instead of child spans
  • Evaluation metrics cannot be correlated with original LLM requests
  • LLMObs.submit_evaluation_for() works but metrics appear isolated in dashboards
  • No way to maintain trace continuity across asyncio.create_task() boundaries

This blocks proper observability for production LLM services where background evaluations are essential for performance (evaluations take 1-3 seconds and cannot block user responses).

Describe alternatives you've considered

  1. Synchronous evaluation: Unacceptable due to 1-3 second latency impact on user responses
  2. Using LLMObs.export_span(): Doesn't propagate trace context to background tasks
  3. Manual correlation via tags: Loses the benefits of proper trace linking and span relationships
  4. Separate evaluation service: Adds complexity and still doesn't solve trace correlation
  5. Using contextvars manually: Complex and error-prone without official LLMObs support (see the sketch after this list)

None of these alternatives provide the clean trace continuity needed for production observability.
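
For reference, a minimal sketch of the manual propagation attempted for alternative 5. It assumes ddtrace's APM-level tracer.current_trace_context() and tracer.context_provider.activate() behave as documented; even with the APM context re-activated this way, the LLMObs span hierarchy does not reliably link up, which is why it is listed as error-prone above. The names evaluate_with_context and the elided judge call are placeholders:

import asyncio
from ddtrace import tracer

async def evaluate_with_context(parent_ctx, exported_span):
    # Re-activate the captured APM trace context inside the background task,
    # then run the judge LLM and submit the evaluation against exported_span.
    if parent_ctx is not None:
        tracer.context_provider.activate(parent_ctx)
    ...  # judge LLM call + LLMObs.submit_evaluation_for(span=exported_span, ...)

# In the request handler, while the parent LLM span is still active:
parent_ctx = tracer.current_trace_context()  # capture the APM context
# task_span_exported comes from LLMObs.export_span(task_span), see the example below
asyncio.create_task(evaluate_with_context(parent_ctx, task_span_exported))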

Additional context

Real-world production use case: LLM service that processes user requests with primary LLM calls, then runs quality evaluations in background using judge LLMs.

Code example showing the issue:

import asyncio
from ddtrace.llmobs import LLMObs

# Main LLM call with proper tracing (inside an async request handler)
with LLMObs.llm(name="invoke_llm", model_name=model_name, session_id=session_id) as task_span:
    llm_response = await llm_client.call(llm_context)

    # Export the span so the evaluation can reference it
    task_span_exported = LLMObs.export_span(task_span)

    # Background evaluation task - loses the Datadog trace context!
    asyncio.create_task(
        evaluate_prediction_async(span=task_span_exported)
    )

# Inside the background task - this creates a new, unlinked trace
async def evaluate_prediction_async(span, **kwargs):
    with LLMObs.llm(name="eval_llm", model_name=eval_model, session_id=session_id) as eval_span:
        # This span is not linked to the original trace
        LLMObs.submit_evaluation_for(span=span, label="is_correct", value="YES")

Proposed solutions:

  • LLMObs.capture_trace_context() + LLMObs.with_trace_context() to explicitly capture the active trace context and re-activate it inside the background task
  • An enhanced submit_evaluation_for() that accepts propagate_context=True
  • An @LLMObs.preserve_trace_context decorator for background-task coroutines
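
For illustration, a hypothetical usage sketch of the first proposal; capture_trace_context() and with_trace_context() do not exist in ddtrace today and are shown only to make the requested behavior concrete:

# Proposed (hypothetical) API - not part of ddtrace today
trace_ctx = LLMObs.capture_trace_context()  # capture while the parent span is active

async def evaluate_prediction_async(span, trace_ctx=None, **kwargs):
    # Re-activating the captured context would make the eval span a child of the original trace
    with LLMObs.with_trace_context(trace_ctx):
        with LLMObs.llm(name="eval_llm", model_name=eval_model, session_id=session_id) as eval_span:
            LLMObs.submit_evaluation_for(span=span, label="is_correct", value="YES")

asyncio.create_task(
    evaluate_prediction_async(span=task_span_exported, trace_ctx=trace_ctx)
)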

Environment: Python 3.11+, ddtrace with LLM Observability, asyncio + FastAPI

This pattern is becoming standard for performance-critical LLM services that need background evaluation processing.
