[tempo] gRPC Connection Issues with StreamingQuerier Service #3761

Open
@rgarcia89

https://github.com/grafana/helm-charts/blob/99025e8dc6291d2ca5b276c6647a552ad6cc1e50/charts/tempo/values.yaml#L220C11-L220C16

Summary

I am seeing connection errors between Tempo components that appear to come from a protocol mismatch and a service registration problem. The current error is:

rpc error: code = Unimplemented desc = unknown service tempopb.StreamingQuerier

Before I changed the configuration, the error was:

rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: http2: failed reading the frame payload: %!w(<nil>), note that the frame header looked like an HTTP/1.1 header"
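
For triage, the two messages above can be split into their gRPC status code and description. The following is a minimal sketch, not part of Tempo or Grafana: the `triage` helper and the hint texts are hypothetical, and the hints are a rough reading of the general gRPC status code definitions rather than anything Tempo-specific.

```python
import re

# Matches the text layout of the two errors quoted above:
# "rpc error: code = <Code> desc = <description>".
_RPC_ERROR = re.compile(r"rpc error: code = (?P<code>\w+) desc = (?P<desc>.*)", re.DOTALL)

# Assumption: generic interpretation of these codes, for orientation only.
_HINTS = {
    "Unimplemented": "a gRPC server answered, but it does not expose the requested service or method",
    "Unavailable": "the call usually failed at the transport level before any service was reached",
}


def triage(message: str) -> dict:
    """Split a gRPC error string into its code and description, plus a hint."""
    match = _RPC_ERROR.search(message)
    if match is None:
        return {"code": None, "desc": message, "hint": "not a gRPC status error"}
    code = match.group("code")
    return {
        "code": code,
        "desc": match.group("desc"),
        "hint": _HINTS.get(code, "no hint for this code"),
    }


if __name__ == "__main__":
    print(triage("rpc error: code = Unimplemented desc = unknown service tempopb.StreamingQuerier"))
```

Read this way, the change from `Unavailable` to `Unimplemented` would mean the client now reaches something that speaks gRPC, but that endpoint does not serve `tempopb.StreamingQuerier`.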

Environment

  • Kubernetes deployment using Helm chart
  • Tempo version: 2.8.0
  • Helm chart version: 1.23.1

Investigation Details

Port Configuration Issues

  • Service exposes ports 16686/16687 for Jaeger UI/metrics
  • netstat inside the container shows port 7777 listening (no PID is shown for it)
  • Port 9095 is also listening (gRPC)
  • Service definition maps to ports 16686/16687, not to 7777

Protocol Mismatch

  • Initial error indicated HTTP/1.1 vs HTTP/2 protocol mismatch
  • Added stream_over_http_enabled: true at root level of config
  • Now getting "unknown service tempopb.StreamingQuerier" error

Connection Tests

  • Direct curl to port 7777 returns "Received HTTP/0.9 when not allowed"
  • This suggests that port 7777 is a gRPC (HTTP/2) endpoint being accessed with a plain HTTP/1.x client
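
To check which protocol a port actually speaks without relying on curl, one option is to send the HTTP/2 client preface and look at the first bytes of the reply. This is a minimal sketch using only the Python standard library. The `classify_reply` and `probe` helpers are hypothetical names, the host in the usage note is an assumption (for example a local port-forward), and the check only inspects the first frame header, so treat the result as a hint rather than proof.

```python
import socket

# Fixed 24-byte string every HTTP/2 client sends first (RFC 9113, section 3.4).
H2_CLIENT_PREFACE = b"PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"
# An empty SETTINGS frame: length 0, type 0x4, flags 0, stream id 0.
EMPTY_SETTINGS = b"\x00\x00\x00\x04\x00\x00\x00\x00\x00"


def classify_reply(data: bytes) -> str:
    """Guess the protocol from the first bytes a server sent back."""
    if not data:
        return "no reply"
    if data.startswith(b"HTTP/1."):
        # A plain HTTP/1.x server answered the preface with a status line.
        return "http/1.x"
    # An HTTP/2 frame header is 9 bytes: 3-byte length, 1-byte type,
    # 1-byte flags, 4-byte stream id. The server preface must start with
    # a SETTINGS frame (type 0x4) on stream 0.
    if len(data) >= 9 and data[3] == 0x04 and data[5:9] == b"\x00\x00\x00\x00":
        return "http/2"
    return "unknown"


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Connect, send the HTTP/2 preface, and classify whatever comes back."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(H2_CLIENT_PREFACE + EMPTY_SETTINGS)
        try:
            data = sock.recv(64)
        except socket.timeout:
            return "no reply"
    return classify_reply(data)
```

Assuming the ports are reachable locally, `probe("localhost", 7777)` and `probe("localhost", 9095)` could be compared against the port the failing client is configured to use. An `http/1.x` result on that port would be consistent with the earlier "frame header looked like an HTTP/1.1 header" error.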

Questions

  1. Why is the service mapping to ports 16686/16687 when the application listens on 7777?
  2. Why isn't the StreamingQuerier service registered on the expected endpoint?
  3. Is there a configuration issue with the query frontend component?

Attempted Solutions

  1. Added stream_over_http_enabled: true to root config
  2. Verified port configurations and service mappings
  3. Checked for proper component registration

Impact

Trace searches fail, which produces errors in the Grafana logs and may disrupt the service.

Additional Context

The issue appears to be related to how the Tempo components are configured to communicate with each other, specifically around the query frontend and querier components.
