
Conversation

zhengkezhou1
Contributor

Which problem is this PR solving?

  • The ch-go library does not support multi-instance writing, which can become a performance bottleneck, so we have switched to clickhouse-go instead.

Description of the changes

  • Write directly using the DB model instead of converting to `proto.Input` as ch-go requires.
  • Removed all ch-go-related dependencies.

How was this change tested?

  • unit tests

Checklist

@zhengkezhou1 zhengkezhou1 requested a review from a team as a code owner May 3, 2025 10:09
@zhengkezhou1 zhengkezhou1 requested a review from albertteoh May 3, 2025 10:09

codecov bot commented May 3, 2025

Codecov Report

❌ Patch coverage is 98.09524% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 96.15%. Comparing base (7d66726) to head (0bfe9f0).
⚠️ Report is 230 commits behind head on main.

Files with missing lines Patch % Lines
...age/v2/clickhouse/tracestore/dbmodel/to_dbmodel.go 97.29% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7093      +/-   ##
==========================================
- Coverage   96.20%   96.15%   -0.06%     
==========================================
  Files         358      358              
  Lines       21596    21300     -296     
==========================================
- Hits        20777    20480     -297     
  Misses        613      613              
- Partials      206      207       +1     
Flag Coverage Δ
badger_v1 9.90% <ø> (ø)
badger_v2 2.05% <ø> (ø)
cassandra-4.x-v1-manual 14.89% <ø> (ø)
cassandra-4.x-v2-auto 2.04% <ø> (ø)
cassandra-4.x-v2-manual 2.04% <ø> (ø)
cassandra-5.x-v1-manual 14.89% <ø> (ø)
cassandra-5.x-v2-auto 2.04% <ø> (ø)
cassandra-5.x-v2-manual 2.04% <ø> (ø)
elasticsearch-6.x-v1 20.23% <ø> (ø)
elasticsearch-7.x-v1 20.31% <ø> (ø)
elasticsearch-8.x-v1 20.49% <ø> (ø)
elasticsearch-8.x-v2 2.05% <ø> (ø)
grpc_v1 11.44% <ø> (ø)
grpc_v2 2.05% <ø> (ø)
kafka-3.x-v1 10.17% <ø> (ø)
kafka-3.x-v2 2.05% <ø> (ø)
memory_v2 2.05% <ø> (ø)
opensearch-1.x-v1 20.36% <ø> (ø)
opensearch-2.x-v1 20.36% <ø> (ø)
opensearch-2.x-v2 2.05% <ø> (ø)
query 2.05% <ø> (ø)
tailsampling-processor 0.55% <ø> (ø)
unittests 94.95% <98.09%> (-0.08%) ⬇️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

TraceState: sp.TraceState().AsRaw(),
Name: sp.Name(),
Kind: sp.Kind().String(),
Duration: sp.EndTimestamp().AsTime(),
Member

if OTEL captures timestamp, why do we capture duration in CH?

Contributor Author

Do you mean we should keep startTime and endTime?

@zhengkezhou1 zhengkezhou1 marked this pull request as draft May 4, 2025 11:23
@zhengkezhou1 zhengkezhou1 reopened this May 4, 2025
@zhengkezhou1 zhengkezhou1 marked this pull request as ready for review May 4, 2025 11:23
The `Value` type here actually refers to the `pdata` data types from the `otel-collector` pipeline. In our architecture, the `value_wrapper` is responsible for wrapping the Protobuf-generated Go structures (the concrete implementation of `pdata`) into the `Value` type. Although `pdata` itself is based on the OTLP specification, encapsulating it into `Value` via the `value_wrapper` creates a higher-level abstraction, which makes storing `Value` directly in ClickHouse difficult. Specifically, when deserializing `Slice` and `Map` data contained within a `Value`, plain JSON cannot natively distinguish whether a number is an integer (`int`) or a floating-point number (`double`), so type information is lost. Directly handling the potentially deeply nested `pdata` structures within a `Value` is also quite complex. Therefore, to preserve accurate and complete type information in ClickHouse, and to handle nested telemetry data effectively, we convert the `pdata` data inside `Value` to the standard `OTLP/JSON` format for storage.
#### Mapping model to DB storage
The table structure is defined as follows:
``` sql
Member

why is it in the README? Don't we need this in the code to be executed against CH? The readme can just refer to that code

Contributor Author

Ok, moved it to schema.tmpl and provided a link.

Comment on lines 47 to 48
SpanAttributesBoolKey Array(String),
SpanAttributesBoolValue Array(Bool),
Member

nit: inaccurate naming, since each field is an array they should have plural names, e.g. SpanAttributesBoolKeys

Member

also, these names will be constantly going back and forth between server and CH, so maybe keep them shorter: SpanAttrBoolKeys

Contributor Author

Improved.

LinksAttributesBytesValues Array(Array(Array(byte))),
```
`TraceID` is actually a fixed-length `[16]byte`, and `SpanID` is a fixed-length `[8]byte`.
While it's possible to store them using `Array(byte)`, using the `FixedString(16)` and `FixedString(8)` types can express the data length more precisely.
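A fixed-size byte array maps naturally onto `FixedString(N)`: the driver expects exactly N raw bytes, which Go can express by converting the array slice to a string. A minimal stdlib sketch (variable names are illustrative only):

```go
package main

import "fmt"

func main() {
	// pdata trace/span IDs are fixed-size byte arrays.
	var traceID [16]byte
	var spanID [8]byte

	// For a FixedString(16)/FixedString(8) column, send exactly that many
	// raw bytes; string(arr[:]) preserves the length.
	t, s := string(traceID[:]), string(spanID[:])
	fmt.Println(len(t), len(s)) // 16 8
}
```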
Member

this comment doesn't reflect the declaration

Contributor Author

Added more context now.

Comment on lines 1 to 13
// Copyright (c) 2025 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package dbmodel

import (
"time"
)

// EventsRow is a collection of Trace.Events fields. It maps one-to-one with the fields in the table structure.
// When writing Trace.Events data, it needs to be first converted to the corresponding EventsRow. When reading, it's first mapped to EventsRow
// and then converted back to a collection of Trace.Events.
type EventsRow struct {
Contributor Author

Usage example: writing traces.

```go
batch, _ := conn.PrepareBatch(context.Background(), "INSERT INTO otel_traces")

traces := dbmodel.ToDBModel(SimpleTraces(2))

for _, trace := range traces {
	resourceRow := trace.Resource
	scopeRow := trace.Scope
	spanRow := trace.Span
	eventsRow := dbmodel.ToEventsRow(trace.Events)
	eventsAllAttrRow := dbmodel.ToAllNestedAttrRow(eventsRow.NestedAttrRow)
	linksRow := dbmodel.ToLinksRow(trace.Links)
	linksAllAttrRow := dbmodel.ToAllNestedAttrRow(linksRow.NestedAttrRow)
	err := batch.Append(
		spanRow.Timestamp,
		spanRow.TraceId,
		spanRow.SpanId,
		spanRow.ParentSpanId,
		spanRow.TraceState,
		spanRow.Name,
		spanRow.Kind,
		spanRow.Duration,
		spanRow.StatusCode,
		spanRow.StatusMessage,
		spanRow.Attributes.BoolKeys,
		spanRow.Attributes.BoolValues,
		spanRow.Attributes.DoubleKeys,
		spanRow.Attributes.DoubleValues,
		spanRow.Attributes.IntKeys,
		spanRow.Attributes.IntValues,
		spanRow.Attributes.StrKeys,
		spanRow.Attributes.StrValues,
		spanRow.Attributes.BytesKeys,
		spanRow.Attributes.BytesValues,

		scopeRow.Name,
		scopeRow.Version,
		scopeRow.Attributes.BoolKeys,
		scopeRow.Attributes.BoolValues,
		scopeRow.Attributes.DoubleKeys,
		scopeRow.Attributes.DoubleValues,
		scopeRow.Attributes.IntKeys,
		scopeRow.Attributes.IntValues,
		scopeRow.Attributes.StrKeys,
		scopeRow.Attributes.StrValues,
		scopeRow.Attributes.BytesKeys,
		scopeRow.Attributes.BytesValues,

		resourceRow.Attributes.BoolKeys,
		resourceRow.Attributes.BoolValues,
		resourceRow.Attributes.DoubleKeys,
		resourceRow.Attributes.DoubleValues,
		resourceRow.Attributes.IntKeys,
		resourceRow.Attributes.IntValues,
		resourceRow.Attributes.StrKeys,
		resourceRow.Attributes.StrValues,
		resourceRow.Attributes.BytesKeys,
		resourceRow.Attributes.BytesValues,

		eventsRow.Names,
		eventsRow.Timestamps,
		eventsAllAttrRow.BoolAttrs,
		eventsAllAttrRow.DoubleAttrs,
		eventsAllAttrRow.IntAttrs,
		eventsAllAttrRow.StrAttrs,
		eventsAllAttrRow.BytesAttrs,

		linksRow.TraceIds,
		linksRow.SpanIds,
		linksRow.TraceStates,
		linksAllAttrRow.BoolAttrs,
		linksAllAttrRow.DoubleAttrs,
		linksAllAttrRow.IntAttrs,
		linksAllAttrRow.StrAttrs,
		linksAllAttrRow.BytesAttrs,
	)
	if err != nil {
		panic(err)
	}
}
batch.Send()
```

Reading traces.

```go
rows, err := conn.Query(context.Background(), "SELECT * FROM otel_traces")
if err != nil {
	panic(err)
}

var result []dbmodel.Trace
for rows.Next() {
	var trace dbmodel.Trace

	var eventsName []string
	var eventsTimestamp []time.Time
	var eventsAllNestedAttrs dbmodel.AllNestedAttrRow
	var linksTraceId [][]byte
	var linksSpanId [][]byte
	var linksTraceStatus []string
	var linksAllNestedAttrs dbmodel.AllNestedAttrRow
	err = rows.Scan(
		&trace.Span.Timestamp,
		&trace.Span.TraceId,
		&trace.Span.SpanId,
		&trace.Span.ParentSpanId,
		&trace.Span.TraceState,
		&trace.Span.Name,
		&trace.Span.Kind,
		&trace.Span.Duration,
		&trace.Span.StatusCode,
		&trace.Span.StatusMessage,
		&trace.Span.Attributes.BoolKeys,
		&trace.Span.Attributes.BoolValues,
		&trace.Span.Attributes.DoubleKeys,
		&trace.Span.Attributes.DoubleValues,
		&trace.Span.Attributes.IntKeys,
		&trace.Span.Attributes.IntValues,
		&trace.Span.Attributes.StrKeys,
		&trace.Span.Attributes.StrValues,
		&trace.Span.Attributes.BytesKeys,
		&trace.Span.Attributes.BytesValues,

		&trace.Scope.Name,
		&trace.Scope.Version,
		&trace.Scope.Attributes.BoolKeys,
		&trace.Scope.Attributes.BoolValues,
		&trace.Scope.Attributes.DoubleKeys,
		&trace.Scope.Attributes.DoubleValues,
		&trace.Scope.Attributes.IntKeys,
		&trace.Scope.Attributes.IntValues,
		&trace.Scope.Attributes.StrKeys,
		&trace.Scope.Attributes.StrValues,
		&trace.Scope.Attributes.BytesKeys,
		&trace.Scope.Attributes.BytesValues,

		&trace.Resource.Attributes.BoolKeys,
		&trace.Resource.Attributes.BoolValues,
		&trace.Resource.Attributes.DoubleKeys,
		&trace.Resource.Attributes.DoubleValues,
		&trace.Resource.Attributes.IntKeys,
		&trace.Resource.Attributes.IntValues,
		&trace.Resource.Attributes.StrKeys,
		&trace.Resource.Attributes.StrValues,
		&trace.Resource.Attributes.BytesKeys,
		&trace.Resource.Attributes.BytesValues,

		&eventsName,
		&eventsTimestamp,
		&eventsAllNestedAttrs.BoolAttrs,
		&eventsAllNestedAttrs.DoubleAttrs,
		&eventsAllNestedAttrs.IntAttrs,
		&eventsAllNestedAttrs.StrAttrs,
		&eventsAllNestedAttrs.BytesAttrs,

		&linksTraceId,
		&linksSpanId,
		&linksTraceStatus,
		&linksAllNestedAttrs.BoolAttrs,
		&linksAllNestedAttrs.DoubleAttrs,
		&linksAllNestedAttrs.IntAttrs,
		&linksAllNestedAttrs.StrAttrs,
		&linksAllNestedAttrs.BytesAttrs,
	)
	if err != nil {
		panic(err)
	}

	events := make([]dbmodel.Event, len(eventsName))
	eventsAttrRow := dbmodel.FromAllNestedAttrRow(eventsAllNestedAttrs)
	for i := 0; i < len(eventsName); i++ {
		events[i].Name = eventsName[i]
		events[i].Timestamp = eventsTimestamp[i]
		events[i].Attributes = dbmodel.FromNestedAttrRow(eventsAttrRow[i])
	}
	trace.Events = events

	links := make([]dbmodel.Link, len(linksTraceId))
	linksAttrRow := dbmodel.FromAllNestedAttrRow(linksAllNestedAttrs)
	for i := 0; i < len(linksTraceId); i++ {
		links[i].TraceId = linksTraceId[i]
		links[i].SpanId = linksSpanId[i]
		links[i].TraceState = linksTraceStatus[i]
		links[i].Attributes = dbmodel.FromNestedAttrRow(linksAttrRow[i])
	}
	trace.Links = links

	result = append(result, trace)
}
```

Member

sorry, but what did you try to convey by this comment?

Contributor Author

I would like to demonstrate how the new file row.go can be used.

Contributor Author

Maybe we can discard this file; it seems unnecessary for now, and this PR is already quite large.

Events Nested(
Name String,
Timestamp DateTime64(9),
BoolAttrs Nested (
Member

can we not use the same pattern for top-level attribute fields?

Contributor Author

Yes. Do you mean using a Nested struct for the resource, scope, and span attributes?

Member

I mean using Nested for each pair of key/value arrays
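In Go terms, a ClickHouse `Nested(Key String, Value Bool)` column is just a pair of parallel arrays that must stay the same length. A minimal sketch of that pattern (type and field names are hypothetical, not the PR's actual model):

```go
package main

import "fmt"

// BoolAttrs mirrors a hypothetical Nested(Key String, Value Bool) column:
// Keys[i] pairs with Values[i], so the slices must stay equal in length.
type BoolAttrs struct {
	Keys   []string
	Values []bool
}

// Put appends a key/value pair, keeping the parallel slices in sync.
func (a *BoolAttrs) Put(k string, v bool) {
	a.Keys = append(a.Keys, k)
	a.Values = append(a.Values, v)
}

func main() {
	var attrs BoolAttrs
	attrs.Put("sampled", true)
	attrs.Put("error", false)
	fmt.Println(attrs.Keys, attrs.Values) // [sampled error] [true false]
}
```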


##### Resource,Scope,Span
`TraceID` is actually a fixed-length `[16]byte`, and `SpanID` is a fixed-length `[8]byte`.
While it's possible to store them using `Array(UInt8)`,
Member

what does this "while" clause refer to? Seems out of place.

Contributor Author

This README contains too many mistakes; I plan to rewrite it.

Signed-off-by: zhengkezhou1 <[email protected]>

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. You may re-open it if you need more time.

@github-actions github-actions bot added the stale The issue/PR has become stale and may be auto-closed label Jul 28, 2025
type Link struct {
TraceId string
SpanId string
ptrace.SpanEvent
Contributor

The Link struct incorrectly embeds ptrace.SpanEvent when it should be using ptrace.SpanLink or no embedding at all. This is conceptually incorrect as links and events represent different concepts in the OpenTelemetry model - links reference other spans while events are timestamped annotations within a span. This embedding could lead to unexpected behavior or confusion when working with the model.

Suggested change:

```diff
- ptrace.SpanEvent
+ ptrace.SpanLink
```

Spotted by Diamond


@github-actions github-actions bot removed the stale The issue/PR has become stale and may be auto-closed label Aug 4, 2025

github-actions bot commented Oct 6, 2025

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. You may re-open it if you need more time.

@github-actions github-actions bot added the stale The issue/PR has become stale and may be auto-closed label Oct 6, 2025

Labels

area/storage performance stale The issue/PR has become stale and may be auto-closed

2 participants