Description
Problem Statement
I'm exploring ways to optimize the trace functionality and wanted to gauge the appetite for optimization.
I'm willing to send PRs.
Proposed Solution
There are a few areas I've experimented with in https://github.com/jschaf/observe, ranked roughly by feasibility, impact, and long-term maintenance cost.
- Use atomics for Span.End instead of a mutex. Provides about a 1.5x speedup on the Span.End benchmark. As a side benefit, reduces the Span size by 16 bytes since time.Time takes 24 bytes.
- Inline hex encoding and decoding for TraceID and SpanID. Provides about a 2x speedup for parsing (27 ns to 12 ns) and reduces allocations to zero.
- Zero-allocation state parsing. This is a fair bit more complex than the current parsing but has about the same line count. Provides about a 1.8x speedup and reduces allocations to zero.
- [Still baking] Call syscall.Gettimeofday directly instead of time.Now. Saves about 15 ns compared to 30 ns for time.Now. I suspect there's a bunch of unhandled complexity here. It'd be swell if there was an alternative time library that handled this.
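As a rough illustration of the first two items, here is a minimal sketch. It is not the code from the linked repository or from the SDK. The names `span`, `traceID`, `hexVal`, and `parseTraceID` are hypothetical, and the sketch assumes Go 1.19+ (for `atomic.Int64`), that the end time can be stored as Unix nanoseconds, and that only lowercase hex is accepted.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// span is a hypothetical stand-in for an SDK span, not the real type.
type span struct {
	start time.Time
	// endNanos holds the end time as Unix nanoseconds; 0 means "not ended".
	// An int64 takes 8 bytes, where a time.Time field takes 24.
	endNanos atomic.Int64
}

// End records the end time without taking a mutex. Only the first caller
// wins the compare-and-swap; later calls leave the stored time untouched.
func (s *span) End() bool {
	return s.endNanos.CompareAndSwap(0, time.Now().UnixNano())
}

// EndTime reports the recorded end time, if End has been called.
func (s *span) EndTime() (time.Time, bool) {
	n := s.endNanos.Load()
	if n == 0 {
		return time.Time{}, false
	}
	return time.Unix(0, n), true
}

// traceID is a hypothetical 16-byte trace identifier.
type traceID [16]byte

var errInvalidHex = errors.New("invalid trace ID hex")

// hexVal returns the value of a lowercase hex digit, or 0xff if c is not one.
func hexVal(c byte) byte {
	switch {
	case '0' <= c && c <= '9':
		return c - '0'
	case 'a' <= c && c <= 'f':
		return c - 'a' + 10
	}
	return 0xff
}

// parseTraceID decodes 32 lowercase hex characters directly into the
// fixed-size array, so no intermediate byte slice is allocated.
// It does not reject the all-zero ID; a real parser may want to.
func parseTraceID(s string) (traceID, error) {
	var id traceID
	if len(s) != 2*len(id) {
		return traceID{}, errInvalidHex
	}
	for i := range id {
		hi, lo := hexVal(s[2*i]), hexVal(s[2*i+1])
		if hi|lo > 0x0f { // either digit was invalid (0xff)
			return traceID{}, errInvalidHex
		}
		id[i] = hi<<4 | lo
	}
	return id, nil
}

func main() {
	var s span
	fmt.Println(s.End(), s.End()) // the second End is a no-op

	id, err := parseTraceID("4bf92f3577b34da6a3ce929d0e0e4736")
	fmt.Println(id[0], err)
}
```

The compare-and-swap keeps End idempotent, but it only covers the end timestamp. Any other state that the mutex currently guards would still need its own synchronization.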
Alternatives
Doing nothing is perfectly reasonable.
Prior Art
https://www.pingcap.com/blog/how-we-trace-a-kv-database-with-less-than-5-percent-performance-impact/: They squeeze out performance using time-stamp counters and thread local state, both of which would be difficult to emulate in Go.