-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add NTP synchronization by using Kronos. #54
Conversation
…n a new Opentelemetry Clock and uses it.
init() { | ||
Kronos.Clock.sync() | ||
} | ||
|
||
var now: Date { | ||
return Kronos.Clock.now ?? Date() | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mind this trap: DataDog/dd-sdk-ios#588 - Kronos
requires that both Clock.sync()
and Clock.now
must be called on the main thread (@maxep gave more context here: MobileNativeFoundation/Kronos#85 (comment)). Is this ensured by design in dd-swift-testing
? If not, I think this could be failing if TSAN is enabled in tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, actually only Clock.now
should be invoked from the main thread because all sync callbacks runs on CFRunLoopGetMain
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, thanks a lot. Definitely needs another implementation then.
@@ -54,6 +54,7 @@ internal class DDTracer { | |||
tracerProvider.updateActiveSampler(Samplers.alwaysOn) | |||
let spanLimits = tracerProvider.getActiveSpanLimits().settingAttributeCountLimit(1024) | |||
tracerProvider.updateActiveSpanLimits(spanLimits) | |||
tracerProvider.updateActiveClock(NTPClock()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have enough visibility on how otel-swift
uses the Clock.now
, but one thing we had to mitigate in dd-sdk-ios
was Kronos.Clock.now
being not monotonic until .sync()
resolves all calls to the NTP pool.
In other words, this:
let startTime = Kronos.Clock.now
Thread.sleep(0.1)
let delta = Kronos.Clock.now - startTime
might give negative delta
for the first attempts when NTP sync is being performed and device time is ahead of the NTP time.
When browsing this (start) and this (end) code in otel-swift
, I'm not sure if this is not the case here. Is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
otel-swift
uses a monotonic clock for each span, which is initialized with the value of the active clock (Kronos clock in this case) that is only called once per span at the start, so durations should never be negative. But those spans can be started from any thread, so I cannot use the Kronos.Clock.now
directly, I must evaluate if performance will suffer much from calling it in the main thread since I only call it once per span
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I will start with the simplest implementation, in my framework most of the spans will start at main thread (XCTest observer), and performance should not be so critical
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's definitely the easiest solution 👍, will work if you can make this threading assumption. Otherwise, we solved this (in DataDog/dd-sdk-ios#607) by using our ValuePublisher
which uses DispatchQueue
. This comes with lock penalty, which we mitigate by using concurrent queue with .barrier
for making (rare) writes exclusive and allowing (frequent) simultaneous reads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I find my solution performance is not good enough I will try to change to your solution, its way better than mine.
What and why?
Use a NTP synchronized clock for reporting the tests instead of default system time.
How?
It creates a new Opentelemetry clock and wraps a Kronos clock with it returning own Kronos result, it is then configured in Opentelemetry to be used as the reference clock
Review checklist