testing infra: introspection & telemetry #589
Description
As part of our e2e testing story, we want to be able to get some introspection on what's going on in each component of radicle-link
so that when we begin to test more complex interactions we can trace what happened along the way to help us debug.
As part of this, we should weigh in on this proposal to decide on what we should do about tracing
and our logging in general.
The overall aim of this introspection is to give us insight into a component, and the design should be informed by our past woes:
- Deadlocks occur when two peers are talking to each other. For these kinds of cases, we would want telemetry on what resources have been handed out and how many are in flight, e.g. `Pooled<Storage>`, `Connection` from/to, what `Upgrade`
- We expect to see gossip messages for a specific `Urn` and `PeerId`, and then we should see replication (or not)
- We expect to see a specific set of peers as connected in the membership
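To make the first point a bit more concrete, here is a rough sketch of what "how many are in flight" telemetry could look like. It uses only the standard library, and `InFlight`, `Lease` and `Snapshot` are hypothetical names for illustration, not existing radicle-link types. The idea is that handing out a resource bumps a counter and dropping the guard decrements it, so a snapshot taken during a suspected deadlock shows what is still held.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical gauge for one kind of resource (e.g. pooled storage handles).
#[derive(Clone, Default)]
pub struct InFlight {
    /// Resources currently handed out and not yet returned.
    current: Arc<AtomicUsize>,
    /// Resources handed out over the lifetime of the gauge.
    total: Arc<AtomicUsize>,
}

/// RAII guard: dropping it marks the resource as returned.
pub struct Lease {
    current: Arc<AtomicUsize>,
}

/// Point-in-time view of the gauge, cheap to copy into a report.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Snapshot {
    pub in_flight: usize,
    pub handed_out: usize,
}

impl InFlight {
    /// Record that a resource was handed out.
    pub fn acquire(&self) -> Lease {
        self.current.fetch_add(1, Ordering::SeqCst);
        self.total.fetch_add(1, Ordering::SeqCst);
        Lease {
            current: Arc::clone(&self.current),
        }
    }

    /// Read both counters without touching them.
    pub fn snapshot(&self) -> Snapshot {
        Snapshot {
            in_flight: self.current.load(Ordering::SeqCst),
            handed_out: self.total.load(Ordering::SeqCst),
        }
    }
}

impl Drop for Lease {
    fn drop(&mut self) {
        self.current.fetch_sub(1, Ordering::SeqCst);
    }
}
```

A pool would keep one `InFlight` per resource kind and store the `Lease` alongside whatever it hands out. Whether this lives in the pool itself or in a wrapper around it is an open question.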
When thinking about the design of this introspection, we don't want to litter the core components with a bunch of I/O. Rather, the core components should calculate the interesting data, and the I/O layer that calls into them should be the one to report it. For example, when replicating, the fetch reports which tips were updated; this can be returned to the next layer up, which ultimately decides whether to report it.
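A minimal sketch of that split, under heavy assumptions: `FetchResult`, `fetch`, `Reporter` and `replicate` are made-up names and signatures for illustration, refs and tips are plain strings, and the "fetch" just diffs two maps rather than talking to a peer. The point is only the shape: the core function returns data, and the layer above chooses what to do with it.

```rust
use std::collections::BTreeMap;

/// Hypothetical fetch outcome: ref name -> new tip.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct FetchResult {
    pub updated_tips: BTreeMap<String, String>,
}

/// Core component: computes what changed, performs no logging or other I/O.
pub fn fetch(
    local: &BTreeMap<String, String>,
    remote: &BTreeMap<String, String>,
) -> FetchResult {
    let updated_tips = remote
        .iter()
        // Keep refs that are new or whose tip differs from the local one.
        .filter(|(name, tip)| local.get(*name) != Some(*tip))
        .map(|(name, tip)| (name.clone(), tip.clone()))
        .collect();
    FetchResult { updated_tips }
}

/// Sink owned by the I/O layer; a test can swap in an in-memory one.
pub trait Reporter {
    fn report(&mut self, event: String);
}

/// In-memory reporter, handy for asserting on events in e2e tests.
impl Reporter for Vec<String> {
    fn report(&mut self, event: String) {
        self.push(event);
    }
}

/// I/O layer: calls the core and decides what gets reported.
pub fn replicate<R: Reporter>(
    local: &BTreeMap<String, String>,
    remote: &BTreeMap<String, String>,
    reporter: &mut R,
) -> FetchResult {
    let result = fetch(local, remote);
    for (name, tip) in &result.updated_tips {
        reporter.report(format!("updated {} -> {}", name, tip));
    }
    result
}
```

In a test, passing a `Vec<String>` as the reporter lets us assert on exactly which tips were reported, while production code could pass something backed by a logging or tracing backend.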
This is all a bit hand-wavy, I know :), so I'd like to flesh these ideas out more here.