Use information from the cache to help speed up looking up directory subspaces #107

KrzysFR · 2020-05-24T12:17:32Z

Right now, calling "Resolve" on a subspace location will, by default, use a cached directory subspace (and set a value-check to protect the transaction).

When the caller wants to make sure the directory is valid, calling OpenAsync/TryOpenAsync on the directory issues sequential reads to the cluster, which may take a while for deeply nested paths.

If the directory is already in the cache, we may reuse the list of key/value pairs used to set up the value-checks as a quick way to open the directory subspace: read all the keys at once (via GetValues) and check the results.

If the cached data is still valid, then we read all the nodes from the root to the leaf in a single network hop.
If the cached data is stale, then we only "waste" a single network hop. As a bonus, we may be able to detect which nodes are still valid, and only remove and re-read the failed nodes?
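
A minimal sketch of that batched check, written in Python as a language-neutral simulation rather than against the actual client API. The `Store` dict, `get_values`, and `first_stale_node` are hypothetical names invented for illustration. `get_values` only stands in for a single batched read such as GetValues. The cached entries are assumed to be ordered from root to leaf.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for the cluster: a plain in-memory key -> value map.
Store = Dict[bytes, bytes]


def get_values(store: Store, keys: List[bytes]) -> List[Optional[bytes]]:
    """Simulate one batched read (a single round-trip) for all the keys."""
    return [store.get(key) for key in keys]


def first_stale_node(store: Store, cached: List[Tuple[bytes, bytes]]) -> Optional[int]:
    """Compare cached (key, expected value) pairs, ordered root to leaf,
    against the store using one batched read.

    Returns the index of the first node whose value is missing or differs,
    or None if every cached node is still valid.
    """
    actual = get_values(store, [key for key, _ in cached])
    for index, ((_, expected), got) in enumerate(zip(cached, actual)):
        if got != expected:
            return index
    return None
```

If the function returns `None`, the cached subspace can be used as-is after a single hop. If it returns an index, one possible policy (an assumption, not something the issue settles) is to keep the nodes above that index and re-read from that node down to the leaf.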

If this works well, it means we could end up with 3 different "performance modes":

Fully cached: protected with value-checks. Code that uses this mode needs to be aware that it could observe corrupted data during the first attempt, and will be retried at least once. In this mode, the read I/O of the value-checks is merged with the read I/O of the transaction.
Quick check using the cache: code is assured that the subspace returned is valid, but it has to wait one network round-trip (if the cache is still valid), or up to N round-trips for a path of depth N (if the cache is invalid). The downside is that opening multiple directories in separate calls will not be able to merge the latency into one round-trip!
Slow path: opening a directory subspace always incurs the full cost, on every call.

KrzysFR added enhancement layer:directory Directory Subspace Layer labels May 24, 2020
