Fix chunk cache key and metadata retrieval #104
…nloaded byte range under a key that doesn't account for this range. When another request is received we get a cache hit even though the requested byte range may differ. Fix this by downloading and caching the entire chunk, then applying the byte range after the download or cache hit. If caching is disabled, honour the byte range in the S3 client download.
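The "apply the byte range after the download or cache hit" step described in this commit message could be sketched roughly as below. This is an illustrative sketch only: `apply_byte_range` and its half-open `Range<usize>` parameter are hypothetical names, not taken from the project's code.

```rust
use std::ops::Range;

/// Hypothetical helper: slice a fully cached (or freshly downloaded) chunk
/// down to the byte range the request asked for.
/// `range` is a half-open byte range; `None` means "the whole chunk".
fn apply_byte_range(chunk: &[u8], range: Option<Range<usize>>) -> Option<&[u8]> {
    match range {
        // No range requested: hand back the entire chunk.
        None => Some(chunk),
        // `slice::get` returns `None` if the range is out of bounds or inverted,
        // so the caller can turn that into a request error instead of panicking.
        Some(r) => chunk.get(r),
    }
}
```

With this shape the cache only ever stores whole chunks, so a single cached entry can serve any byte range of the same object.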
…ache chunks aren't shared between requests of differing byte ranges. Revert the previous fix, which suffers from excessive memory consumption when bombarded by requests for byte ranges from the same very large file. We would need a way to ensure a single download of that file before servicing concurrent requests against it.
1) Only incorporate request fields that are themselves used in the S3 object download.
2) Immediately turn this key into an MD5 hash so we're handling a much shorter string when interacting with the cache.
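A minimal sketch of the two points above, under stated assumptions: the `DownloadFields` struct and its field names are hypothetical, and the standard library's `DefaultHasher` stands in for MD5 only so that the example needs no external crate. `DefaultHasher` output is not guaranteed stable across Rust releases, so an on-disk cache would want a fixed digest such as the MD5 the commit message mentions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical subset of request fields that influence the S3 object download.
/// Fields that only affect post-download processing are deliberately left out.
#[derive(Hash)]
struct DownloadFields<'a> {
    source: &'a str,
    bucket: &'a str,
    object: &'a str,
    offset: Option<u64>,
    size: Option<u64>,
}

/// Derive a short, fixed-length cache key from the download fields.
/// Stand-in for an MD5 digest: 64-bit hash rendered as 16 hex characters.
fn cache_key(fields: &DownloadFields<'_>) -> String {
    let mut hasher = DefaultHasher::new();
    fields.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}
```

Because the byte range (`offset`, `size`) is part of the key, two requests for different ranges of the same object map to different cache entries.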
…essing task to determine if a chunk is cached and, if so, its size. There's an MPSC channel used to buffer all cache write requests to a single task responsible for cache updates, but the update task could run at the same time as request processing tasks read the state. Reading the state file whilst it's being written results in a serde parsing error. Instead of adding some type of thread safety around the load/save of the state file, a simpler solution is to store a chunk's metadata in its own metadata file. This mirrors the chunk get, which ignores the state file and simply retrieves files from disk.
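The per-chunk metadata file idea could look something like the sketch below. Everything here is an assumption made for illustration: the function names, the `.meta` suffix, and the plain-text size format are hypothetical, and the project itself appears to serialise its state with serde.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical naming scheme: metadata lives next to the chunk as `<key>.meta`.
fn meta_path(cache_dir: &Path, key: &str) -> PathBuf {
    cache_dir.join(format!("{}.meta", key))
}

/// Write the chunk and its own metadata file (here just the size in bytes).
fn write_chunk(cache_dir: &Path, key: &str, data: &[u8]) -> io::Result<()> {
    fs::write(cache_dir.join(key), data)?;
    fs::write(meta_path(cache_dir, key), data.len().to_string())
}

/// Return `Some(size)` if the chunk is cached, `None` if it is not.
/// Only this chunk's metadata file is read; no shared state file is involved.
fn chunk_size(cache_dir: &Path, key: &str) -> io::Result<Option<u64>> {
    match fs::read_to_string(meta_path(cache_dir, key)) {
        Ok(text) => text
            .trim()
            .parse::<u64>()
            .map(Some)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}
```

A reader checking one chunk then never touches a file that the cache update task is rewriting for a different chunk.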
sd109
left a comment
Nice, seems like a sensible approach to me.
Do we also need to be worried about the pruning cycle trying to access the cache state file while the cache is being written to? I see we're using load_state in the remove method as well as in various places throughout the prune* methods.
2638d03 to eed4a06
eed4a06 to 64a4e24
…chunks independently: two users requesting the same chunk will each cache it separately, with no sharing involved. This is the fastest way to enable authentication by default.
Fixes for the chunk cache: