[Storage] Managed download perf: runtime task per-chunk & separate download work queues#3950
[Storage] Managed download perf: runtime task per-chunk & separate download work queues#3950jaschrep-msft wants to merge 17 commits intoAzure:mainfrom
Conversation
|
Resolved offline.
|
8c2302b to
fa61608
Compare
There was a problem hiding this comment.
Pull request overview
This PR refactors the blob managed download implementation to improve throughput by spawning per-chunk download tasks via azure_core’s runtime abstraction and by separating chunk downloading from resequencing/buffering in the returned stream.
Changes:
- Reworked
partitioned_transfer::download()to schedule chunk downloads as spawned tasks and resequence outputs via an indexed ring buffer. - Added helper utilities for collecting streamed bytes into a pre-allocated buffer and for handling invalid initial range requests.
- Added
async-streamas a dependency to implement the new streaming logic.
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs | Major download pipeline refactor: per-chunk task spawning, bounded resequencing buffer, new helpers. |
| sdk/storage/azure_storage_blob/Cargo.toml | Adds async-stream dependency for the new try_stream! implementation. |
| Cargo.lock | Locks the new async-stream dependency. |
Comments suppressed due to low confidence (1)
sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs:28
- This module relies on
use super::*;to bring in key names likefuture/TryStreamExt(used byfuture::select_allandtry_next) rather than importing them locally. That hidden coupling makes the file fragile ifpartitioned_transfer::mod.rschanges. Consider adding explicit imports here for the items you use to keep the module self-contained.
use futures::TryStream;
use crate::models::http_ranges::ContentRange;
use super::*;
You can also share your feedback on Copilot code review. Take the survey.
sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs
Outdated
Show resolved
Hide resolved
sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs
Outdated
Show resolved
Hide resolved
sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs
Outdated
Show resolved
Hide resolved
sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs
Outdated
Show resolved
Hide resolved
|
Sorry, I clicked the wrong button here. |
LarryOsterman
left a comment
There was a problem hiding this comment.
My biggest concern is the lack of documentation explaining the algorithm for download - this is a complicated bit of code and it was challenging to understand it.
sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs
Outdated
Show resolved
Hide resolved
| let dst = BytesMut::with_capacity(range.len()); | ||
| let response = client.transfer_range(Some(range)).await?; | ||
| response.into_body().collect().await | ||
| collect_into(response.into_body(), dst).await |
There was a problem hiding this comment.
When the core collect_into PR completes, this can be replaced with response.into_body().collect_into(dst).await, I believe.
There was a problem hiding this comment.
That PR is in, I believe this can be replaced with:
response.into_body().collect_into(dst).await?
One minor complication is that the collect_into call can fail if the provided buffer isn't sufficient to hold received chunks. It will fill up to the buffer, and return the actual amount of data received (if the stream ends before the buffer is filled).
If it cannot be replaced, let me know how I can fix the collect_into function to better meet your needs.
sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs
Outdated
Show resolved
Hide resolved
|
@LarryOsterman yeah after some of the generated review comments require even more logic to be added, i have been factoring out several parts of this code to simplify the actual contents of the loop. |
LarryOsterman
left a comment
There was a problem hiding this comment.
This is a significant improvement, thank you very much.
Creating a new base branch to isolate independent changes to download behavior and easily explore their performance differences. Some changes have been made because they are already known to be universally good, but they have not yet been merged into main. - pre-size chunk destination buffer. AsyncResponseBody::collect() is not smart enough to do this. - handle ranged get on empty blob. We gotta do this at some point no matter what, may as well get accurate perf readings with that additional work. - separate functions for analyzing response headers. Moves some bulky checks out of the way of the real download logic. Also good code reuse for alternate download implementations which may be necessary.
Tokio is now a default feature flag of core. Remove the manual specification. Doubles as validating our our out-of-box experience.
5919fe4 to
a5fa059
Compare
Major performance improvements for managed download.
parallelbound.parallelbound.Credit to @nateprewitt's initial implementation I ported over to use the tools available in our dependency chain.