
[Storage] Managed download perf: runtime task per-chunk & separate download work queues (#3950)

Open
jaschrep-msft wants to merge 17 commits into Azure:main from jaschrep-msft:separate-download-work-queues

Conversation

@jaschrep-msft
Member

Major performance improvements for managed download.

  • Uses Core's runtime abstraction to spawn workers for each individual chunk of the download.
    • If the download is a one-shot, does not spawn any workers; it runs async on whatever task the caller invoked the download from.
  • Separates the work of downloading chunks from resequencing chunks to return in the overall download stream.
    • Active chunk downloads capped by existing parallel bound.
    • Buffers waiting to be re-sequenced capped at 2x parallel bound.
  • Completed chunks stored in ring buffer waiting to be returned in overall stream.
  • Download tasks tagged with index for resequencing, allowing them to be placed in the correct position in the ring buffer.
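The scheme in the bullets above — completed chunks tagged with their index and held in a bounded ring buffer until they can be emitted in order — can be sketched as a simplified, synchronous model. The `Resequencer` type and its methods below are illustrative only, not the PR's actual API:

```rust
// Hypothetical sketch of indexed resequencing: chunks may complete out of
// order; each is tagged with its index and stored in a ring buffer until the
// next in-order chunk is available to emit into the overall stream.
struct Resequencer<T> {
    slots: Vec<Option<T>>, // capacity bounds how far ahead of `next` we accept
    next: usize,           // index of the next chunk to emit in order
}

impl<T> Resequencer<T> {
    fn new(capacity: usize) -> Self {
        Self {
            slots: (0..capacity).map(|_| None).collect(),
            next: 0,
        }
    }

    // Store a completed chunk at its tagged index. Returns false if the chunk
    // is too far ahead of the emit cursor (the caller would apply backpressure
    // instead of dropping it).
    fn insert(&mut self, index: usize, chunk: T) -> bool {
        if index < self.next || index >= self.next + self.slots.len() {
            return false;
        }
        let slot = index % self.slots.len();
        self.slots[slot] = Some(chunk);
        true
    }

    // Pop the next in-order chunk, if it has arrived.
    fn pop(&mut self) -> Option<T> {
        let slot = self.next % self.slots.len();
        let chunk = self.slots[slot].take()?;
        self.next += 1;
        Some(chunk)
    }
}

fn main() {
    let mut rb = Resequencer::new(4);
    // Chunks complete out of order: 1, 0, 2.
    assert!(rb.insert(1, "b"));
    assert_eq!(rb.pop(), None); // chunk 0 has not arrived yet
    assert!(rb.insert(0, "a"));
    assert!(rb.insert(2, "c"));
    let mut out = Vec::new();
    while let Some(c) = rb.pop() {
        out.push(c);
    }
    assert_eq!(out, vec!["a", "b", "c"]);
    println!("emitted in order: {:?}", out);
}
```

Capping the ring at 2x the parallel bound, as the PR describes, lets downloads run ahead of the consumer without unbounded buffering.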

Credit to @nateprewitt's initial implementation I ported over to use the tools available in our dependency chain.

@github-actions github-actions bot added the Storage (Storage Service: Queues, Blobs, Files) label Mar 13, 2026
@jaschrep-msft
Member Author

jaschrep-msft commented Mar 13, 2026

Resolved offline.

@heaths & @LarryOsterman, major question before marking this ready.

This implementation currently panics out-of-box. To complete successfully, the caller must invoke this download from the tokio runtime, and the app must have the tokio feature flag enabled for azure_core. This is due to reqwest's tight binding to tokio: it spawns tasks directly on the tokio runtime. If Core's get_async_runtime() does not return a tokio implementation, reqwest/hyper will panic. If the caller of download isn't already in a tokio runtime, there will be a panic either from us or from reqwest/hyper, depending on feature flags.

We need the ability to spawn tasks to achieve our target speed, and it seems that spawning tasks that contain network calls requires tokio all the way down.

How do we handle this?

I'm not well-experienced in the ecosystem, but it seems the way to deal with this is to introduce a tokio feature flag in the storage SDK that enables the code in this PR; in the absence of that flag, we fall back to the previous implementation. This gets us our target perf for tokio users (most users). Other runtimes could get their own flags as needed in the future.
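A minimal sketch of the feature-gated fallback being discussed. The function is illustrative only; the Cargo.toml fragment in the comment uses Cargo's standard feature-forwarding syntax, which is one way storage's flag could be tied to core's:

```rust
// Hypothetical feature-gated dispatch: compile the parallel, task-spawning
// path only when this crate's own `tokio` feature is enabled; otherwise fall
// back to the previous sequential implementation.
//
// In Cargo.toml, the storage flag could forward to core's:
//   [features]
//   tokio = ["azure_core/tokio"]

#[cfg(feature = "tokio")]
fn download_strategy() -> &'static str {
    "parallel: spawn a task per chunk on the tokio runtime"
}

#[cfg(not(feature = "tokio"))]
fn download_strategy() -> &'static str {
    "sequential: previous single-task implementation"
}

fn main() {
    println!("{}", download_strategy());
}
```

With this shape, non-tokio callers keep a working (slower) path instead of a panic, and the flag can default on if tokio is the common case.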

Are there other preferred mechanisms? Can we tie storage's tokio flag to core's tokio flag? What do we think is the path forward here?

@jaschrep-msft jaschrep-msft force-pushed the separate-download-work-queues branch 2 times, most recently from 8c2302b to fa61608 Compare March 17, 2026 15:24
@jaschrep-msft jaschrep-msft marked this pull request as ready for review March 17, 2026 16:34
Copilot AI review requested due to automatic review settings March 17, 2026 16:34
Contributor

Copilot AI left a comment


Pull request overview

This PR refactors the blob managed download implementation to improve throughput by spawning per-chunk download tasks via azure_core’s runtime abstraction and by separating chunk downloading from resequencing/buffering in the returned stream.

Changes:

  • Reworked partitioned_transfer::download() to schedule chunk downloads as spawned tasks and resequence outputs via an indexed ring buffer.
  • Added helper utilities for collecting streamed bytes into a pre-allocated buffer and for handling invalid initial range requests.
  • Added async-stream as a dependency to implement the new streaming logic.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 4 comments.

File Description
sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs Major download pipeline refactor: per-chunk task spawning, bounded resequencing buffer, new helpers.
sdk/storage/azure_storage_blob/Cargo.toml Adds async-stream dependency for the new try_stream! implementation.
Cargo.lock Locks the new async-stream dependency.
Comments suppressed due to low confidence (1)

sdk/storage/azure_storage_blob/src/partitioned_transfer/download.rs:28

  • This module relies on use super::*; to bring in key names like future/TryStreamExt (used by future::select_all and try_next) rather than importing them locally. That hidden coupling makes the file fragile if partitioned_transfer::mod.rs changes. Consider adding explicit imports here for the items you use to keep the module self-contained.
use futures::TryStream;

use crate::models::http_ranges::ContentRange;

use super::*;


@demoray demoray closed this Mar 17, 2026
@demoray demoray reopened this Mar 17, 2026
@demoray
Contributor

demoray commented Mar 17, 2026

Sorry, I clicked the wrong button here.

Member

@LarryOsterman LarryOsterman left a comment


My biggest concern is the lack of documentation explaining the algorithm for download - this is a complicated bit of code and it was challenging to understand it.

```diff
 let dst = BytesMut::with_capacity(range.len());
 let response = client.transfer_range(Some(range)).await?;
-response.into_body().collect().await
+collect_into(response.into_body(), dst).await
```
Member


When the core collect_into PR completes, this can be replaced with response.into_body().collect_into(dst).await, I believe.

Member


That PR is in, I believe this can be replaced with:

response.into_body().collect_into(dst).await?

One minor complication is that the collect_into call can fail if the provided buffer isn't sufficient to hold received chunks. It will fill up to the buffer, and return the actual amount of data received (if the stream ends before the buffer is filled).

If it cannot be replaced, let me know how I can fix the collect_into function to better meet your needs.
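The collect_into semantics described above — error when a chunk would overflow the buffer, otherwise return the total bytes received when the stream ends — can be modeled with a synchronous analogue. The signature and names below are illustrative only, not azure_core's actual API:

```rust
// Hypothetical, synchronous analogue of collect_into: append chunks into a
// bounded buffer, erroring when a chunk would exceed the stated capacity, and
// returning the total bytes written when the source runs out first.
fn collect_into(chunks: &[&[u8]], dst: &mut Vec<u8>, capacity: usize) -> Result<usize, String> {
    for chunk in chunks {
        if dst.len() + chunk.len() > capacity {
            return Err(format!(
                "buffer too small: chunk of {} bytes exceeds remaining capacity {}",
                chunk.len(),
                capacity - dst.len()
            ));
        }
        dst.extend_from_slice(chunk);
    }
    Ok(dst.len())
}

fn main() {
    let mut dst = Vec::with_capacity(8);
    // Source ends before the buffer is full: returns the bytes actually received.
    let n = collect_into(&[b"abc".as_slice(), b"de".as_slice()], &mut dst, 8).unwrap();
    assert_eq!(n, 5);
    // A chunk that would overflow the remaining capacity is an error.
    assert!(collect_into(&[b"too long!".as_slice()], &mut dst, 8).is_err());
    println!("collected {n} bytes");
}
```

For the download path this is a fit as long as each chunk's buffer is pre-sized to the exact range length, so the overflow case should not occur in practice.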

@jaschrep-msft
Member Author

@LarryOsterman yeah, after some of the generated review comments required even more logic to be added, I have been factoring out several parts of this code to simplify the actual contents of the loop.

Member

@LarryOsterman LarryOsterman left a comment


This is a significant improvement, thank you very much.

Creating a new base branch to isolate independent changes to download behavior and easily explore their performance differences.

Some changes are included because they are already known to be universally good but have not yet been merged into main:
- pre-size chunk destination buffer. AsyncResponseBody::collect() is not smart enough to do this.
- handle ranged get on empty blob. We gotta do this at some point no matter what, may as well get accurate perf readings with that additional work.
- separate functions for analyzing response headers. Moves some bulky checks out of the way of the real download logic. Also good code reuse for alternate download implementations which may be necessary.
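The pre-sizing noted in the first bullet amounts to allocating from the known byte range up front. A rough analogue using a plain `Vec` (the function name is illustrative; the PR itself uses `BytesMut::with_capacity(range.len())` for the same purpose):

```rust
// Hypothetical illustration of pre-sizing the chunk destination buffer from
// the known (inclusive) byte range, so collecting the response body into it
// never reallocates mid-stream.
fn preallocated_chunk_buffer(range_start: u64, range_end_inclusive: u64) -> Vec<u8> {
    let len = (range_end_inclusive - range_start + 1) as usize;
    Vec::with_capacity(len)
}

fn main() {
    // A ranged GET for bytes 0..=4194303 spans 4 MiB.
    let buf = preallocated_chunk_buffer(0, 4 * 1024 * 1024 - 1);
    assert!(buf.capacity() >= 4 * 1024 * 1024);
    assert!(buf.is_empty());
    println!("pre-sized capacity: {} bytes", buf.capacity());
}
```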
Tokio is now a default feature flag of core. Remove the manual specification. Doubles as validating our out-of-box experience.
@jaschrep-msft jaschrep-msft force-pushed the separate-download-work-queues branch from 5919fe4 to a5fa059 Compare March 18, 2026 20:57