Skip to content

Add max_read_window_size config to prefetcher benchmarks#1566

Open
mansi153 wants to merge 2 commits intoawslabs:mainfrom
mansi153:bench_prefetch_window
Open

Add max_read_window_size config to prefetcher benchmarks#1566
mansi153 wants to merge 2 commits intoawslabs:mainfrom
mansi153:bench_prefetch_window

Conversation

@mansi153
Copy link
Contributor

@mansi153 mansi153 commented Aug 8, 2025

Add max_read_window_size config to prefetcher benchmarks so we can set the backpressure window max value at runtime.

The PR also fixes a few minor things - throughput calculation units from Gib/s to Gb/s, consistency of config names and improving some benchmarking defaults.

Does this change impact existing behavior?

No, benchmarks change only.

Does this change need a changelog entry? Does it require a version change?

No, benchmarks change only.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
@mansi153 mansi153 had a problem deploying to PR integration tests August 8, 2025 00:54 — with GitHub Actions Failure
long,
help = "Initial read window size in bytes, used to dictate how far ahead we request data from S3",
default_value = "0"
default_value = "2147483648"
Copy link
Contributor

@sahityadg sahityadg Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is this is a fixed read window size in client benchmarks. So should we rename the parameter to read_window_size?


client_backpressure:
read_window_size: !!null #2147483648
max_read_window_size: !!null #2147483648
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a constant read window size or can it go up to this value?

prefetch_env["EXPERIMENTAL_MOUNTPOINT_NO_DOWNLOAD_INTEGRITY_VALIDATION"] = "ON"

if self.prefetch_config['max_read_window_size'] is not None:
prefetch_env["UNSTABLE_MOUNTPOINT_MAX_PREFETCH_WINDOW_SIZE"] = str(self.prefetch_config['max_read_window_size'])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For prefetcher, we could configure initial and max window size, do we want to configure both? Would it be a like for like comparision with client backpressure benchmark if we just configure max window size?

long,
help = "Initial read window size in bytes, used to dictate how far ahead we request data from S3",
default_value = "0"
default_value = "2147483648"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we also document here somehow that this number corresponds to 2GiB?

"{iteration}: received {received_size} bytes in {:.2}s: {:.2} Gb/s",
elapsed.as_secs_f64(),
(received_size as f64) / elapsed.as_secs_f64() / (1024 * 1024 * 1024 / 8) as f64
(received_size as f64) / elapsed.as_secs_f64() / (1000 * 1000 * 1000 / 8) as f64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we make this value more readable? not sure if best to use a clearly named variable or a function to reuse in line 251 too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants