Skip to content

Prefetch Benchmark: Add simple random read#1747

Open
yerzhan7 wants to merge 6 commits intoawslabs:mainfrom
yerzhan7:prefetch-bench/add-randread
Open

Prefetch Benchmark: Add simple random read#1747
yerzhan7 wants to merge 6 commits intoawslabs:mainfrom
yerzhan7:prefetch-bench/add-randread

Conversation

@yerzhan7
Copy link
Contributor

@yerzhan7 yerzhan7 commented Jan 15, 2026

What changed and why?

Need to extend prefetcher benchmark to test random read access pattern in addition to existing sequential read.

Manual Testing

Successfully run benchmark: uv run benchmark.py -- benchmark_type=prefetch

Results Summary:
+-----------------------+------------------+--------------------------------------+---------+-----------------+------------------+--------------+--------------+
|   application_workers | benchmark_type   | benchmarks.prefetch.access_pattern   |   Count |   Median (Gbps) |   Std Dev (Gbps) |   Min (Gbps) |   Max (Gbps) |
+=======================+==================+======================================+=========+=================+==================+==============+==============+
|                     1 | prefetch         | random                               |      10 |            0.02 |             0    |         0.02 |         0.02 |
+-----------------------+------------------+--------------------------------------+---------+-----------------+------------------+--------------+--------------+
|                     1 | prefetch         | sequential                           |      10 |            8.38 |             0.25 |         8.16 |         8.85 |
+-----------------------+------------------+--------------------------------------+---------+-----------------+------------------+--------------+--------------+
|                    16 | prefetch         | random                               |      10 |            0.3  |             0.03 |         0.23 |         0.33 |
+-----------------------+------------------+--------------------------------------+---------+-----------------+------------------+--------------+--------------+
|                    16 | prefetch         | sequential                           |      10 |            4.1  |             0.6  |         3.01 |         4.82 |
+-----------------------+------------------+--------------------------------------+---------+-----------------+------------------+--------------+--------------+

Does this change impact existing behavior?

No. Changing Benchmark scripts only.

Does this change need a changelog entry? Does it require a version change?

No. Changing Benchmark scripts only.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

@yerzhan7 yerzhan7 had a problem deploying to PR integration tests January 15, 2026 14:19 — with GitHub Actions Failure
Signed-off-by: Yerzhan Mazhkenov <[email protected]>
@yerzhan7 yerzhan7 had a problem deploying to PR integration tests January 15, 2026 14:52 — with GitHub Actions Failure
Signed-off-by: Yerzhan Mazhkenov <[email protected]>
@yerzhan7 yerzhan7 temporarily deployed to PR integration tests January 15, 2026 14:54 — with GitHub Actions Inactive
default_value_t = 1,
value_name = "SEED"
)]
randseed: u64,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should define the random behavior in more details:

  • Do we apply the same random access sequence to all objects and all iterations? (I'd say no)
  • Should the sequence(s) be reproducible across platforms?
  • What's the behavior when no seed is specified?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we apply the same random access sequence to all objects and all iterations? (I'd say no)

Good point. Will add object ID and iteration number to the seed.

Should the sequence(s) be reproducible across platforms?

Will switch to PCG64 which should be portable and deterministic.

What's the behavior when no seed is specified?

Currently, the default value for seed is 1. I.e. still deterministic.

AccessPattern::Random => {
let mut rng = StdRng::seed_from_u64(randseed);
let mut total_bytes_read = 0;
let max_offset = size.saturating_sub(read_size);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why restrict the offset? vs allowing smaller reads at the end of the file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will fix, thanks

let mut total_bytes_read = 0;
let max_offset = size.saturating_sub(read_size);

while Instant::now() < timeout {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are making the random read mode time-based, but using the timeout value. That's not the original intent of the timeout parameter and does not work well with the iterations parameter. We need to either define a separate parameter, or consider a different approach (e.g. no. of reads or total size).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added total_bytes_read < size in addition to timeout.

Ok(total_bytes_read)
}
AccessPattern::Random => {
let mut rng = StdRng::seed_from_u64(randseed);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StdRng is not portable, I think. We probably want to use a different RNG: https://rust-random.github.io/book/guide-rngs.html. See also other comment about the random seed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will switch to PCG64, thanks!

@yerzhan7 yerzhan7 temporarily deployed to PR integration tests January 19, 2026 19:43 — with GitHub Actions Inactive
Signed-off-by: Yerzhan Mazhkenov <[email protected]>
@yerzhan7 yerzhan7 temporarily deployed to PR integration tests January 19, 2026 19:54 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants