sort: improve performance on large chunks #8652

kehrazy · 2025-09-17T07:52:32Z

Rationale

Previously, reading a record required growing the buffer until a separator was found. With extremely long records (e.g., multi‑GB lines), this could lead to excessive memory usage and stalled progress before any output was produced (see "sort does not finish for large one line file", #8583). This change keeps memory bounded while ensuring forward progress by spilling oversized records to temporary runs, aligning behavior with external sort implementations.

Overview of changes

Reader signaling and buffer capping
- Introduce ReadProgress with SentChunk | NeedSpill | NoChunk | Finished to make read outcomes explicit.
- grow_buffer now respects a hard cap.
- When the buffer hits the cap without a separator, read_to_buffer returns NeedSpill instead of growing further.
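
As a rough illustration of the signaling described in this list, a minimal sketch might look like the following. Only the variant names come from the description above; the payloads, the `read_bounded` function, and the `step` and `cap` parameters are assumptions made for the example and are not the code in this PR.

```rust
use std::io::{self, Read};

/// Outcome of one read attempt. Variant names follow the PR description;
/// the payload and exact meanings below are assumptions for this sketch.
#[derive(Debug, PartialEq)]
enum ReadProgress {
    /// At least one complete record is buffered; the value is the number of
    /// bytes up to and including the last separator.
    SentChunk(usize),
    /// The buffer reached the cap without containing any separator.
    NeedSpill,
    /// Nothing was read and nothing is buffered.
    NoChunk,
    /// Input ended; the buffered bytes form a final, unterminated record.
    Finished,
}

/// Hypothetical stand-in for `read_to_buffer`: fill `buf` from `input`,
/// growing by `step` bytes (assumed > 0) but never beyond `cap` bytes.
fn read_bounded<R: Read>(
    input: &mut R,
    buf: &mut Vec<u8>,
    separator: u8,
    step: usize,
    cap: usize,
) -> io::Result<ReadProgress> {
    let mut searched = 0; // bytes already checked for a separator
    loop {
        // Scan only bytes that have not been scanned before.
        if let Some(pos) = buf[searched..].iter().rposition(|&b| b == separator) {
            return Ok(ReadProgress::SentChunk(searched + pos + 1));
        }
        searched = buf.len();
        if buf.len() >= cap {
            // Hard cap reached and still no separator: ask the caller to spill.
            return Ok(ReadProgress::NeedSpill);
        }
        // Grow the buffer, respecting the cap, and read into the new space.
        let old_len = buf.len();
        buf.resize((old_len + step).min(cap), 0);
        let n = input.read(&mut buf[old_len..])?;
        buf.truncate(old_len + n);
        if n == 0 {
            return Ok(if buf.is_empty() {
                ReadProgress::NoChunk
            } else {
                ReadProgress::Finished
            });
        }
    }
}
```

The point of the explicit enum is that the caller can match on `NeedSpill` and choose a different strategy, instead of the reader silently growing the buffer.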
External sort read/write loop
- Maintain up to two in‑flight reads to keep the sorter fed.
- On NeedSpill, stream the current oversized record into a temporary run file (spill_long_record), appending a single separator to match write_lines semantics and pushing post‑separator remainder into carry_over.
- Preserve the existing ≤2‑chunk in‑memory fast path for small inputs.
- If reading produces exactly one temporary run and the run is uncompressed and --unique is not used, stream that run directly to the final output (avoids re‑reading giant records during merge).
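
The spill step in this list can be sketched roughly as below. This is an illustration under assumptions, not the PR's `spill_long_record`: the signature, the 8 KiB scratch block, and the choice to return the carry-over as a `Vec<u8>` are all hypothetical.

```rust
use std::io::{self, Read, Write};

/// Hypothetical sketch of spilling one oversized record to a run file.
/// `head` holds the bytes already buffered (assumed to contain no separator).
/// Returns the bytes that followed the separator, i.e. the carry-over.
fn spill_long_record<R: Read, W: Write>(
    head: &[u8],
    input: &mut R,
    run: &mut W,
    separator: u8,
) -> io::Result<Vec<u8>> {
    run.write_all(head)?;
    // Fixed-size scratch block: memory use does not depend on record length.
    let mut block = [0u8; 8192];
    loop {
        let n = input.read(&mut block)?;
        if n == 0 {
            // Input ended without a separator: terminate the record ourselves
            // so the run always ends with exactly one separator.
            run.write_all(&[separator])?;
            return Ok(Vec::new());
        }
        match block[..n].iter().position(|&b| b == separator) {
            Some(pos) => {
                // Write the rest of the record, including its separator...
                run.write_all(&block[..=pos])?;
                // ...and hand everything after it back to the caller.
                return Ok(block[pos + 1..n].to_vec());
            }
            None => run.write_all(&block[..n])?,
        }
    }
}
```

Because the record is streamed block by block, peak memory stays at the cap plus one scratch block regardless of how long the line is.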
Merge path adjustments
- Use a bounded buffer in the merge reader, similar to the external sort reader.
- If a spill signal is encountered during merge (rare), perform a one‑off unbounded read to finish that record and preserve correctness. This keeps the change focused while allowing a future streaming comparator to remove the fallback.
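
The merge-side fallback could look something like this sketch: try a bounded read first, and only if the cap is hit without a separator finish the record with an unbounded read. The function name, its signature, and the use of `BufRead::read_until` are assumptions for illustration; the PR's merge reader is structured differently.

```rust
use std::io::{self, BufRead, Read};

/// Hypothetical merge-side read: read one record within `cap` bytes
/// (assumed > 0), falling back to an unbounded read for longer records.
fn read_record_with_fallback<R: BufRead>(
    input: &mut R,
    separator: u8,
    cap: usize,
) -> io::Result<Option<Vec<u8>>> {
    let mut record = Vec::new();
    // Bounded attempt: `take` limits how many bytes `read_until` may consume.
    let n = input
        .by_ref()
        .take(cap as u64)
        .read_until(separator, &mut record)?;
    if n == 0 {
        return Ok(None); // input exhausted
    }
    if n == cap && record.last() != Some(&separator) {
        // Spill case: the cap was reached without a separator. Finish the
        // record with a one-off unbounded read so it can be compared whole.
        input.read_until(separator, &mut record)?;
    }
    Ok(Some(record))
}
```

The fallback gives up the memory bound for that one record, which is the trade-off the last bullet describes.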

kehrazy · 2025-09-17T08:13:51Z

sylvestre · 2025-09-17T09:21:45Z

Many tasks are failing

BTW, could you run the benchmark without your change?
And please paste the output directly. Screenshots aren't great :)

kehrazy · 2025-09-17T10:19:13Z

Many tasks are failing

This, I'm not sure how to fix: the tests are passing and the linters are happy, but the spellcheck complains about "memrchr", and the l10n tests have just timed out. I don't think I've touched l10n enough to cause that, so which direction should I look in?

BTW, could you run the benchmark without your change? And please paste the output directly. Screenshots aren't great :)

Sure thing!

We're getting a sample input using

dd if=/dev/zero bs=1M count=4096 status=progress | tr '\0' 'A' |
head -c 4294967295 > oneline_4G.txt && echo >> oneline_4G.txt

We can't run this exact benchmark using main (as of aaf742d), because the Rust version.. doesn't finish. So, after putting a reasonable timeout (and ignoring exit codes with -i):

hyperfine -i "timeout 15s ./target/release/coreutils sort oneline_4G.txt" "timeout 15s sort oneline_4G.txt" --export-markdown report.md

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `timeout 15s ./target/release/coreutils sort oneline_4G.txt` | 15.003 ± 0.001 | 15.002 | 15.005 | 6.85 ± 0.49 |
| `timeout 15s sort oneline_4G.txt` | 2.191 ± 0.158 | 1.757 | 2.317 | 1.00 |

..and with changes to readers:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| `timeout 15s ./target/release/coreutils sort oneline_4G.txt` | 2.808 ± 0.168 | 2.574 | 3.039 | 1.23 ± 0.08 |
| `timeout 15s sort oneline_4G.txt` | 2.275 ± 0.034 | 2.242 | 2.345 | 1.00 |

kimono-koans · 2025-09-17T16:47:27Z

When the buffer hits the cap without a separator, read_to_buffer returns NeedSpill instead of growing further.

This may be the most advantageous way to fix this problem, for now, but wouldn't it make more sense, in the future, to just store the range of file read (if it is a file, not stdin!) without a newline/separator, instead of writing to a new tmp file, with these extremely large buffers? Wouldn't that avoid writes and be more file cache friendly?
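
A sketch of what that could look like, for discussion only: instead of copying the oversized record to a temporary file, keep its byte range and stream it from the original input when needed. `RecordRange` and `copy_range` are hypothetical names, and this only works for seekable inputs (regular files, not stdin).

```rust
use std::io::{self, Read, Seek, SeekFrom, Write};

/// Hypothetical descriptor for an oversized record left in the input file.
#[derive(Debug, Clone, Copy, PartialEq)]
struct RecordRange {
    offset: u64, // byte offset of the record's first byte
    len: u64,    // length in bytes, excluding the separator
}

/// Stream the referenced bytes from a seekable input to `out`.
/// Returns the number of bytes copied.
fn copy_range<F: Read + Seek, W: Write>(
    file: &mut F,
    range: RecordRange,
    out: &mut W,
) -> io::Result<u64> {
    file.seek(SeekFrom::Start(range.offset))?;
    // `take` stops after `len` bytes; `io::copy` streams through an
    // internal buffer, so memory use does not depend on the record length.
    io::copy(&mut file.by_ref().take(range.len), out)
}
```

This avoids the extra write, at the cost of a second read of the input and of needing a separate path for non-seekable input.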

src/uu/sort/src/check.rs

src/uu/sort/src/ext_sort.rs

src/uu/sort/src/merge.rs

github-actions · 2025-09-18T08:11:28Z

GNU testsuite comparison:

GNU test failed: tests/chmod/usage. tests/chmod/usage is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/chroot/chroot-credentials. tests/chroot/chroot-credentials is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/csplit/csplit-suppress-matched. tests/csplit/csplit-suppress-matched is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/basic. tests/du/basic is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/exclude. tests/du/exclude is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/hard-link. tests/du/hard-link is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/inacc-dest. tests/du/inacc-dest is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/inodes. tests/du/inodes is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/threshold. tests/du/threshold is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/env/env. tests/env/env is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/ls/abmon-align. tests/ls/abmon-align is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/ls/hyperlink. tests/ls/hyperlink is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/misc/read-errors. tests/misc/read-errors is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/od/od-x8. tests/od/od-x8 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/rm/r-2. tests/rm/r-2 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/rm/rm3. tests/rm/rm3 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/shuf/shuf. tests/shuf/shuf is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/shuf/shuf-reservoir. tests/shuf/shuf-reservoir is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-NaN-infloop. tests/sort/sort-NaN-infloop is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-benchmark-random. tests/sort/sort-benchmark-random is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress. tests/sort/sort-compress is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress-hang. tests/sort/sort-compress-hang is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress-proc. tests/sort/sort-compress-proc is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-discrim. tests/sort/sort-discrim is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-files0-from. tests/sort/sort-files0-from is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-merge. tests/sort/sort-merge is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-rand. tests/sort/sort-rand is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-spinlock-abuse. tests/sort/sort-spinlock-abuse is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-stale-thread-mem. tests/sort/sort-stale-thread-mem is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-u-FMR. tests/sort/sort-u-FMR is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-unique. tests/sort/sort-unique is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-unique-segv. tests/sort/sort-unique-segv is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-version. tests/sort/sort-version is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/test/test-N. tests/test/test-N is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/misc/usage_vs_getopt (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/stdbuf (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

Qelxiros · 2025-09-20T19:14:40Z

You can fix the cspell errors by adding

// spell-checker:ignore memrchr

to chunks.rs

codspeed-hq · 2025-09-21T07:05:42Z

CodSpeed Performance Report

Merging #8652 will degrade performances by 3.43%

Comparing kehrazy:sort-fix (e1318fe) with main (0258583)

Summary

❌ 2 regressions
✅ 83 untouched
⏩ 94 skipped¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| | Benchmark | `BASE` | `HEAD` | Change |
|---|---|---|---|---|
| ❌ | `du_balanced_tree[(5, 4, 10)]` | 9.1 ms | 9.3 ms | -2.09% |
| ❌ | `du_human_balanced_tree[(5, 4, 10)]` | 10.1 ms | 10.5 ms | -3.43% |

¹ 94 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

github-actions · 2025-09-21T10:05:10Z

GNU testsuite comparison:

GNU test failed: tests/chmod/usage. tests/chmod/usage is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/chroot/chroot-credentials. tests/chroot/chroot-credentials is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/basic. tests/du/basic is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/exclude. tests/du/exclude is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/hard-link. tests/du/hard-link is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/inacc-dest. tests/du/inacc-dest is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/inodes. tests/du/inodes is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/threshold. tests/du/threshold is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/env/env. tests/env/env is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/ls/abmon-align. tests/ls/abmon-align is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/ls/hyperlink. tests/ls/hyperlink is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/misc/read-errors. tests/misc/read-errors is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/od/od-x8. tests/od/od-x8 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/rm/r-2. tests/rm/r-2 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/rm/rm3. tests/rm/rm3 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/shuf/shuf. tests/shuf/shuf is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/shuf/shuf-reservoir. tests/shuf/shuf-reservoir is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-NaN-infloop. tests/sort/sort-NaN-infloop is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-benchmark-random. tests/sort/sort-benchmark-random is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress. tests/sort/sort-compress is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress-hang. tests/sort/sort-compress-hang is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress-proc. tests/sort/sort-compress-proc is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-discrim. tests/sort/sort-discrim is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-files0-from. tests/sort/sort-files0-from is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-merge. tests/sort/sort-merge is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-rand. tests/sort/sort-rand is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-spinlock-abuse. tests/sort/sort-spinlock-abuse is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-stale-thread-mem. tests/sort/sort-stale-thread-mem is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-u-FMR. tests/sort/sort-u-FMR is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-unique. tests/sort/sort-unique is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-unique-segv. tests/sort/sort-unique-segv is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-version. tests/sort/sort-version is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/test/test-N. tests/test/test-N is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/misc/usage_vs_getopt (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Note: The gnu test tests/du/2g is now being skipped but was previously passing.

kehrazy · 2025-09-21T15:45:26Z

What can I do about

Error: No space left on device

in CI? I don't think I made any changes that would break CI in such a way?

sylvestre · 2025-09-21T15:52:07Z

let me rebase it

sylvestre · 2025-09-21T15:52:49Z

It is a huge patch. Is it possible to make it smaller for review?

github-actions · 2025-09-21T18:56:56Z

GNU testsuite comparison:

GNU test failed: tests/chmod/usage. tests/chmod/usage is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/chroot/chroot-credentials. tests/chroot/chroot-credentials is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/basic. tests/du/basic is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/exclude. tests/du/exclude is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/hard-link. tests/du/hard-link is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/inacc-dest. tests/du/inacc-dest is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/inodes. tests/du/inodes is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/threshold. tests/du/threshold is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/env/env. tests/env/env is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/ls/abmon-align. tests/ls/abmon-align is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/misc/read-errors. tests/misc/read-errors is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/od/od-x8. tests/od/od-x8 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/rm/r-2. tests/rm/r-2 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/rm/rm3. tests/rm/rm3 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/shuf/shuf. tests/shuf/shuf is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/shuf/shuf-reservoir. tests/shuf/shuf-reservoir is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-NaN-infloop. tests/sort/sort-NaN-infloop is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-benchmark-random. tests/sort/sort-benchmark-random is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress. tests/sort/sort-compress is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress-hang. tests/sort/sort-compress-hang is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress-proc. tests/sort/sort-compress-proc is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-discrim. tests/sort/sort-discrim is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-files0-from. tests/sort/sort-files0-from is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-merge. tests/sort/sort-merge is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-rand. tests/sort/sort-rand is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-spinlock-abuse. tests/sort/sort-spinlock-abuse is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-stale-thread-mem. tests/sort/sort-stale-thread-mem is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-u-FMR. tests/sort/sort-u-FMR is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-unique. tests/sort/sort-unique is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-unique-segv. tests/sort/sort-unique-segv is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-version. tests/sort/sort-version is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/test/test-N. tests/test/test-N is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/misc/usage_vs_getopt (fails in this run but passes in the 'main' branch)

kehrazy · 2025-09-21T21:30:14Z

It is a huge patch. Is it possible to make it smaller for review?

I mean, I can try? but, eh, we touch a lot of parts that were static before - do we really wanna split these out into multiple merges?

the scope of the MR may be reduced, sure (e.g. the reallocation stuff), though!

sylvestre · 2025-10-08T15:15:22Z

most of the jobs are failing, are you going to work on it? thanks

kehrazy · 2025-10-11T15:10:31Z

most of the jobs are failing, are you going to work on it? thanks

sure, i will

github-actions · 2025-10-17T13:59:13Z

GNU testsuite comparison:

GNU test failed: tests/chmod/usage. tests/chmod/usage is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/chroot/chroot-credentials. tests/chroot/chroot-credentials is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/basic. tests/du/basic is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/exclude. tests/du/exclude is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/hard-link. tests/du/hard-link is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/inacc-dest. tests/du/inacc-dest is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/inodes. tests/du/inodes is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/du/threshold. tests/du/threshold is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/env/env. tests/env/env is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/ls/hyperlink. tests/ls/hyperlink is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/misc/read-errors. tests/misc/read-errors is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/od/od-x8. tests/od/od-x8 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/rm/r-2. tests/rm/r-2 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/rm/rm3. tests/rm/rm3 is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/shuf/shuf. tests/shuf/shuf is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/shuf/shuf-reservoir. tests/shuf/shuf-reservoir is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-NaN-infloop. tests/sort/sort-NaN-infloop is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-benchmark-random. tests/sort/sort-benchmark-random is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress. tests/sort/sort-compress is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress-hang. tests/sort/sort-compress-hang is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-compress-proc. tests/sort/sort-compress-proc is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-discrim. tests/sort/sort-discrim is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-files0-from. tests/sort/sort-files0-from is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-merge. tests/sort/sort-merge is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-rand. tests/sort/sort-rand is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-spinlock-abuse. tests/sort/sort-spinlock-abuse is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-stale-thread-mem. tests/sort/sort-stale-thread-mem is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-u-FMR. tests/sort/sort-u-FMR is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-unique. tests/sort/sort-unique is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-unique-segv. tests/sort/sort-unique-segv is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/sort/sort-version. tests/sort/sort-version is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/test/test-N. tests/test/test-N is passing on 'main'. Maybe you have to rebase?
Skip an intermittent issue tests/misc/usage_vs_getopt (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/rm/rm1 (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

kehrazy marked this pull request as ready for review September 17, 2025 07:52

kehrazy marked this pull request as draft September 17, 2025 07:57

kehrazy force-pushed the sort-fix branch 5 times, most recently from ff0b106 to 7ce1ffe Compare September 17, 2025 08:10

kehrazy marked this pull request as ready for review September 17, 2025 08:10

RGBCube reviewed Sep 17, 2025

kehrazy force-pushed the sort-fix branch from 7ce1ffe to 53b867a Compare September 17, 2025 18:53

kehrazy force-pushed the sort-fix branch 2 times, most recently from bd1d34a to b3c9ad9 Compare September 21, 2025 06:28

sylvestre force-pushed the sort-fix branch from b3c9ad9 to fcd3ac2 Compare September 21, 2025 15:52

jacob-greenfield mentioned this pull request Sep 26, 2025

sort: fix newline handling across large and/or multiple files #8746

Merged

sort: reader signaling and buffer capping

bba8099

- Introduce ReadProgress with SentChunk | NeedSpill | NoChunk | Finished
- Cap grow_buffer; read_to_buffer returns NeedSpill on cap without separator

kehrazy added 3 commits October 17, 2025 14:37

sort: external sort read/write loop improvements

eada5fc

- Maintain up to two in-flight reads
- On NeedSpill, stream oversized record to temp run (spill_long_record)
- Append single separator; push remainder into carry_over
- Preserve ≤2-chunk in-memory fast path

sort: merge path adjustments

8ecc358

- Use bounded buffer in merge reader
- Fallback unbounded read when spill encountered to preserve correctness

sort: update check path for bounded reader

b6ac563

kehrazy force-pushed the sort-fix branch from fcd3ac2 to b6ac563 Compare October 17, 2025 11:38

sort: do not search previously-read characters for newlines

e1318fe

(cherry picked from commit 72b70a9)
