@overlookmotel

@overlookmotel overlookmotel commented Jan 15, 2026

#17639 made a nice optimization to the overlong_source test in the parser, which sped it up massively.

Unfortunately, it is unsound. It creates a &str from a range of memory which is not allocated. Creating a &str which contains uninitialized bytes is immediate Undefined Behavior, regardless of how the &str is used afterwards.

Less academically, if any of the string's data were accessed, it could try to read from an unallocated memory page, which would cause a segfault. That could make this test flaky. The UB could also cause other tests to fail randomly (or pass when they should fail), depending on how the compiler exploits the UB.

This PR takes a different approach to speeding up this test: make an allocation of MAX_SIZE + 1 bytes (4 GiB), but use alloc_zeroed to do it. On most platforms this doesn't actually write zeros across 4 GiB of memory; it just maps all the pages of the allocation to "zero pages" in the page table, which is much faster.
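As a rough illustration of the idea (not the code in this PR), the sketch below requests zeroed memory from the global allocator and reads only the first and last byte. The function name `zeroed_ends_are_zero` and the 4096-byte alignment are assumptions made for the example; the real test allocates MAX_SIZE + 1 bytes, while the sketch takes the length as a parameter so it can be tried with a small size.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};

/// Hypothetical helper: allocate `len` zeroed bytes, check that the first
/// and last byte read back as zero, then free the allocation.
fn zeroed_ends_are_zero(len: usize) -> bool {
    assert!(len > 0, "the global allocator must not be asked for 0 bytes");

    // 4096 is an assumed page size, used here only as the alignment.
    let layout = Layout::from_size_align(len, 4096).expect("invalid layout");

    // SAFETY: `layout` has a non-zero size (asserted above).
    let ptr = unsafe { alloc_zeroed(layout) };
    if ptr.is_null() {
        handle_alloc_error(layout);
    }

    // SAFETY: `ptr` is valid for reads of `len` bytes, and `alloc_zeroed`
    // guarantees every one of those bytes is initialized to zero.
    let ok = unsafe { *ptr == 0 && *ptr.add(len - 1) == 0 };

    // SAFETY: `ptr` came from `alloc_zeroed` with this exact `layout`.
    unsafe { dealloc(ptr, layout) };
    ok
}
```

Calling `zeroed_ends_are_zero(1 << 20)` exercises the same allocator path with 1 MiB. Whether the zeroing is satisfied lazily by the operating system depends on the platform and allocator, so the speed-up described above is not guaranteed everywhere.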

This is a bit slower than the unsound approach, but in my view it is still fast enough for our purposes.

Before this PR:

> time cargo test -p oxc_parser --all-features
...
test result: ok. 54 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
cargo test -p oxc_parser --all-features  0.07s user 0.03s system 98% cpu 0.105 total

After this PR:

> time cargo test -p oxc_parser --all-features
...
test result: ok. 54 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.31s
cargo test -p oxc_parser --all-features  0.13s user 0.23s system 87% cpu 0.413 total

(measured on a MacBook Air M3)

In my view it's worth 300ms to avoid UB.


@github-actions github-actions bot added the A-parser (Area - Parser) and C-test (Category - Testing. Code is missing test cases, or a PR is adding them) labels Jan 15, 2026
@overlookmotel overlookmotel marked this pull request as ready for review January 15, 2026 14:56
Copilot AI review requested due to automatic review settings January 15, 2026 14:56
Copilot AI left a comment

Pull request overview

This PR fixes undefined behavior (UB) in the overlong_source test that was introduced in #17639. The previous optimization created a fake &str pointing to unallocated memory, which is immediate UB. This PR replaces it with a sound implementation that uses alloc_zeroed to create a 4 GiB allocation efficiently via zero-page mapping.

Changes:

  • Replaced the unsafe fake string construction with a proper ZeroedString type
  • Uses alloc_zeroed with page-aligned layout to efficiently allocate 4 GiB of zeroed memory
  • Implements proper Drop semantics to deallocate the memory
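The changes listed above could look roughly like the following sketch. Only the name ZeroedString comes from this review; the fields, the `new` and `as_str` methods, and the `PAGE_SIZE` constant are assumptions made for illustration, and the actual implementation in the PR may differ.

```rust
use std::alloc::{alloc_zeroed, dealloc, handle_alloc_error, Layout};
use std::ptr::NonNull;

/// Assumed page size, used only as the allocation alignment in this sketch.
const PAGE_SIZE: usize = 4096;

/// Hypothetical owner of a zero-filled allocation that can be viewed as `&str`.
struct ZeroedString {
    ptr: NonNull<u8>,
    layout: Layout,
}

impl ZeroedString {
    /// Allocate `len` zeroed bytes with page alignment.
    fn new(len: usize) -> Self {
        assert!(len > 0, "zero-sized allocations are not supported");
        let layout = Layout::from_size_align(len, PAGE_SIZE).expect("invalid layout");
        // SAFETY: `layout` has a non-zero size (asserted above).
        let raw = unsafe { alloc_zeroed(layout) };
        let Some(ptr) = NonNull::new(raw) else {
            handle_alloc_error(layout)
        };
        Self { ptr, layout }
    }

    /// View the allocation as a string consisting entirely of NUL characters.
    fn as_str(&self) -> &str {
        // SAFETY: the allocation is live for as long as `self` exists, is
        // never mutated, and every byte is an initialized 0x00. NUL is ASCII,
        // so the bytes are valid UTF-8.
        unsafe {
            let bytes = std::slice::from_raw_parts(self.ptr.as_ptr(), self.layout.size());
            std::str::from_utf8_unchecked(bytes)
        }
    }
}

impl Drop for ZeroedString {
    fn drop(&mut self) {
        // SAFETY: `ptr` was returned by `alloc_zeroed` with this same `layout`.
        unsafe { dealloc(self.ptr.as_ptr(), self.layout) };
    }
}
```

The key difference from the unsound version is that the `&str` returned by `as_str` always refers to memory that is really allocated and initialized, so constructing it is sound however the string is used afterwards.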

@codspeed-hq

codspeed-hq bot commented Jan 15, 2026

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing om/01-15-test_parser_fix_ub_in_test_for_overlong_source (fcdd22f) with main (8ee6f80)

Summary

✅ 42 untouched benchmarks
⏩ 3 skipped benchmarks (see footnote 1)

Footnotes

  1. 3 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@overlookmotel

@spanishpear Thank you for spotting that this test was so slow and for fixing it. Sorry to go over your work. I'm a bit of a stickler for soundness!

Can you possibly help me with something? In the description of #17639, you said that this test was taking 7 secs. I was unable to replicate that on my machine: all the parser tests ran in about 700ms on my MacBook Air M3 prior to your PR, and 100ms after. So I'm not sure what machine you were seeing the very slow performance on. Can you please advise?

If you have time, could you possibly try this branch on that same machine and make sure it's not putting the test back up to 7 secs?

If it does, I think we might have to just remove the test entirely. We can't have slow tests, but we should also avoid UB.

@overlookmotel overlookmotel self-assigned this Jan 15, 2026
@spanishpear

> Creating a &str which contains uninitialized bytes is immediate Undefined Behavior, regardless of how the &str is used afterwards.

@overlookmotel Thanks for spotting this!! I thought it wasn't UB as long as the underlying bytes weren't addressed, but doing a bit of reading clarified that mistake for me 🙏

> If you have time, could you possibly try this branch on that same machine and make sure it's not putting the test back up to 7 secs?

I'll take a look! It was on a WSL Ubuntu machine, with much lower specs than an M3.
