Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fn unaligned_lvl_slice: Safely encapsulate "unaligned" slices of lvl: &[[u8; 4]], as previous accesses were UB #731

Merged
merged 3 commits into from
Feb 9, 2024

Conversation

kkysen
Copy link
Collaborator

@kkysen kkysen commented Feb 6, 2024

See:

The current "unaligned" access of lvl, added in #726, and a proposed change in #728, are UB according to miri, as they cast &[u8; 3] and &u8 to &[u8; 4], which is an out of bounds read of a borrow. This fn unaligned_lvl_slice helper fn fixes it and passes miri by transmuting (through .align_to) to a &[u8], doing the slice, and then doing a checked cast back to [u8; 4]. It also does correct bounds checking, as the previous ways were off by one in their bounds checking, as they didn't consider the read of the unaligned part into the next [u8; 4]. Doing it this way also optimizes just as well as the previous methods.

We could also use a dependency like zerocopy to encapsulate the unsafe .align_to, but as long as these kinds of "unaligned" array reads are a one-off, doing it inline here seems better.

Update: Switched to using unstable fns from std that make things fully safe and correct, copied into our codebase for stability.
I already figured out how to fix the UB in reviewing #728, so I just went ahead and made the fix independently.

@kkysen kkysen force-pushed the kkysen/fix-ub-unaligned_lvl branch from 8e3ae3e to dc02d18 Compare February 6, 2024 07:47
@kkysen kkysen changed the title fn unaligned_lvl: Safely encapsulate "unaligned" reads of lvl: &[[u8; 4]], as previous accesses were UB fn unaligned_lvl_slice: Safely encapsulate "unaligned" slices of lvl: &[[u8; 4]], as previous accesses were UB Feb 6, 2024
src/lf_apply.rs Show resolved Hide resolved
src/lf_apply.rs Outdated Show resolved Hide resolved
@fbossen
Copy link
Collaborator

fbossen commented Feb 6, 2024

The doc for align_to says:

This method splits the slice into three distinct slices: prefix, correctly aligned middle slice of a new type, and the suffix slice. How exactly the slice is split up is not specified; the middle part may be smaller than necessary. However, if this fails to return a maximal middle part, that is because code is running in a context where performance does not matter, such as a sanitizer attempting to find alignment bugs. Regular code running in a default (debug or release) execution will return a maximal middle part.

Is the sanitizer behaviour something we need to worry about?

src/lf_apply.rs Outdated Show resolved Hide resolved
@@ -377,7 +394,7 @@ unsafe fn filter_plane_cols_y<BD: BitDepth>(
dst.offset((x * 4) as isize).cast(),
ls,
hmask.as_mut_ptr(),
lvl[x as usize][0..].as_ptr() as *const [u8; 4],
&lvl[x as usize],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If enforcing a minimum slice size, this would be something like lvl[x as usize..][..((endy4 - starty4 - 1) as isize * b4_stride + 1) as usize].as_ptr(). However, index -1 of such as slice can also be read, so more work on slice definitions will be needed. But that's probably outside the scope of this PR

Copy link
Collaborator Author

@kkysen kkysen Feb 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping to structure things as, at this point in the code, we don't care about checking the length, as it is up to the inner fn to correctly bounds check when reading elements. That's of course not true for the asm fns, though it will be eventually true (once we make them safe) for the C/Rust fallbacks. If we do check the length eagerly here, the only real safety that gives us in documentation, that is, if we check that the asm fn claims it will access X elements and here we check for X elements, we check that the actual length matches the docs, but it doesn't enforce anything on the asm. Not sure if that made sense.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense. If, down the road, we want the C/Rust fallbacks to check bounds, then we'll probably have to change fn interfaces to avoid casting to raw pointers and recreating slices in the fallback fns? One potential wrinkle here, is that an asm fn may access memory locations that are different from the corresponding fallback fn (because asm may read a 128-bit word where the fallback reads a couple 16-bit words; I can't immediately think of a case where that would also apply a write). I don't know whether we need to consider that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't change the existing fn ptr args without modifying the asm, which we don't want to do. But what we can do is is pass extra args that asm ignores (which we already do for passing bitdepth_max: c_int for 8bpc fns). So we can split a slice into the ptr and len, pass the len as an extra arg, and then recreate the slice in the fallback C/Rust fn. Where it will be technically unsafe, but quite safe in practice.

As for asm fns accessing more memory locations than the fallbacks, we could eagerly checks the bounds, but it's still ultimately up to the asm not to access any more than that. It's always going to be inherently very unsafe.

@kkysen
Copy link
Collaborator Author

kkysen commented Feb 8, 2024

The doc for align_to says:

This method splits the slice into three distinct slices: prefix, correctly aligned middle slice of a new type, and the suffix slice. How exactly the slice is split up is not specified; the middle part may be smaller than necessary. However, if this fails to return a maximal middle part, that is because code is running in a context where performance does not matter, such as a sanitizer attempting to find alignment bugs. Regular code running in a default (debug or release) execution will return a maximal middle part.

Is the sanitizer behaviour something we need to worry about?

Hmm, that's a bit worrying. I didn't realize that would ever be done. Let me think about it, and maybe one of those newer nightly methods can help.

…gn_to` impl with the unstable `.flatten` and `.as_chunks` ported from `std`.
Copy link
Collaborator

@fbossen fbossen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@folkertdev folkertdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes changes look good!

should the discussion/conclusion around checking the slice length be documented somewhere?

I'm not too hopeful on a quick stabilization of these methods but the benefit of stabilization is mostly aesthetics at this point.

@kkysen
Copy link
Collaborator Author

kkysen commented Feb 9, 2024

should the discussion/conclusion around checking the slice length be documented somewhere?

Hmm, perhaps in an issue? It's a concern for all of the fn ptr boundaries, so putting a comment at one of the call-sites doesn't make as much sense.

@kkysen kkysen merged commit b796035 into main Feb 9, 2024
34 checks passed
@kkysen kkysen deleted the kkysen/fix-ub-unaligned_lvl branch February 9, 2024 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants