As suggested by `` on reddit, `compute_delta` and the integration step on AVX2 would be more efficient if they relied on the `_mm256_alignr_epi8` instruction.
Copy-pasting the incomplete snippet from reddit:
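fn compute_delta(curr: DataType, prev: DataType) -> DataType {
    unsafe {
        // Build [prev.hi, curr.lo] so that each 128-bit lane of `curr` has its
        // predecessor lane available, then shift right by 12 bytes per lane.
        // Result: [prev[7], curr[0], ..., curr[6]].
        let shifted = _mm256_alignr_epi8(curr, _mm256_permute2x128_si256(prev, curr, 0x21), 12);
        _mm256_sub_epi32(curr, shifted)
    }
}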
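
To make the suggestion easier to check, here is a minimal self-contained sketch that compares the `alignr`-based delta against a scalar reference. It assumes `DataType` is `__m256i` holding eight `u32` lanes, which the snippet does not state. The names `delta_scalar`, `delta_avx2` and `delta8` are hypothetical helpers for illustration and are not part of the crate. The integration (prefix sum) side is not covered here.

```rust
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Scalar reference: each output is the difference with the previous element.
/// The element preceding `curr[0]` is `prev[7]` (last value of the previous block).
pub fn delta_scalar(curr: [u32; 8], prev: [u32; 8]) -> [u32; 8] {
    let mut out = [0u32; 8];
    out[0] = curr[0].wrapping_sub(prev[7]);
    for i in 1..8 {
        out[i] = curr[i].wrapping_sub(curr[i - 1]);
    }
    out
}

/// AVX2 version using the permute + alignr combination from the snippet.
/// Assumption: the block is eight u32 values stored in one __m256i.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx2")]
unsafe fn delta_avx2(curr: [u32; 8], prev: [u32; 8]) -> [u32; 8] {
    unsafe {
        let c = _mm256_loadu_si256(curr.as_ptr() as *const __m256i);
        let p = _mm256_loadu_si256(prev.as_ptr() as *const __m256i);
        // bridge = [prev[4..8], curr[0..4]]
        let bridge = _mm256_permute2x128_si256::<0x21>(p, c);
        // Per 128-bit lane: take the last u32 of `bridge` followed by the
        // first three u32 of `c`, giving [prev[7], curr[0], ..., curr[6]].
        let shifted = _mm256_alignr_epi8::<12>(c, bridge);
        let d = _mm256_sub_epi32(c, shifted);
        let mut out = [0u32; 8];
        _mm256_storeu_si256(out.as_mut_ptr() as *mut __m256i, d);
        out
    }
}

/// Dispatches to AVX2 when the CPU supports it, otherwise falls back to scalar.
pub fn delta8(curr: [u32; 8], prev: [u32; 8]) -> [u32; 8] {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx2") {
            return unsafe { delta_avx2(curr, prev) };
        }
    }
    delta_scalar(curr, prev)
}
```

Calling `delta8` with a sorted block and the previous block should give the same result as `delta_scalar`, whichever path is taken at runtime.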