Add Arm SVE 8-wide (256b) implementation #480

solidpixel · 2024-07-02T13:02:28Z

This PR adds support for Arm SVE via a fixed-width 256-bit implementation, as well as extending the fixed-width 128-bit implementation (which is mostly NEON) with a few targeted SVE operations such as native gathers.

Because we use a 128-bit accumulator to keep floating-point results invariant, it's not really possible to write a true vector-length-agnostic SVE implementation. This version is therefore a compile-time choice that only works on hardware with 256-bit SVE vectors.
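As a rough illustration of the invariance point, here is a scalar sketch (not the astcenc code; `Acc4`, `accumulate_4wide`, and `accumulate_8wide` are hypothetical names). It assumes the 8-wide path folds each vector into the same four-lane accumulator, low half first and then high half, so every lane sees the same additions in the same order as the 4-wide path.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 128-bit accumulator: four float lanes.
using Acc4 = std::array<float, 4>;

// Reference path: consume the input four lanes at a time.
Acc4 accumulate_4wide(const std::vector<float>& data)
{
    assert(data.size() % 4 == 0);
    Acc4 acc{};
    for (std::size_t i = 0; i < data.size(); i += 4)
    {
        for (std::size_t lane = 0; lane < 4; lane++)
        {
            acc[lane] += data[i + lane];
        }
    }
    return acc;
}

// 8-wide path: each 8-lane chunk is folded into the same 4-lane
// accumulator, low half first and then high half, which reproduces
// the per-lane addition order of the 4-wide path exactly.
Acc4 accumulate_8wide(const std::vector<float>& data)
{
    assert(data.size() % 8 == 0);
    Acc4 acc{};
    for (std::size_t i = 0; i < data.size(); i += 8)
    {
        for (std::size_t lane = 0; lane < 4; lane++)
        {
            acc[lane] += data[i + lane];      // low half
        }
        for (std::size_t lane = 0; lane < 4; lane++)
        {
            acc[lane] += data[i + 4 + lane];  // high half
        }
    }
    return acc;
}
```

Splitting a vector into fixed 128-bit halves like this needs the vector length to be known when the code is compiled, which is one way to read the restriction described above.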

On a Neoverse V1 this code is ~30% faster than the equivalent NEON build.

solidpixel · 2024-07-02T14:32:22Z

This should check the SVE width at runtime and report an error on a mismatch. As with the other ISA checks, this is done in the CLI front-end and not in the library.

solidpixel · 2024-07-02T15:29:59Z

SVE runtime checks now added.
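A check of this kind might look roughly like the sketch below. This is not the code from this PR: `BUILD_SVE_BITS`, `query_sve_bits`, `sve_width_ok`, and `check_sve_or_report` are hypothetical names. The only real API used is the ACLE intrinsic `svcntb()`, which returns the SVE vector length in bytes.

```cpp
#include <cstdio>

#if defined(__ARM_FEATURE_SVE)
    #include <arm_sve.h>
#endif

// Hypothetical constant: the vector width this build was compiled for.
static const unsigned int BUILD_SVE_BITS = 256;

// Returns the hardware SVE vector width in bits, or 0 for a non-SVE build.
// Assumption: the caller has already confirmed that the CPU supports SVE
// (for example via OS feature detection) before this is executed.
unsigned int query_sve_bits()
{
#if defined(__ARM_FEATURE_SVE)
    return static_cast<unsigned int>(svcntb()) * 8;
#else
    return 0;
#endif
}

// True if the hardware width matches the width the build expects.
bool sve_width_ok(unsigned int hw_bits, unsigned int build_bits)
{
    return hw_bits == build_bits;
}

// Hypothetical front-end check: report an error and return non-zero
// when the host cannot run this fixed-width build.
int check_sve_or_report()
{
    unsigned int hw_bits = query_sve_bits();
    if (!sve_width_ok(hw_bits, BUILD_SVE_BITS))
    {
        std::fprintf(stderr,
                     "ERROR: host SVE width is %u bits, build requires %u bits\n",
                     hw_bits, BUILD_SVE_BITS);
        return 1;
    }
    return 0;
}
```

Keeping the comparison in a separate helper lets the check be exercised on any host, including ones without SVE.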

solidpixel · 2024-08-02T12:23:44Z

Confirmed that this passes all tests on a Neoverse V1 (as we can't test SVE using GitHub Actions).

solidpixel added the enhancement and performance labels Jul 2, 2024

solidpixel added this to the 4.9.0 milestone Jul 2, 2024

solidpixel self-assigned this Jul 2, 2024

solidpixel force-pushed the sve_support branch 4 times, most recently from f1f6c04 to b9216ca, August 1, 2024 12:10

bengaineyarm reviewed Aug 1, 2024

Source/astcenc_vecmathlib.h (outdated, resolved)

Source/astcenc_vecmathlib_neon_4.h (outdated, resolved)

Source/astcenccli_entry.cpp (outdated, resolved)

solidpixel added 4 commits August 2, 2024 11:57

Add Arm fixed-width 256b SVE vector support.

5ac61a8

Whitespace fix

7b0aef6

Fix whitespace

8a9b586

Address review comments

8079df1

solidpixel force-pushed the sve_support branch from b9216ca to 8079df1, August 2, 2024 11:02

bengaineyarm approved these changes Aug 5, 2024

solidpixel merged commit 213d6c2 into main Aug 5, 2024
4 checks passed

solidpixel deleted the sve_support branch August 5, 2024 09:06
