Some updates #1

Open
@jan-wassenberg

Hi, thanks for farm_sve! I've hooked it up to Highway's tests and noticed some issues:

  • LD1RQ: is the predicate really for individual bits? svptrue doesn't set that many. It would make more sense if the predicate were for the elements, no? (Unfortunately the ARM docs do not clearly specify this.)
  • svunpklo_s/u behave like the hi variants (they also add Size/2)
  • svunpkhi_b depends on the size of the type; should instead have one bit/byte per byte
  • svsplice: op2 index should start from zero
  • svbic: need ~ instead of !
  • svunpklo_u64: loop bound should be op.Size / 2
  • core_qsub: adds instead of subtracting; it also did not handle 0 - 1 correctly
  • svmaxv: use lowest instead of min
  • core_abs: add if (v1 == std::numeric_limits<Type>::min()) return v1;
  • svmul_u16/s16: cast the operands to u32/i32 before multiplying; svmul_u32/s32: cast to u64/i64
  • core_ceil_type_n: just set rounding mode, return nearbyint
  • svtbl: not bounds checked, overrun if > 128
  • svext: op2 index should start from 0
  • core_svrevw: std::swap(data.bt[idx], data.bt[sizeof(Type)/4 - idx - 1]);
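
Several of the scalar fixes in the list above come down to familiar C++ pitfalls. The following is a minimal sketch of the intended behavior, not the code from patch.txt. All helper names (bic, qsub_unsigned, max_reduce, abs_keep_min, mul_u16, run_scalar_examples) are hypothetical, and treating core_qsub as an unsigned saturating subtract is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Bit clear: a AND (NOT b). Logical `!b` would give 0 or 1, not the complement.
template <typename T>
T bic(T a, T b) {
  return static_cast<T>(a & static_cast<T>(~b));
}

// Unsigned saturating subtract (assumed semantics): clamps at zero, so 0 - 1 == 0.
template <typename T>
T qsub_unsigned(T a, T b) {
  return a < b ? T{0} : static_cast<T>(a - b);
}

// Max reduction seeded with lowest(). For floating-point types, min() is the
// smallest positive normal value, so it is the wrong starting point.
template <typename T>
T max_reduce(const T* v, std::size_t n) {
  T m = std::numeric_limits<T>::lowest();
  for (std::size_t i = 0; i < n; ++i) {
    if (v[i] > m) m = v[i];
  }
  return m;
}

// Absolute value for signed integers that returns min() unchanged,
// because negating it is not representable.
template <typename T>
T abs_keep_min(T v) {
  if (v == std::numeric_limits<T>::min()) return v;
  return v < 0 ? static_cast<T>(-v) : v;
}

// 16-bit multiply via a 32-bit unsigned intermediate. Without the casts, both
// operands promote to int and 0xFFFF * 0xFFFF overflows it.
uint16_t mul_u16(uint16_t a, uint16_t b) {
  return static_cast<uint16_t>(static_cast<uint32_t>(a) *
                               static_cast<uint32_t>(b));
}

// Small usage example exercising each helper.
void run_scalar_examples() {
  assert(bic<uint8_t>(0xFF, 0x0F) == 0xF0);
  assert(qsub_unsigned<uint8_t>(0, 1) == 0);
  const float neg[2] = {-3.0f, -1.5f};
  assert(max_reduce(neg, 2) == -1.5f);
  assert(abs_keep_min<int8_t>(-128) == -128);
  assert(mul_u16(0xFFFF, 0xFFFF) == 0x0001);
}
```

The all-negative float input is the case that separates lowest() from min(): seeding with min() would return a tiny positive number that is not in the input at all.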

These are implemented in patch.txt, which is applied to the clang-format output of wget https://gitlab.inria.fr/bramas/farm-sve/-/raw/master/farm_sve.h?inline=false.

Note that svunpkhi_b is a hack: we need it to expand a 16-bit mask to 32-bit, and this is hardcoded in the implementation. To exactly match SVE, I think we'd need to have one predicate bit per byte in the type, then we could have a type-agnostic unpklo/hi_b.
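
To illustrate the one-predicate-bit-per-byte idea, here is a rough sketch of what a type-agnostic unpklo/unpkhi could look like. It reflects my reading of the PUNPKLO/PUNPKHI description and is not farm_sve's actual data layout. The names Pred, ptrue, unpklo, unpkhi, and run_pred_examples are hypothetical, and the 128-bit vector length is an assumption.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Assumed vector length of 128 bits, i.e. 16 bytes and 16 predicate bits.
constexpr std::size_t kVectorBytes = 16;
using Pred = std::bitset<kVectorBytes>;

// All-true predicate for elements of elem_bytes: only the lowest bit of each
// element's group of predicate bits is set.
Pred ptrue(std::size_t elem_bytes) {
  Pred p;
  for (std::size_t i = 0; i < kVectorBytes; i += elem_bytes) p.set(i);
  return p;
}

// Widen the lower half of the predicate: source bit i moves to bit 2*i,
// and the odd bits are cleared. No element type is needed.
Pred unpklo(const Pred& src) {
  Pred dst;
  for (std::size_t i = 0; i < kVectorBytes / 2; ++i) dst[2 * i] = src[i];
  return dst;
}

// Same as unpklo, but reads the upper half of the source predicate.
Pred unpkhi(const Pred& src) {
  Pred dst;
  for (std::size_t i = 0; i < kVectorBytes / 2; ++i) {
    dst[2 * i] = src[kVectorBytes / 2 + i];
  }
  return dst;
}

// Usage: a 16-bit-lane mask expands to a 32-bit-lane mask from either half.
void run_pred_examples() {
  assert(unpklo(ptrue(2)) == ptrue(4));
  assert(unpkhi(ptrue(2)) == ptrue(4));
  assert(unpklo(ptrue(1)) == ptrue(2));
}
```

With this layout the 16-bit to 32-bit expansion no longer needs to be hardcoded, since the same two functions cover every lane width.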

Please let me know if you disagree with any of the other changes.
