Michael R. Crusoe edited this page Feb 1, 2025 · 40 revisions

Here we draft the release notes for the next release.

Note: format is [summary] [commit hash or PR#] [author(s)]

Use the release notes helper script to generate the preliminary list. Then group the changes, review the descriptions, and look out for ????

The first line of the commit message is usually a good summary, but please think through each entry and (re)write a summary that helps users quickly determine whether the change would be interesting or useful to them. For example, include the name of the intrinsic/function in the summary so that users don't have to click through each commit themselves.

SIMDe 0.8.4

Summary

Details

NEON

  • avoid warnings when "__ARM_NEON_FP" is not defined. f046ab7 @clopez
  • vminnmv_f16: remove duplicate statement (#1208) d1d9f82 @mr-c
  • crc32: define SIMDE_ARCH_ARM_CRC32 and consistently use it 01470d2 @mr-c

WASM intrinsics

x86 intrinsics

  • fma: Use 128 bit fnmadd_pd to do 256 bit fnmadd_pd (#1197) bd05320 @AlexK-BD
  • avx: _mm256_storeu_pd and _mm256_loadu_pd using 128 bit lanes 96054b8 @AlexK-BD
  • sse4.2: Apply half tabular method in _mm_crc32 family for the best trade-off between performance and lookup table size 0f68b62 @Cuda-Chen
  • sse2: move definition of 'value' to correct branch in simde_mm_loadl_epi64 b8e468a @K-os
  • some better implementations for MSVC and others without SIMDE_STATEMENT_EXPR_ 1691ae0 @mr-c
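
As an illustration of the sse4.2 entry above: the sketch below assumes that "half tabular" means a half-byte (nibble) lookup, where a 16-entry table replaces the usual 256-entry one and each input byte costs two lookups. That reading, and all of the names used here (`crc32c_nibble_table`, `crc32c_init_nibble_table`, `crc32c_update_u8`, `crc32c_buffer`), are assumptions made for this example; they are not taken from commit 0f68b62 or from the SIMDe sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reflected CRC-32C (Castagnoli) polynomial, the one used by the
 * SSE4.2 crc32 instruction. */
#define CRC32C_POLY_REFLECTED UINT32_C(0x82F63B78)

/* Hypothetical 16-entry table: entry i holds the CRC state after
 * shifting the 4-bit value i through the polynomial. */
static uint32_t crc32c_nibble_table[16];

/* Fill the table once before any other call. */
void crc32c_init_nibble_table(void) {
  for (uint32_t i = 0; i < 16; i++) {
    uint32_t crc = i;
    for (int bit = 0; bit < 4; bit++) {
      crc = (crc >> 1) ^ ((crc & 1) ? CRC32C_POLY_REFLECTED : 0);
    }
    crc32c_nibble_table[i] = crc;
  }
}

/* Raw one-byte update (no initial or final inversion), processed as
 * two 4-bit steps: low nibble first, then high nibble. */
uint32_t crc32c_update_u8(uint32_t crc, uint8_t v) {
  crc ^= v;
  crc = (crc >> 4) ^ crc32c_nibble_table[crc & 0x0F];
  crc = (crc >> 4) ^ crc32c_nibble_table[crc & 0x0F];
  return crc;
}

/* Conventional CRC-32C over a buffer: start from all ones and invert
 * the result at the end. */
uint32_t crc32c_buffer(const void *data, size_t len) {
  assert(data != NULL || len == 0);
  const uint8_t *p = (const uint8_t *) data;
  uint32_t crc = UINT32_C(0xFFFFFFFF);
  for (size_t i = 0; i < len; i++) {
    crc = crc32c_update_u8(crc, p[i]);
  }
  return crc ^ UINT32_C(0xFFFFFFFF);
}
```

After calling `crc32c_init_nibble_table()` once, `crc32c_buffer("123456789", 9)` should return the standard CRC-32C check value `0xE3069283`. The table occupies 64 bytes instead of the 1 KiB needed by a byte-wise table, which is the kind of size versus speed trade-off the entry refers to.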

SVML

XOP

Arch support

Altivec

  • wasm: add u16x8 and u8x16 avgr AltiVec optimized implementations f9bf637 @wrv

arm / arm64

  • wasm: add u16x8 and u8x16 avgr NEON optimized implementations 7e65734 @wrv

LoongArch

  • x86/sse: Fix type convert error for LSX. a6d4207 @yinshiyou
  • float16: use a portable version to avoid compilation errors 600050d @XiWeiGu

RISCV64

  • arm: improve performance in vqadd and vmvn in risc-v 17416b1 @zengdage
  • arm/neon: additional RVV implementations (43 instructions) - part 1 (#1188) 6346405 @Ruhung
  • arm/neon: additional RVV implementations (34 instructions) - part 2. (#1189) c903416 @wewe5215

WASM

  • arm neon st2: add vst2_u8 WASM optimized implementation 9aeb89e @wrv
  • arm neon shll_n: add vshll WASM optimized implementations 1fdca85 @wrv
  • arm neon st4: add vst4_u8 WASM optimized implementation 7f47244 @wrv
  • sse2: remove redundant mm_add_pd optimized implementation for WASM (#1190) 8ee42f6 @wrv
  • sse2: Wasm SIMD version of _mm_sad_epu8 bc37d4b @wrv

z/Arch

  • neon/cvz: stop using deprecated functions. 776d0b6 @mr-c

Compiler Specific

Clang

  • Don't use _Float16 on s390x a1ce45c @mcatanzaro
  • Don't use _Float16 on non-SSE2 x86 40f4d28 @mcatanzaro

GCC

  • Use _Float16 in C++ on aarch64 with GCC 13+ e30e6ec @mcatanzaro
  • arm neon: fix arm64 gcc11 build excess elements in vector failure d370f28 @Qingwu-Li
  • arm neon: avoid vst1_*_x4 built-in functions in GCC 11 and before 557fd6d @Qingwu-Li
  • arm neon sm3: gcc-14 -O3 complained about some possible uninitialized values 99ac62b @mr-c

Emscripten

MSVC

  • add simde_MemoryBarrier to avoid including <windows.h> f47e3c5 @Epixu

Testing with Docker/Podman & CI

  • meson: 0.55.1 is needed for Python 3.12+ 030c07c @mr-c
  • stop testing with MSVC 2022 until they fix their regressions b6ea9ba @mr-c
  • switch container for gcc11 i686 -O2 test 56b7c7a @mr-c
  • GitHub has retired the macos-11 runners, add some more -13 (x86-64) and -14 (arm64) testing 32c959c @mr-c
  • ensure that gcov is present when needed 6f52a1d @mr-c
  • upgrade to Ubuntu 24.04 LTS; upgrade/add GCC 13 / clang 18 d67c190 @mr-c
  • test loongson + lsx with gcc14 from Ubuntu Oracular 59bf8de @mr-c
  • add CI testing for gcc 11 aarch64/arm64 4b96738 @mr-c
  • upgrade gcc-qemu to gcc-14 561556c @mr-c
  • test aarch64 without extra features 6686232 @mr-c
  • add loongarch64 clang-18 test ac3870b @mr-c
  • clean up install list 9cbeced @mr-c
  • pin emsdk to earlier version until https://github.com/llvm/llvm-project/issues/117200 is fixed and released 3257054 @mr-c
  • upgrade Ubuntu Mantic to Ubuntu Noble (24.04) e1bc420 @mr-c
  • macos: xcode 14.3.1 is no longer available, switch to macos-15 to test xcode 16.0 7035777 @mr-c
  • msvc-arm64: turn off due to compiler issue 6802efa @mr-c
  • macos 12: deprecated, going offline on 2024-12-03 2bb7f48 @mr-c

Misc

  • pow: consistently use simde_math_pow 8f727c0 @mr-c
  • math: typo fix, check SIMDE_MATH_NANF instead of the old-style SIMDE_NANF 40567df @mr-c
  • math: Whoops, missing comma 73e43dd @Dave-Lowndes

Template for next time

# Summary
## [X86](https://github.com/simd-everywhere/implementation-status/blob/main/x86.md)
### Newly added function families
### Additions to existing families
## [Neon](https://github.com/simd-everywhere/implementation-status/blob/main/neon.md)
## [MSA](https://github.com/simd-everywhere/implementation-status/blob/main/msa.md)
# Details
## Implementation of Arm intrinsics
### NEON
### SVE Intrinsics
## WASM intrinsics
## x86 intrinsics
### SSE*
### AVX
### AVX2
### AVX512
### GFNI
### XOP
### F16C
### FMA
### SVML
## MIPS MSA intrinsics
## Arch support
### arm64
### z/Arch
### AltiVec
### e2k (Elbrus)
### Power
## Testing with Docker/Podman & CI
### [Appveyor](https://ci.appveyor.com/project/nemequ/simde/history)
### [Azure](https://dev.azure.com/simd-everywhere/SIMDe/_build?definitionId=3)
### [Circle CI](https://app.circleci.com/pipelines/github/simd-everywhere/simde)
### [Cirrus CI](https://cirrus-ci.com/github/simd-everywhere/simde)
### [Local testing with Docker/Podman](https://github.com/simd-everywhere/simde/tree/master/docker#readme)
### [Drone.io](https://cloud.drone.io/simd-everywhere/simde)
### [GitHub Actions](https://github.com/simd-everywhere/simde/actions)
### [Netlify](https://app.netlify.com/sites/simde/)
### [Packit CI](https://dashboard.packit.dev/projects/github.com/simd-everywhere/simde)
### [Semaphore CI](https://nemequ.semaphoreci.com/projects/simde)
### [Travis](https://app.travis-ci.com/github/simd-everywhere/simde)
## Misc