Release Notes
Michael R. Crusoe edited this page Feb 1, 2025 · 40 revisions
Here we draft the release notes for the next release.
Note: format is [summary] [commit hash or PR#] [author(s)]
Use the release notes helper script to generate the preliminary list. Then group the changes, review the descriptions, and look out for ????
The first line of the commit message is usually a good summary, but please think through each entry and (re)write a summary that helps users quickly determine whether the change would be interesting or useful to them. For example, include the name of the intrinsic/function in the summary so that users don't have to click through each commit themselves.
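As an illustration of the entry format described above, here is a minimal sketch of how one bullet could be assembled. It is not the project's helper script; the function names `format_entry` and `needs_review` are hypothetical and exist only for this example.

```python
from typing import Iterable


def format_entry(summary: str, ref: str, authors: Iterable[str]) -> str:
    """Render one bullet as '- [summary] [commit hash or PR#] [author(s)]'.

    Hypothetical helper for illustration; `ref` is a short commit hash or
    a PR number, and each author is written as an @-prefixed handle.
    """
    handles = " ".join(a if a.startswith("@") else "@" + a for a in authors)
    return f"- {summary.strip()} {ref.strip()} {handles}".rstrip()


def needs_review(entry: str) -> bool:
    """Flag entries that still contain the '????' placeholder."""
    return "????" in entry


if __name__ == "__main__":
    entry = format_entry(
        "vminnmv_f16: remove duplicate statement (#1208)", "d1d9f82", ["mr-c"]
    )
    print(entry)
    print(needs_review(entry))
```

A summary that names the affected intrinsic, as in the call above, lets readers judge relevance without opening the commit.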
- avoid warnings when "__ARM_NEON_FP" is not defined. f046ab7 @clopez
- vminnmv_f16: remove duplicate statement (#1208) d1d9f82 @mr-c
- crc32: define `SIMDE_ARCH_ARM_CRC32` and consistently use it 01470d2 @mr-c
- fma: Use 128 bit fnmadd_pd to do 256 bit fnmadd_pd (#1197) bd05320 @AlexK-BD
- avx: `_mm256_storeu_pd` and `_mm256_loadu_pd` using 128 bit lanes 96054b8 @AlexK-BD
- sse4.2: Apply half tabular method in `_mm_crc32` family for the best trade-off between performance and lookup table size 0f68b62 @Cuda-Chen
- sse2: move definition of 'value' to correct branch in `simde_mm_loadl_epi64` b8e468a @K-os
- some better implementations for MSVC and others without `SIMDE_STATEMENT_EXPR_` 1691ae0 @mr-c
- wasm: add u16x8 and u8x16 avgr AltiVec optimized implementations f9bf637 @wrv
- wasm: add u16x8 and u8x16 avgr NEON optimized implementations 7e65734 @wrv
- x86/sse: Fix type convert error for LSX. a6d4207 @yinshiyou
- float16: use a portable version to avoid compilation errors 600050d @XiWeiGu
- arm: improve performance in vqadd and vmvn in risc-v 17416b1 @zengdage
- arm/neon: additional RVV implementations (43 instructions) - part 1 (#1188) 6346405 @Ruhung
- arm/neon: additional RVV implementations (34 instructions) - part 2. (#1189) c903416 @wewe5215
- arm neon st2: add vst2_u8 WASM optimized implementation 9aeb89e @wrv
- arm neon shll_n: add vshll WASM optimized implementations 1fdca85 @wrv
- arm neon st4: add vst4_u8 WASM optimized implementation 7f47244 @wrv
- sse2: remove redundant `mm_add_pd` optimized implementation for WASM (#1190) 8ee42f6 @wrv
- sse2: Wasm SIMD version of `_mm_sad_epu8` bc37d4b @wrv
- neon/cvz: stop using deprecated functions. 776d0b6 @mr-c
- Don't use `_Float16` on s390x a1ce45c @mcatanzaro
- Don't use `_Float16` on non-SSE2 x86 40f4d28 @mcatanzaro
- Use `_Float16` in C++ on aarch64 with GCC 13+ e30e6ec @mcatanzaro
- arm neon: fix arm64 gcc11 build excess elements in vector failure d370f28 @Qingwu-Li
- arm neon: avoid vst1_*_x4 built-in functions in GCC 11 and before 557fd6d @Qingwu-Li
- arm neon sm3: gcc-14 -O3 complained about some possible uninitialized values 99ac62b @mr-c
- add `simde_MemoryBarrier` to avoid including `<windows.h>` f47e3c5 @Epixu
- meson: 0.55.1 is needed for Python 3.12+ 030c07c @mr-c
- stop testing with MSVC 2022 until they fix their regressions b6ea9ba @mr-c
- switch container for gcc11 i686 -O2 test 56b7c7a @mr-c
- GitHub has retired the macos-11 runners, add some more -13 (x86-64) and -14 (arm64) testing 32c959c @mr-c
- ensure that gcov is present when needed 6f52a1d @mr-c
- upgrade to Ubuntu 24.04 LTS; upgrade/add GCC 13 / clang 18 d67c190 @mr-c
- test loongson + lsx with gcc14 from Ubuntu Oracular 59bf8de @mr-c
- add CI testing for gcc 11 aarch64/arm64 4b96738 @mr-c
- upgrade gcc-qemu to gcc-14 561556c @mr-c
- test aarch64 without extra features 6686232 @mr-c
- add loongarch64 clang-18 test ac3870b @mr-c
- clean up install list 9cbeced @mr-c
- pin emsdk to earlier version until https://github.com/llvm/llvm-project/issues/117200 is fixed and released 3257054 @mr-c
- upgrade Ubuntu Mantic to Ubuntu Noble (24.04) e1bc420 @mr-c
- macos: xcode 14.3.1 is no longer available, switch to macos-15 to test xcode 16.0 7035777 @mr-c
- msvc-arm64: turn off due to compiler issue 6802efa @mr-c
- macos 12: deprecated, going offline on 2024-12-03 2bb7f48 @mr-c
- pow: consistently use simde_math_pow 8f727c0 @mr-c
- math: typo fix, check `SIMDE_MATH_NANF` instead of the old-style `SIMDE_NANF` 40567df @mr-c
- math: Whoops, missing comma 73e43dd @Dave-Lowndes
Template for next time
# Summary
## [X86](https://github.com/simd-everywhere/implementation-status/blob/main/x86.md)
### Newly added function families
### Additions to existing families
## [Neon](https://github.com/simd-everywhere/implementation-status/blob/main/neon.md)
## [MSA](https://github.com/simd-everywhere/implementation-status/blob/main/msa.md)
# Details
## Implementation of Arm intrinsics
### NEON
### SVE Intrinsics
## WASM intrinsics
## x86 intrinsics
### SSE*
### AVX
### AVX2
### AVX512
### GFNI
### XOP
### F16C
### FMA
### SVML
## MIPS MSA intrinsics
## Arch support
### arm64
### z/Arch
### AltiVec
### e2k (Elbrus)
### Power
## Testing with Docker/Podman & CI
### [Appveyor](https://ci.appveyor.com/project/nemequ/simde/history)
### [Azure](https://dev.azure.com/simd-everywhere/SIMDe/_build?definitionId=3)
### [Circle CI](https://app.circleci.com/pipelines/github/simd-everywhere/simde)
### [Cirrus CI](https://cirrus-ci.com/github/simd-everywhere/simde)
### [Local testing with Docker/Podman](https://github.com/simd-everywhere/simde/tree/master/docker#readme)
### [Drone.io](https://cloud.drone.io/simd-everywhere/simde)
### [GitHub Actions](https://github.com/simd-everywhere/simde/actions)
### [Netlify](https://app.netlify.com/sites/simde/)
### [Packit CI](https://dashboard.packit.dev/projects/github.com/simd-everywhere/simde)
### [Semaphore CI](https://nemequ.semaphoreci.com/projects/simde)
### [Travis](https://app.travis-ci.com/github/simd-everywhere/simde)
## Misc