Povilas Kanapickas edited this page Apr 18, 2014 · 26 revisions

Releases

The library is developed in C++11. A separate C++03 branch is provided for compatibility with older compilers. Note that the master branch is unstable. If unsure, use one of the releases or at least the latest beta.

1.0

C++11 version

C++03-compatible version

Supported instruction sets: SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, FMA3, FMA4, XOP and NEON

Supported compilers:

  • C++11 version:

    • GCC: 4.8.1, 4.7.3
    • Clang: 3.3, 3.4
  • C++03 version:

    • GCC: 4.8.1, 4.7.3
    • Clang: 3.3, 3.4

Roadmap

A 2.0 release of the library is planned. It contains many new features and a different high-level architecture, which necessitated a major API break.

Implemented changes:

  • Expression template-based backend. It is used only for functions that may benefit from micro-optimizations.
  • Support for vectors much longer than the native vector type. The only limitation is that the length must be a power of 2. The widest available instructions are used for the particular vector type.
  • AVX-512F support
  • Vector initialization is simplified, for example: int32<8> v = make_uint(2); or int* p = ...; v = load(p);.
  • The curiously recurring template pattern (CRTP) is used to categorize vector types. Function templates no longer need to be written for each vector type or combination of types; instead, an appropriate vector category may be used.
  • Each vector type can be explicitly constructed from any other vector with the same size.
  • Most functions accept a much wider range of vector type combinations. For example, bitwise functions accept any two vectors of the same size.
  • If different vector types are used as arguments to such functions, the return type is computed as if one or both of the arguments were "promoted" according to certain rules. For example, int32 + int32 --> int32, whereas uint32 + int32 --> uint32, and uint32 + float32 --> float32. See simdpp/types/tag.h for more information.
  • API break: int128 and int256 types have been removed. On some architectures such as AVX512 it's more efficient to have different physical representations for vectors with different element widths. E.g. 8-bit integer elements would use 256-bit vectors and 32-bit integer elements would use 512-bit vectors.
  • API break: basic_int## types have been removed. The CRTP-based type categorization and promotion rules make a second, inheritance-based vector categorization system impossible. In the majority of cases basic_int## can be straightforwardly replaced with uint##.
  • API break: {vector type}::make_const functions have been removed to simplify the library. Use the new make_int, make_uint and make_float free functions, which produce a construct expression.
  • API break: the broadcast family of functions has been renamed to splat.
  • API break: permute family of functions has been renamed to permute2 and permute4 depending on the number of template arguments taken.
  • API break: value conversion functions such as to_float32x4 have been renamed and now return a vector with the same number of elements as the source vector.
  • API break: saturated add and sub are now called sat_add and sat_sub
  • More API breaks... (grep for 'API break' or 'API-break' in the commit logs)
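
The CRTP categorization and the promotion rules listed above can be illustrated with a small, self-contained sketch. It does not use the library itself: the namespace sketch, the any_vec base, the rank-based promote trait and the scalar add loop are all hypothetical names invented for this illustration, and the assumption that promotion follows a simple "signed int < unsigned int < float" ordering is inferred only from the three examples given above. The real rules live in simdpp/types/tag.h and may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>

namespace sketch {  // hypothetical names; NOT the library's API

// Element-kind tags. The rank order is an assumption that merely
// reproduces the three promotion examples quoted in the list above.
struct tag_int   { static constexpr int rank = 0; };
struct tag_uint  { static constexpr int rank = 1; };
struct tag_float { static constexpr int rank = 2; };

// CRTP category: "any vector with N elements", whatever its concrete type V.
template <std::size_t N, class V>
struct any_vec {
    const V& self() const { return static_cast<const V&>(*this); }
};

// Concrete vector types register themselves with the category.
template <std::size_t N>
struct int32 : any_vec<N, int32<N>> {
    using tag = tag_int;
    using elem = std::int32_t;
    elem d[N];
};

template <std::size_t N>
struct uint32 : any_vec<N, uint32<N>> {
    using tag = tag_uint;
    using elem = std::uint32_t;
    elem d[N];
};

template <std::size_t N>
struct float32 : any_vec<N, float32<N>> {
    using tag = tag_float;
    using elem = float;
    elem d[N];
};

// Promotion: the result is the operand type whose tag has the higher rank.
template <class A, class B>
struct promote {
    using type = typename std::conditional<(A::tag::rank >= B::tag::rank),
                                           A, B>::type;
};

// A single function template covers every pair of same-length vectors,
// because both parameters are expressed through the category.
template <std::size_t N, class A, class B>
typename promote<A, B>::type add(const any_vec<N, A>& a,
                                 const any_vec<N, B>& b) {
    using R = typename promote<A, B>::type;
    using E = typename R::elem;
    R r{};
    // Plain scalar loop standing in for the SIMD instructions.
    for (std::size_t i = 0; i < N; ++i)
        r.d[i] = static_cast<E>(a.self().d[i]) + static_cast<E>(b.self().d[i]);
    return r;
}

}  // namespace sketch
```

With these definitions, add(uint32<4>, int32<4>) deduces N, A and B from the CRTP base and returns uint32<4>, while mixing a uint32<4> with a float32<4> returns float32<4>. Vectors of different lengths fail to compile because N cannot be deduced consistently.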

Planned changes:

  • Visual Studio support

Planned future changes:

  • Support for small vectors, such as int32<1>. Rationale: currently there is no way to specify that an operation should be performed on only a small number of vector elements. A possible solution is to extract appropriate C++ scalars such as int from the vector types and use regular C++. The problem is that the most efficient approach is often to keep the data in the SIMD execution domain and simply perform a wider vector operation, which is quite difficult for the compiler to spot. Implementing small vectors would allow the library to select the most efficient implementation for the target architecture.