@@ -12,19 +12,31 @@ feature flag:
 
 ##### Default Setup
 ```toml
-cfavml = "0.2.0"
+cfavml = "0.3.0"
 ```
 
 ##### No-std Setup
 ```toml
-cfavml = { version = "0.2.0", default-features = false }
+cfavml = { version = "0.3.0", default-features = false }
 ```
 
+### Important Version Upgrade Notes
+
+If you are upgrading across a breaking release, e.g. `0.2.0` to `0.3.0`, there may be important
+changes that affect your system, although I try my best to avoid breaking the public _safe_ APIs.
+
+- AVX512 required CPU features changed in `0.3.0+`
+  * In versions older than `0.3.0`, AVX512 was used when only the `avx512f` CPU feature was available,
+    since this is the base/foundation feature set of AVX512. However, `0.3.0` introduced more extensive
+    cmp operations (`eq/neq/lt/lte/gt/gte`), which changed the required CPU features to include `avx512bw`.
+  * **This means that for _unsafe_ APIs you must update your feature checks to include `avx512bw`.**
+  * **Safe APIs do not require changes but may fall back to AVX2 on some first-gen AVX512 CPUs that lack `avx512bw`, e.g. Knights Landing.**
+
 ### Available SIMD Architectures
 
 - AVX2
 - AVX2 + FMA
-- AVX512
+- AVX512 (`avx512f` + `avx512bw`) _nightly only_
 - NEON
 - Fallback (Typically optimized to SSE automatically by LLVM on x86)
 
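For callers of the _unsafe_ APIs, the updated runtime check called for in the upgrade notes might look roughly like the sketch below. It relies only on the standard library's `is_x86_feature_detected!` macro; the helper name `avx512_ready` is hypothetical and is not part of cfavml.

```rust
/// Hypothetical helper (not a cfavml API): returns true only when both CPU
/// features that the 0.3.0 upgrade notes list for AVX512 are present.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn avx512_ready() -> bool {
    // Before 0.3.0 checking `avx512f` alone was sufficient; the notes above
    // say `avx512bw` must now be checked as well.
    std::is_x86_feature_detected!("avx512f")
        && std::is_x86_feature_detected!("avx512bw")
}

/// On non-x86 targets the detection macro does not exist, so report false.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn avx512_ready() -> bool {
    false
}
```

A caller would typically branch on `avx512_ready()` once and take an AVX2 or fallback path when it returns `false`, which mirrors what the notes describe the safe APIs doing automatically.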
@@ -50,8 +62,8 @@ with SIMD and adds a significant amount of cognitive overhead when reading the c
 Although to be honest I have some serious questions about your application if you're doing
 heavy integer division...
 
-### Supported Operations & Distances
 
+## Supported Operations
 
 ### Spatial distances
 
@@ -80,6 +92,8 @@ These are routines that can be used for things like KNN classification or index
 - Vertical min element of two vectors
 - Vertical max element of a vector and broadcast value
 - Vertical min element of a vector and broadcast value
+- EQ/NEQ/LT/LTE/GT/GTE cmp of a vector and broadcast value
+- EQ/NEQ/LT/LTE/GT/GTE cmp of two vectors
 
 ### Aggregation
 
@@ -102,10 +116,24 @@ provided as generic functions (with no target features):
 - `generic_squared_euclidean`
 - `generic_cosine`
 - `generic_squared_norm`
-- `generic_max_horizontal`
-- `generic_max_vector`
-- `generic_min_horizontal`
-- `generic_min_vector`
+- `generic_cmp_max`
+- `generic_cmp_max_vector`
+- `generic_cmp_max_value`
+- `generic_cmp_min`
+- `generic_cmp_min_vector`
+- `generic_cmp_min_value`
+- `generic_cmp_eq_vector`
+- `generic_cmp_eq_value`
+- `generic_cmp_neq_vector`
+- `generic_cmp_neq_value`
+- `generic_cmp_lt_vector`
+- `generic_cmp_lt_value`
+- `generic_cmp_lte_vector`
+- `generic_cmp_lte_value`
+- `generic_cmp_gt_vector`
+- `generic_cmp_gt_value`
+- `generic_cmp_gte_vector`
+- `generic_cmp_gte_value`
 - `generic_sum`
 - `generic_add_value`
 - `generic_sub_value`
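To make the distinction between the `_value` and `_vector` comparison variants concrete, here is a plain scalar sketch of the two shapes of operation, using `<` as the example. The function names, the `f32` element type, and the `Vec<bool>` output are all assumptions made for illustration; they do not describe cfavml's actual signatures or its result representation.

```rust
/// Hypothetical scalar reference, not a cfavml function: compares every
/// element of `a` against a single broadcast `value` using `<`.
pub fn scalar_cmp_lt_value(a: &[f32], value: f32) -> Vec<bool> {
    a.iter().map(|&x| x < value).collect()
}

/// Hypothetical scalar reference, not a cfavml function: compares
/// `a[i] < b[i]` for each index. Panics if the slices differ in length.
pub fn scalar_cmp_lt_vector(a: &[f32], b: &[f32]) -> Vec<bool> {
    assert_eq!(a.len(), b.len(), "input slices must have equal length");
    a.iter().zip(b).map(|(&x, &y)| x < y).collect()
}
```

The other five comparisons (`eq/neq/lte/gt/gte`) follow the same two shapes with a different operator.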