Commit eae94af

Merge pull request #6 from ChillFish8/CFAVML-2-add-comparrision-ops
CFAVML-2: Add cmp ops (`eq/neq/lt/lte/gt/gte`)
2 parents 31db7e4 + 5615fa7 commit eae94af

47 files changed: +6560 -1124 lines changed

.github/workflows/miri.yaml

Lines changed: 1 addition & 7 deletions
@@ -1,11 +1,5 @@
 name: Run Miri
-on:
-  pull_request:
-    branches:
-      - main
-  push:
-    branches:
-      - main
+on: workflow_dispatch
 
 jobs:
   miri:

.idea/runConfigurations/Test_CFAVML_w_nightly.xml

Lines changed: 1 addition & 1 deletion
Generated file; diff not rendered by default.

.idea/runConfigurations/Test_CFAVML_wo_nightly.xml

Lines changed: 3 additions & 1 deletion
Generated file; diff not rendered by default.

cfavml-gemm/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ description = "BLAS-like general matrix multiplication extension for `cfavml`."
 [dependencies]
 num_cpus = "1.16.0"
 
-cfavml = { version = "0.2", path = "../cfavml" }
+cfavml = { version = "0.3", path = "../cfavml" }
 cfavml-utils = { version = "0.1", path = "../cfavml-utils" }
 
 [dev-dependencies]

cfavml/Cargo.toml

Lines changed: 1 addition & 2 deletions
@@ -1,6 +1,6 @@
 [package]
 name = "cfavml"
-version = "0.2.0"
+version = "0.3.0"
 edition = "2021"
 rust-version = "1.75"
 authors = ["Harrison Burt <[email protected]>"]
@@ -21,7 +21,6 @@ rand_chacha = "0.3.1"
 paste = "1.0.14"
 divan = "0.1.14"
 num-traits = "0.2.19"
-mimalloc = { version = "0.1.43", default-features = false }
 simsimd = "5.0.1"
 
 [target.'cfg(unix)'.dev-dependencies]

cfavml/README.md

Lines changed: 36 additions & 8 deletions
@@ -12,19 +12,31 @@ feature flag:
 
 ##### Default Setup
 ```toml
-cfavml = "0.2.0"
+cfavml = "0.3.0"
 ```
 
 ##### No-std Setup
 ```toml
-cfavml = { version = "0.2.0", default-features = false }
+cfavml = { version = "0.3.0", default-features = false }
 ```
 
+### Important Version Upgrade Notes
+
+If you are upgrading on a breaking release, i.e. `0.2.0` to `0.3.0` there may be some important
+changes that affects your system, although the public _safe_ APIs I try my best to avoid breaking.
+
+- AVX512 required CPU features changed in `0.3.0+`
+  * In versions older than `0.3.0` avx512 was used when only the `avx512f` cpu feature was available
+    since this is the base/foundation version of AVX512. However, in `0.3.0` we introduced more extensive
+    cmp operations (`eq/neq/lt/lte/gt/gte`) which changed our required CPU features to include `avx512bw`
+  * **This means on _unsafe_ APIs you must update your feature checks to include `avx512bw`.**
+  * **Safe APIs do not require changes but may fallback to AVX2 on some of the first gen AVX512 CPUs, i.e. Skylake**
+
 ### Available SIMD Architectures
 
 - AVX2
 - AVX2 + FMA
-- AVX512
+- AVX512 (`avx512f` + `avx512bw`) _nightly only_
 - NEON
 - Fallback (Typically optimized to SSE automatically by LLVM on x86)
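
Note on the `avx512bw` requirement in this hunk: callers of the _unsafe_ AVX512 routines would need a runtime check covering both features before dispatching. The following is a minimal sketch using only the standard library's `is_x86_feature_detected!` macro. The helper name `avx512_cmp_ready` is hypothetical and not part of cfavml, and the actual unsafe call is omitted because its signature is not shown in this diff.

```rust
/// Hypothetical helper (not part of cfavml): returns `true` only when both
/// CPU features that the README lists for AVX512 in `0.3.0+` are present.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn avx512_cmp_ready() -> bool {
    // Before `0.3.0` a check for `avx512f` alone was sufficient;
    // from `0.3.0` onward `avx512bw` has to be detected as well.
    is_x86_feature_detected!("avx512f") && is_x86_feature_detected!("avx512bw")
}

/// On non-x86 targets there is no AVX512, so the check is always `false`.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn avx512_cmp_ready() -> bool {
    false
}
```

A caller would branch on `avx512_cmp_ready()` and use an AVX2 or fallback path when it returns `false`, which mirrors what the README says the safe APIs do on their own.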

@@ -50,8 +62,8 @@ with SIMD and adds a significant amount of cognitive overhead when reading the c
 Although to be honest I have some serious questions about your application if you're doing
 heavy integer division...
 
-### Supported Operations & Distances
 
+## Supported Operations
 
 ### Spacial distances
 
@@ -80,6 +92,8 @@ These are routines that can be used for things like KNN classification or index
 - Vertical min element of two vectors
 - Vertical max element of a vector and broadcast value
 - Vertical min element of a vector and broadcast value
+- EQ/NEQ/LT/LTE/GT/GTE cmp of a vector and broadcast value
+- EQ/NEQ/LT/LTE/GT/GTE cmp of two vectors
 
 ### Aggregation
 
@@ -102,10 +116,24 @@ provided as generic functions (with no target features):
 - `generic_squared_euclidean`
 - `generic_cosine`
 - `generic_squared_norm`
-- `generic_max_horizontal`
-- `generic_max_vector`
-- `generic_min_horizontal`
-- `generic_min_vector`
+- `generic_cmp_max`
+- `generic_cmp_max_vector`
+- `generic_cmp_max_value`
+- `generic_cmp_min`
+- `generic_cmp_min_vector`
+- `generic_cmp_min_value`
+- `generic_cmp_eq_vector`
+- `generic_cmp_eq_value`
+- `generic_cmp_neq_vector`
+- `generic_cmp_neq_value`
+- `generic_cmp_lt_vector`
+- `generic_cmp_lt_value`
+- `generic_cmp_lte_vector`
+- `generic_cmp_lte_value`
+- `generic_cmp_gt_vector`
+- `generic_cmp_gt_value`
+- `generic_cmp_gte_vector`
+- `generic_cmp_gte_value`
 - `generic_sum`
 - `generic_add_value`
 - `generic_sub_value`
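
Note on the new cmp routines: the `_vector` and `_value` suffixes in the list correspond to the two shapes the README describes, "cmp of two vectors" and "cmp of a vector and broadcast value". The scalar sketch below illustrates those semantics only. `CmpOp`, `cmp_vector` and `cmp_value` are hypothetical names, not cfavml's API, and the `Vec<bool>` result is an assumption, since this diff does not show how the library encodes its output.

```rust
/// The six comparison operators named in the commit title.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CmpOp {
    Eq,
    Neq,
    Lt,
    Lte,
    Gt,
    Gte,
}

/// Applies one operator to a single pair of elements.
fn apply<T: PartialOrd>(op: CmpOp, a: &T, b: &T) -> bool {
    match op {
        CmpOp::Eq => a == b,
        CmpOp::Neq => a != b,
        CmpOp::Lt => a < b,
        CmpOp::Lte => a <= b,
        CmpOp::Gt => a > b,
        CmpOp::Gte => a >= b,
    }
}

/// "cmp of two vectors": element `i` of `a` is compared with element `i` of `b`.
pub fn cmp_vector<T: PartialOrd>(op: CmpOp, a: &[T], b: &[T]) -> Vec<bool> {
    assert_eq!(a.len(), b.len(), "inputs must have the same length");
    a.iter().zip(b).map(|(x, y)| apply(op, x, y)).collect()
}

/// "cmp of a vector and broadcast value": every element of `a` is compared
/// with the same `value`.
pub fn cmp_value<T: PartialOrd>(op: CmpOp, a: &[T], value: T) -> Vec<bool> {
    a.iter().map(|x| apply(op, x, &value)).collect()
}
```

For example, `cmp_value(CmpOp::Gte, &[1, 2, 3], 2)` compares each element against `2`, while `cmp_vector` pairs elements by index. A SIMD implementation would process several lanes per instruction, but the per-element result should match this scalar form.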
