% export OMP_NUM_THREADS=1
% python -m blis.benchmark
Setting up data for gemm. 1000 iters, nO=384 nI=384 batch_size=2000
Blis gemm...
Total: 11032014.6484375
9.54 seconds
Numpy (openblas) gemm...
Total: 11032015.625
9.50 seconds
Blis einsum ab,cb->ca
Total: 5510590.8203125
9.78 seconds
Numpy (openblas) einsum ab,cb->ca
Total: 5510596.19140625
90.67 seconds
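To make clear what is being timed, here is a rough plain-NumPy reconstruction of the workload. The operand shapes and the transpose are assumptions inferred from the printed parameters (nO=384, nI=384, batch_size=2000), not taken from blis.benchmark's source:

import time
import numpy as np

# Assumed shapes, inferred from the printed parameters; the actual layout
# inside blis.benchmark may differ.
nO, nI, batch_size = 384, 384, 2000
X = np.random.rand(batch_size, nI)   # activations: (batch_size, nI)
W = np.random.rand(nO, nI)           # weights: (nO, nI)

# gemm-style multiply: (batch_size, nI) x (nI, nO) -> (batch_size, nO)
Y = X @ W.T

# The einsum spec 'ab,cb->ca' contracts the shared b dimension:
# out[c, a] = sum_b X[a, b] * W[c, b], i.e. the same product, transposed.
Z = np.einsum('ab,cb->ca', X, W)
assert np.allclose(Z, (X @ W.T).T)

start = time.perf_counter()
for _ in range(100):                 # 100 iterations to keep the sketch quick
    X @ W.T
print(f"gemm, 100 iters: {time.perf_counter() - start:.2f} seconds")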
numpy with OpenBLAS and blis are on par for gemm. However, this run does not use intermediate optimization in numpy's einsum. Enabling it by passing optimize=True:
% python -m blis.benchmark
Setting up data for gemm. 1000 iters, nO=384 nI=384 batch_size=2000
Blis gemm...
Total: 11032014.6484375
9.62 seconds
Numpy (openblas) gemm...
Total: 11032015.625
9.51 seconds
Blis einsum ab,cb->ca
Total: 5510590.8203125
9.70 seconds
Numpy (openblas) einsum ab,cb->ca
Total: 5510592.28515625
11.43 seconds
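As far as I understand numpy's einsum, passing optimize=True makes it compute a contraction path first, and a two-operand contraction like this one can then be handed off to the BLAS-backed tensordot/dot machinery instead of einsum's generic C loop, which would explain the drop from ~90 to ~11 seconds. A minimal sketch of the comparison (same assumed shapes as above):

import time
import numpy as np

X = np.random.rand(2000, 384)
W = np.random.rand(384, 384)

def bench(optimize, iters=100):
    start = time.perf_counter()
    for _ in range(iters):
        np.einsum('ab,cb->ca', X, W, optimize=optimize)
    return time.perf_counter() - start

print("optimize=False:", round(bench(False), 2), "seconds")
print("optimize=True: ", round(bench(True), 2), "seconds")

# einsum_path prints the contraction plan the optimized path would use.
print(np.einsum_path('ab,cb->ca', X, W, optimize='optimal')[1])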
With optimize=True, numpy's einsum is only slightly slower than blis. However, I am skeptical of the claim that parallelization does not help in inference. The matrix sizes used in the benchmark are fairly typical for inference (e.g. the standard transformer attention matrices are 768x768). Testing with 4 threads (which is fairly modest on current multi-core SMT CPUs):
% export OMP_NUM_THREADS=4
% python -m blis.benchmark
Setting up data for gemm. 1000 iters, nO=384 nI=384 batch_size=2000
Blis gemm...
Total: 11032014.6484375
9.77 seconds
Numpy (openblas) gemm...
Total: 11032015.625
3.40 seconds
Blis einsum ab,cb->ca
Total: 5510590.8203125
9.83 seconds
Numpy (openblas) einsum ab,cb->ca
Total: 5510592.28515625
4.53 seconds
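The env var comparison above requires restarting the process for each thread count. As an aside, the third-party threadpoolctl package (not used by blis.benchmark; pip install threadpoolctl) can inspect and limit the BLAS thread pools at runtime, which makes the 1-vs-4-thread comparison reproducible from a single script:

import time
import numpy as np
from threadpoolctl import threadpool_info, threadpool_limits

X = np.random.rand(2000, 384)
W = np.random.rand(384, 384)

def bench(iters=200):
    start = time.perf_counter()
    for _ in range(iters):
        X @ W.T
    return time.perf_counter() - start

# Report which BLAS/OpenMP libraries are loaded and their thread counts.
for pool in threadpool_info():
    print(pool["internal_api"], pool["num_threads"])

# Runtime equivalent of the OMP_NUM_THREADS comparison above.
with threadpool_limits(limits=1, user_api="blas"):
    print("1 thread :", round(bench(), 2), "seconds")
with threadpool_limits(limits=4, user_api="blas"):
    print("4 threads:", round(bench(), 2), "seconds")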
Maybe it's worthwhile compiling blis with multi-threading support?
For reference:
% lscpu | grep name:
Model name: Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz