[Feature Request] Need Matmul Attention layer instead of Einsum to support GPU running #502

Open

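The Einsum kernels in Praxis can't be lowered to cuDNN GEMM, and compute performance on GPU is seriously affected. The JAX version of the Attention layer is much slower than the TensorFlow version. Could a Matmul-based Attention layer be provided instead of the Einsum one?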
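To make the request concrete, here is a minimal sketch of what "Matmul instead of Einsum" could look like for dot-product attention in JAX. The function names `attention_einsum` and `attention_matmul`, the `[B, T, N, H]` layout, and the toy shapes are assumptions for illustration, not Praxis APIs. The sketch only checks that the two formulations agree numerically; it does not show any difference in how XLA lowers them on GPU.

```python
import jax
import jax.numpy as jnp


def attention_einsum(q, k, v):
    """Dot-product attention written with einsum. Inputs are [B, T, N, H]."""
    scale = q.shape[-1] ** -0.5
    # Contract over the per-head dimension H to get [B, N, T, S] logits.
    logits = jnp.einsum("BTNH,BSNH->BNTS", q * scale, k)
    probs = jax.nn.softmax(logits, axis=-1)
    # Weighted sum over source positions S, back to [B, T, N, H].
    return jnp.einsum("BNTS,BSNH->BTNH", probs, v)


def attention_matmul(q, k, v):
    """Same computation using explicit transposes and batched jnp.matmul."""
    scale = q.shape[-1] ** -0.5
    # Move the head axis forward: [B, T, N, H] -> [B, N, T, H].
    q, k, v = (jnp.transpose(x, (0, 2, 1, 3)) for x in (q, k, v))
    logits = jnp.matmul(q * scale, jnp.swapaxes(k, -1, -2))  # [B, N, T, S]
    probs = jax.nn.softmax(logits, axis=-1)
    out = jnp.matmul(probs, v)  # [B, N, T, H]
    return jnp.transpose(out, (0, 2, 1, 3))  # back to [B, T, N, H]


# Toy inputs (hypothetical sizes): batch 2, target 5, source 7, heads 4, dim 8.
key_q, key_k, key_v = jax.random.split(jax.random.PRNGKey(0), 3)
q = jax.random.normal(key_q, (2, 5, 4, 8))
k = jax.random.normal(key_k, (2, 7, 4, 8))
v = jax.random.normal(key_v, (2, 7, 4, 8))

out_einsum = attention_einsum(q, k, v)
out_matmul = attention_matmul(q, k, v)
max_abs_diff = float(jnp.max(jnp.abs(out_einsum - out_matmul)))
```

Comparing both functions under `jax.jit` on the target GPU would be the way to confirm whether the Matmul form actually helps for a given model.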
