v0.21
Performance optimizations
- Improved int8 and fp32 GEMM and inner product performance.
- Improved reorder performance for certain shapes.
- Improved RNN, LSTM, GRU and LBR-GRU training performance.
New functionality
- Added GELU activation support.
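
For reference, GELU is commonly defined as x * Phi(x), where Phi is the standard normal cumulative distribution function, and it is often computed with a tanh-based approximation. These notes do not say which formulation the library uses, so the Python sketch below is only an illustration of the math and not library code. The function names `gelu_exact` and `gelu_tanh` are hypothetical.

```python
import math


def gelu_exact(x: float) -> float:
    """GELU(x) = x * Phi(x), with Phi the standard normal CDF (via erf)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Widely used tanh-based approximation of GELU."""
    k = math.sqrt(2.0 / math.pi)
    return 0.5 * x * (1.0 + math.tanh(k * (x + 0.044715 * x ** 3)))
```

The two forms agree closely over typical activation ranges. The tanh form avoids an erf evaluation, which is one reason implementations may prefer it.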
Thanks to the contributors
This release contains contributions from many Intel Performance Libraries developers. We would also like to thank everyone who asked questions and reported issues.