v1.0.2

@tprimak released this 16 Aug 22:28
· 40 commits to rls-v1.0 since this release

This is a patch release containing the following changes to Intel MKL-DNN v1.0.1:

  • Fixed an issue with bfloat16 instruction detection in Xbyak (0f4ba11)
  • Fixed buffer size in packed GEMM (9764940)
  • Fixed an offset calculation issue in weight-update depthwise convolution in fp32 and bfloat16 kernels (6b9d412, 061499d)
  • Added a check that the size of a generated kernel doesn't exceed the maximum allowed bound in fp32 forward and backward kernels (67e8cd2)
  • Various fixes in RNN primitive:
    • Added proper handling of packed GEMM in extended GEMM (4eb9f56)
    • Forced no-copy GEMM only for Intel AVX+ systems (2fbc8ba)
    • Avoided use of unaligned pointers in VEX instructions in GRU cell (a147c08)
    • Fixed wrong dimension when creating GEMM primitive descriptor in reference RNN implementation for GPU (eb3c866)
    • Fixed Tanh backward calculation in GPU RNN reference implementation (f6e4b97)
    • Fixed packed GEMM dispatching for int8 (16b46c7)
    • Addressed bugs in tests for RNNs (cf83e83, f7c2de2, 960f3f3)