v1.0.3

@vpirogov released this 22 Oct 19:57 · 8 commits to rls-v1.0 since this release

This is a patch release containing the following changes to v1.0.2:

  • Fixed zero padding for memory formats with rank 3 and below (4d78aaf)
  • Fixed tail scaling for int8 inner product (41b5a7e)
  • Sum no longer overrides the data type of the destination memory descriptor when used with any (e979eda)
  • Improved s8s8 GEMM and inner product performance (4b44aa5)
  • Reduced memory consumption of GEMM-based algorithm for convolution weight gradient (f46b044)
  • Fixed negative padding processing in pooling (48ba96a)
  • Addressed memory leak in GPU deconvolution (686fc41)
  • Addressed memory leak in GPU stream (1206b2f)
  • Fixed fp16 GEMM correctness on GPU (c2425d4)
  • Fixed GEMM correctness on GPU for the case of small M dimension (ac2683f)
  • Addressed the following corner cases in the CPU convolution implementation:
    • Fixed tail processing in int8 depthwise convolution (3a0943b)
    • Fixed bias padding in bfloat16 depthwise convolution (3d9af7c)
    • Fixed correctness issue in s8s8 flavor of depthwise convolution (e4d9049)
    • Fixed correctness issue in GEMM-based algorithm for 3D convolutions (161ac40)
    • Fixed corner case issues in Intel AVX512 implementation of convolution weight gradient (68f5124)
    • Disabled unsupported cases for depthwise convolution weight gradient (5e6e6c8)
    • Convolution with a 1x1 filter now returns unimplemented for cases that have padding in spatial dimensions (9d7cc77)
    • Fixed negative padding support in general convolution kernel (b1c602a)
    • Fixed padding handling in depthwise convolution backpropagation (04712f6)
    • Added support for negative padding in h and d spatial dimensions (7ddce82)
    • Fixed segfault in strided convolution backpropagation (b04f3f5)
    • Fixed memory corruption in convolution backpropagation (8877bc9)
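
Several of the fixes above concern negative padding in pooling and convolution. As a rough illustration of what that means, the sketch below applies the standard output-size formula to one spatial dimension. It is not code from the library: the function name `conv_output_size` and its parameters are hypothetical, and dilation is left out for brevity. Under these assumptions, negative padding simply crops the input before the kernel is applied.

```python
def conv_output_size(in_size, kernel, stride=1, pad_left=0, pad_right=0):
    """Output size along one spatial dimension (hypothetical helper).

    Negative padding values crop the input instead of extending it.
    """
    # Size of the input after padding (or cropping, if padding is negative).
    effective = in_size + pad_left + pad_right
    if effective < kernel:
        raise ValueError("kernel does not fit into the padded input")
    # Standard formula: number of kernel positions that fit with the given stride.
    return (effective - kernel) // stride + 1


# No padding: a kernel of 3 over 8 elements fits in 6 positions.
no_pad = conv_output_size(8, 3)
# Positive padding of 1 on each side keeps the size at 8.
same_pad = conv_output_size(8, 3, pad_left=1, pad_right=1)
# Negative padding of 1 on each side crops the input to 6 elements first.
neg_pad = conv_output_size(8, 3, pad_left=-1, pad_right=-1)
```

Here `no_pad`, `same_pad`, and `neg_pad` evaluate to 6, 8, and 4 respectively, which shows why a kernel that mishandles negative padding reads from or writes to the wrong positions.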