v1.0.3
This is a patch release containing the following changes to v1.0.2:
- Fixed zero padding for memory formats with rank 3 and below (4d78aaf)
- Fixed tail scaling for int8 inner product (41b5a7e)
- Sum does not override the data type for destination memory descriptor when used with `any` (e979eda)
- Improved s8s8 GEMM and inner product performance (4b44aa5)
- Reduced memory consumption of GEMM-based algorithm for convolution weight gradient (f46b044)
- Fixed negative padding processing in pooling (48ba96a)
- Addressed memory leak in GPU deconvolution (686fc41)
- Addressed memory leak in GPU stream (1206b2f)
- Fixed fp16 GEMM correctness on GPU (c2425d4)
- Fixed GEMM correctness on GPU for the case of small M dimension (ac2683f)
- Addressed the following corner cases in the CPU convolution implementation:
- Fixed tail processing in int8 depthwise convolution (3a0943b)
- Fixed bias padding in bfloat16 depthwise convolution (3d9af7c)
- Fixed correctness issue in s8s8 flavor of depthwise convolution (e4d9049)
- Fixed correctness issue in GEMM-based algorithm for 3D convolutions (161ac40)
- Fixed corner case issues in Intel AVX512 implementation of convolution weight gradient (68f5124)
- Disabled unsupported cases for depthwise convolution weight gradient (5e6e6c8)
- Convolution with 1x1 filter returns `unimplemented` for cases that have padding in spatial dimensions (9d7cc77)
- Fixed negative padding support in general convolution kernel (b1c602a)
- Fixed padding handling in depthwise convolution backpropagation (04712f6)
- Added support for negative padding in `h` and `d` spatial dimensions (7ddce82)
- Fixed segfault in strided convolution backpropagation (b04f3f5)
- Fixed memory corruption in convolution backpropagation (8877bc9)