v1.1.2

@tprimak released this 24 Dec 21:06
· 12 commits to rls-v1.1 since this release

This is a patch release containing the following changes to v1.1.1:

  • Fixed threading over the spatial dimensions in bfloat16 batch normalization (017b6c9)
  • Fixed read past end-of-buffer error for int8 convolution (7d6f45e)
  • Fixed condition for dispatching optimized channel blocking in fp32 backward convolution on Intel Xeon Phi(TM) processor (846eba1)
  • Fixed fp32 backward convolution for shapes with spatial strides over the depth dimension (002e3ab)
  • Fixed softmax with zero sizes on GPU (936bff4)
  • Fixed int8 deconvolution with dilation when ih <= dh (3e3bacb)
  • Re-enabled fp32 -> u8 reorder for RNN (a2c2507)
  • Fixed segmentation fault in bfloat16 backward convolution caused by the kd_padding=0 computation (52d476c)
  • Fixed segmentation fault in bfloat16 forward convolution due to push/pop imbalance (4f6e3d5)
  • Fixed library version for OS X build (0d85005)
  • Fixed padding by channels in concat (a265c7d)
  • Added full text of third party licenses and copyright notices to LICENSE file (79f204c)
  • Added separate README for binary packages (28f4c96)
  • Fixed computation of the per-oc mask in RNN (ff3ffab)
  • Added a workaround for the number-of-cores calculation in Xbyak (301b088)