Release v3.9.1 · uxlfoundation/oneDNN

This is a patch release containing the following changes to v3.9:

Reduced sizes in Graph API SDPA examples (257d689)
Fixed correctness issue in bf16 depthwise convolution with bf16 bias on AArch64 CPUs (218b41d)
Changed Intel GPU data alignment check from error to warning (5c5008a)
Improved bf16 matmul performance on processors with Intel AMX instruction set support (54b6354, 30c4d8d)
Fixed PowerPC64 build by adding -mcpu=power10 and -mmma flags (02ca915)
Introduced support for f16 destination in int8 matmul and int8 inner product on x64 CPUs (a62ed6b, 53c0a66, 0750043, 4f0f068)
Introduced support for per_tensor zero-points in int8 matmul on Intel GPUs (db8e8ff, f783164, 4d458df, 80453a0, 7f90d50, a2200e2)
Fixed correctness issue in int8 reorder for cases with compensation on x64 CPUs (771ca54)
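
For readers unfamiliar with the term, the arithmetic behind per_tensor zero-points in an int8 matmul can be sketched as follows. This is a plain-Python reference of the math only, not oneDNN code or its API; the function name `int8_matmul_per_tensor_zp` and its parameters are hypothetical, and it assumes a single scalar zero-point per input tensor that is subtracted before accumulation.

```python
def int8_matmul_per_tensor_zp(a, b, zp_a=0, zp_b=0):
    """Reference matmul where each input tensor has one scalar zero-point.

    Hypothetical helper for illustration; not part of oneDNN.
    a is an M x K list of ints, b is a K x N list of ints.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0  # wide integer accumulator (Python ints do not overflow)
            for k in range(inner):
                # The same zero-point applies to every element of a tensor.
                acc += (a[i][k] - zp_a) * (b[k][j] - zp_b)
            out[i][j] = acc
    return out


a = [[3, 5], [7, 9]]
b = [[2, 4], [6, 8]]
result = int8_matmul_per_tensor_zp(a, b, zp_a=1, zp_b=2)
print(result)
```

With both zero-points left at 0, the function reduces to an ordinary integer matmul, which makes it easy to check against a known result.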
