SME1 based direct kernel (with alpha and beta) for cblas_sgemm level 3 #5380
Thanks - this looks like an interesting addition (if not a competitor) to the present sgemm_direct kernel. The CI error suggests that perhaps the entire kernel needs to be guarded with the __ARM_FEATURE_SME define (or the ...2VLx2VL function should have an empty alternative for when that feature macro is undefined)?
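The guard suggested above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the function names `example_sgemm_direct_sme` and `example_sme_kernel_compiled` are hypothetical (the real kernel name is elided in the thread), and the SME body is left as a placeholder.

```c
#include <stddef.h>

#if defined(__ARM_FEATURE_SME)

/* SME build: the streaming-mode kernel body would live here.
   It is omitted in this sketch, so the function is a placeholder. */
int example_sme_kernel_compiled(void) { return 1; }

void example_sgemm_direct_sme(size_t m, size_t n, size_t k,
                              const float *a, const float *b, float *c)
{
    (void)m; (void)n; (void)k; (void)a; (void)b; (void)c;
    /* real SME code would go here */
}

#else

/* Non-SME build: an empty alternative keeps the symbol defined so that
   callers still link when the feature macro is undefined. */
int example_sme_kernel_compiled(void) { return 0; }

void example_sgemm_direct_sme(size_t m, size_t n, size_t k,
                              const float *a, const float *b, float *c)
{
    (void)m; (void)n; (void)k; (void)a; (void)b; (void)c;
}

#endif
```

With the empty alternative in place, a caller is expected to consult some availability check (here the hypothetical `example_sme_kernel_compiled`) before dispatching to the kernel.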
38540ea to 442273d
Hi Martin. Thanks for your quick review; it's very helpful. I have updated the PR to address the two issues: I added an empty alternative for when the feature macro is undefined, and I modified the copyright statement.
442273d to 366deb1
Seems we now have AppleClang acting up over things in its own arm_sme header file.
366deb1 to 831c4e3
From the error log, I understand that the error comes from clang-15.0.0 not supporting the 'arm_streaming_' attributes. I tried to reproduce it locally but found that SME isn't supported in clang-15. As a mitigation, I am adding an extra guard alongside __ARM_FEATURE_SME, and I have pushed that update. If that doesn't work, I can explicitly check the clang version instead.
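The thread does not show the extra guard that was actually pushed. As an illustration of the fallback idea only, the sketch below combines the feature macro with an explicit clang version check. The macro `EXAMPLE_HAVE_SME`, the helper `example_have_sme`, and the version cutoff of 16 are all assumptions made for this example; the real cutoff may differ, and AppleClang version numbers do not necessarily match upstream clang releases.

```c
/* Assumption: clang releases with a major version below 16 are treated as
   lacking usable SME support. This cutoff is illustrative only. */
#if defined(__ARM_FEATURE_SME) && \
    !(defined(__clang__) && (__clang_major__ < 16))
#define EXAMPLE_HAVE_SME 1
#else
#define EXAMPLE_HAVE_SME 0
#endif

/* Reports whether the SME code path was enabled at compile time. */
int example_have_sme(void) { return EXAMPLE_HAVE_SME; }
```

Collapsing the condition into a single macro means the kernel source and its empty alternative can both test one name instead of repeating the compiler check.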
831c4e3 to 70ef30c
70ef30c to eae0abf
@martin-frbg
Both are unrelated - the loongarch job ran out of time, and the IBM-Z build on Jenkins failed to access github (and still does today).
Thanks!
This PR adds an sgemm_direct kernel, with support for alpha and beta, based on the SME1 architecture.
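For readers unfamiliar with the alpha and beta parameters, the standard GEMM operation is C = alpha * A * B + beta * C. The plain C reference below sketches that arithmetic for row-major, non-transposed matrices. It is not the SME kernel from this PR; the function name `ref_sgemm_alpha_beta` and the row-major layout are assumptions chosen for illustration.

```c
#include <stddef.h>

/* Reference computation of C = alpha * A * B + beta * C.
   A is m x k, B is k x n, C is m x n, all row-major and contiguous. */
void ref_sgemm_alpha_beta(size_t m, size_t n, size_t k, float alpha,
                          const float *a, const float *b,
                          float beta, float *c)
{
    for (size_t i = 0; i < m; i++) {
        for (size_t j = 0; j < n; j++) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; p++) {
                acc += a[i * k + p] * b[p * n + j];
            }
            /* When beta is zero, the old value of C is not read, so
               uninitialized or NaN input in C cannot leak into the result. */
            if (beta == 0.0f) {
                c[i * n + j] = alpha * acc;
            } else {
                c[i * n + j] = alpha * acc + beta * c[i * n + j];
            }
        }
    }
}
```

A reference like this is mainly useful for checking an optimized kernel's output on small inputs.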