Add CPUID for AvxVnniInt8 and AvxVnniInt16 #113956
6714249 to 141d643
@tannergooding This is the first of the 2 PRs needed for the AVX VNNI INT* API introduction #112586
src/coreclr/tools/Common/JitInterface/ThunkGenerator/InstructionSetDesc.txt
141d643 to 98fc970
@tannergooding @saucecontrol I have added the CPUID, API surface, JIT handling and template tests here.
e6cf454 to 90fa072
src/tests/JIT/HardwareIntrinsics/X86_AvxVnniInt16/AvxVnniInt16/AvxVnniInt16SampleTest.cs
src/tests/JIT/HardwareIntrinsics/X86_AvxVnniInt16/Directory.Build.props
src/coreclr/jit/compiler.cpp
//
// Return Value:
// The 64-bit only InstructionSet associated with isa
static CORINFO_InstructionSet X64VersionOfIsa(CORINFO_InstructionSet isa)
nit: All of these could have stayed in the respective `hwintrinsicxarch.cpp` and `hwintrinsicarm64.cpp` files.
We already implement other `Compiler::*` methods in such files, so we should've been able to keep the diffs minimal by just changing `lookupInstructionSet` to `Compiler::lookupInstructionSet`, and similarly for `lookupIsa`.
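To illustrate the nit, here is a minimal sketch of a class member that is declared in one place and defined, with its `Compiler::` qualifier, in an architecture-specific source file. The `Compiler` class, the `SketchIsa` enum and the signature below are simplified stand-ins for illustration only, not the JIT's actual declarations.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for an instruction-set enum; not the real CORINFO type.
enum SketchIsa
{
    ISA_NONE,
    ISA_AVXVNNI
};

// Would live in a shared header: only the declaration is needed here.
class Compiler
{
public:
    static SketchIsa lookupInstructionSet(const char* className);
};

// Would stay in the architecture-specific .cpp file. Compared with a free
// function, the only textual change is the added "Compiler::" qualifier,
// so the function body does not have to move to another file.
SketchIsa Compiler::lookupInstructionSet(const char* className)
{
    assert(className != nullptr);
    return (std::strcmp(className, "AvxVnni") == 0) ? ISA_AVXVNNI : ISA_NONE;
}
```

Because C++ allows a member function to be defined in any translation unit that sees the class declaration, the existing bodies can remain where they are and the diff shrinks to the qualifier change.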
@@ -849,6 +849,17 @@ void CodeGen::genHWIntrinsic(GenTreeHWIntrinsic* node)
break;
}

case NI_AVXVNNI_MultiplyWideningAndAdd:
nit: This could be moved back to minimize the diff/code churn
LGTM, minus a couple nits on ways to simplify the diffs.
CC. @dotnet/jit-contrib for secondary review
Co-authored-by: Tanner Gooding <[email protected]>
@khushal1996 could you please resolve the merge conflict?
@@ -278,6 +289,23 @@ bool emitter::IsVexEncodableInstruction(instruction ins) const
return emitComp->compSupportsHWIntrinsic(InstructionSet_AVXVNNI);
}

case INS_vpdpwsud:
The merge conflict is notably highlighting that we missed adding these instructions to the `perfScore` handling here in emitxarch.
With the new setup, we can just add the latency/throughput info directly as part of the instruction table instead, which makes the process more streamlined for the typical case.
If we don't have exact timings for these yet, then I'd mirror the values we used for the AVX-VNNI instructions instead.
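As a rough illustration of the idea of keeping latency/throughput next to each instruction entry, here is a minimal X-macro sketch. The macro name, the column layout, the `SketchPerf` type and the numeric values are all assumptions made for this example; they are placeholders, not the JIT's real instruction table or measured timings.

```cpp
#include <cassert>

// Hypothetical table: mnemonic, latency (cycles), throughput (cycles/instruction).
// The numbers are placeholders only. The new INT16 entry simply copies the
// values of an existing AVX-VNNI entry, as suggested when exact timings are
// not yet known.
#define SKETCH_INSTRUCTIONS(ENTRY) \
    ENTRY(vpdpbusd, 5.0f, 0.5f)    \
    ENTRY(vpdpwsud, 5.0f, 0.5f)

// One enum value per table row.
enum SketchIns
{
#define ENTRY(name, lat, tput) SK_##name,
    SKETCH_INSTRUCTIONS(ENTRY)
#undef ENTRY
    SK_COUNT
};

struct SketchPerf
{
    const char* name;
    float       latency;
    float       throughput;
};

// The perf data is generated from the same rows, so adding an instruction
// and adding its timing information happen in a single place.
static const SketchPerf kSketchPerf[] = {
#define ENTRY(name, lat, tput) {#name, lat, tput},
    SKETCH_INSTRUCTIONS(ENTRY)
#undef ENTRY
};

inline const SketchPerf& LookupPerf(SketchIns ins)
{
    assert(ins < SK_COUNT);
    return kSketchPerf[ins];
}
```

With this shape, a separate per-instruction `switch` for perf data is unnecessary in the typical case, which is why a missing entry is harder to overlook.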
This PR adds CPUID support for the `AVX-VNNI-INT8` and `AVX-VNNI-INT16` ISAs.
Design
The changes are made so that the two ISAs are enabled when `Avx10.2` is enabled or
This follows the discussion in API proposal #112586.
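A minimal sketch of how such gating could look, assuming an ISA is treated as available when either AVX10.2 is available or the ISA's own CPUID feature bit is set. That second condition is an assumption made for this example, and the `VnniIntSupport` type and `DetectVnniInt` function are hypothetical names, not code from this PR. The bit positions are the ones the Intel SDM documents for CPUID leaf 7, sub-leaf 1, register EDX.

```cpp
#include <cassert>
#include <cstdint>

// Feature bits in EDX of CPUID(EAX=7, ECX=1), per the Intel SDM.
constexpr uint32_t kAvxVnniInt8Bit  = 1u << 4;
constexpr uint32_t kAvxVnniInt16Bit = 1u << 10;

// Hypothetical result type for this sketch; not a JIT or VM type.
struct VnniIntSupport
{
    bool int8;
    bool int16;
};

// leaf7Sub1Edx: EDX value returned by CPUID(EAX=7, ECX=1).
// avx10v2:      whether AVX10.2 was detected (assumed to be computed elsewhere).
constexpr VnniIntSupport DetectVnniInt(uint32_t leaf7Sub1Edx, bool avx10v2)
{
    return VnniIntSupport{
        avx10v2 || (leaf7Sub1Edx & kAvxVnniInt8Bit) != 0,
        avx10v2 || (leaf7Sub1Edx & kAvxVnniInt16Bit) != 0,
    };
}

// Usage example: each call models one hardware configuration.
inline void ExampleUsage()
{
    // Only the INT8 bit is reported, and AVX10.2 is absent.
    assert(DetectVnniInt(kAvxVnniInt8Bit, false).int8);
    assert(!DetectVnniInt(kAvxVnniInt8Bit, false).int16);

    // AVX10.2 present: both ISAs are treated as available in this sketch.
    assert(DetectVnniInt(0u, true).int8);
    assert(DetectVnniInt(0u, true).int16);
}
```

Taking the raw register value as a parameter keeps the sketch independent of any particular compiler's CPUID intrinsic and makes the decision logic easy to test.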
Testing
Note 1: Emitter unit tests were not run, since they are added and verified along with the AVX10.2 PR #111209.
Note 2: SuperPMI results are not accurate, because adding a new CPUID leads to a new jiteeversionguid. Even after changing the jiteeversion manually, the SuperPMI run shows errors and failures based on the old mch files, which can be ignored.
Run JIT subtree with AVXVNNIINT* enabled / disabled
AVXVNNIINT* Enabled

AVXVNNIINT* Disabled
