[vanadis] Implement the missing Zicntr extension instructions. #2372
base: devel
Chapter 8 in https://github.com/riscv/riscv-isa-manual/releases/download/riscv-isa-release-f797123-2024-06-27/riscv-unprivileged.pdf details the RISC-V counter interface. There are three bespoke counters detailed in the Zicntr extension in 8.1, and up to 29 more user-programmable counters detailed in the Zihpm extension in 8.2. This commit provides space for all 32 counters in the register file, but currently implements only the three Zicntr counters.

The implementation consists of a few changes.

1. I have extended the register file structure with 32 64-bit counters, plus `increment` and `get` members to update and read those counters.
2. I have added a new instruction, `Zicntr::VanadisReadCounterInstruction`, to read those counters. This instruction can be tailored for any of the three `Zicntr`s (implemented as tag types in `inst/zicntr.h`), XLEN=64 or 32, and with or without the `[H]` extension.
3. I have added decoding to `decoder/vriscv64decoder.h` to handle `RDCYCLE`, `RDTIME`, and `RDINSTRET`. Because the `[H]` and XLEN=32 versions are only available in `riscv32`, the `riscv64` decoder simply injects a decoding failure if those occur.
4. I have extended the core's `tick` to update the three `Zicntr`s.
5. I have added a small misc test example to demonstrate usage.

I have not implemented any functionality for programming the 29 `Zihpm` counters, nor have I added any special decoding to read these registers.

I have decided that the `Zicntr` instructions will be processed by the arithmetic functional unit. I don't know if this is appropriate, but it should manage any register port contention properly and seemed like the most expedient solution.

I have removed `cycle_count` and `getCycleCount` from the decoder base class, as they are no longer being used. The current cycle is still passed to the decoder `tick()` even though it is _also_ still unused.
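For readers who want a concrete picture of the design described above, here is a minimal, self-contained sketch. It is not the code from this PR: `CounterFileSketch`, `readCounter`, `decodeCounterRV64`, and the tag aliases are hypothetical names chosen for illustration, and the slot assignment (0 = cycle, 1 = time, 2 = instret) is an assumption. The CSR numbers (`0xC00` to `0xC02`, with `0xC80` to `0xC82` for the `[H]` variants) come from the RISC-V specification.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Tag types carrying the counter slot in the type (hypothetical names;
// the slot numbering is an assumption made for this sketch).
using CycleTag   = std::integral_constant<unsigned, 0>;
using TimeTag    = std::integral_constant<unsigned, 1>;
using InstretTag = std::integral_constant<unsigned, 2>;

// 32 64-bit counters: three for Zicntr, the remaining 29 reserved for Zihpm.
class CounterFileSketch {
public:
    void increment(unsigned id, std::uint64_t by = 1) { counters_.at(id) += by; }
    std::uint64_t get(unsigned id) const { return counters_.at(id); }

private:
    std::array<std::uint64_t, 32> counters_{};  // zero-initialized
};

// Value an RDCYCLE/RDTIME/RDINSTRET-style read would produce.
// XLEN=64 returns the full counter; XLEN=32 returns the low half,
// or the high half when H is set.
template <std::size_t XLEN, bool H, unsigned id>
std::uint64_t readCounter(const CounterFileSketch& file,
                          std::integral_constant<unsigned, id>) {
    static_assert(XLEN == 32 || XLEN == 64, "XLEN must be 32 or 64");
    static_assert(!H || XLEN == 32, "[H] variants exist only for XLEN=32");
    const std::uint64_t value = file.get(id);
    return XLEN == 64 ? value : (H ? (value >> 32) : (value & 0xFFFFFFFFull));
}

// Map a CSR number to a counter slot the way an RV64 decoder might.
// Returns -1 for anything it rejects, including the RV32-only [H] CSRs.
inline int decodeCounterRV64(std::uint32_t csr) {
    switch (csr) {
    case 0xC00: return 0;   // cycle
    case 0xC01: return 1;   // time
    case 0xC02: return 2;   // instret
    default:    return -1;  // 0xC80..0xC82 ([H]) and everything else
    }
}
```

A core `tick` in this model would call `increment(0)` once per cycle and `increment(2)` once per retired instruction; how the time counter advances is left open here.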
Status Flag 'Pre-Test Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
Is there someone who can approve this workflow?
@ldalessa We'll get this reviewed. Our PR testing infrastructure is in the midst of upgrades and so test/merge will be delayed until it is back up.
I need to construct another test case for RV64 and do some more digging on the 'arbitrary' piece of the standard.
The biggest holdup is on the MIPS32 side since you've removed the cycle count. We need to see if there's a way to keep this.
Thanks for looking at this. I'm pretty sure that the mips decoder did nothing with the cycle count.
If you want to retain it in the vmipsdecoder I think it makes sense to make
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED by label AT: PRE-TEST INSPECTED! Autotester is Removing Label; this inspection will remain valid until a new commit to source branch is performed.
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.6_sst-elements
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.6_sst-elements_Make-Dist
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.6_sst-elements_MT-2
Test Name: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.6_sst-elements_MR-2
Test Name: SST__AutotestGen2_NewFW_OSX-14-XC15-ARM2_OMPI-4.1.6_PY3.10_sst-elements
Pull Request Author: ldalessa
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 4 Hrs. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Job: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.6_sst-elements
Job: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.6_sst-elements_Make-Dist
Job: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.6_sst-elements_MT-2
Job: SST__AutotestGen2_NewFW_sst-test_OMPI-4.1.4_PY3.6_sst-elements_MR-2
Job: SST__AutotestGen2_NewFW_OSX-14-XC15-ARM2_OMPI-4.1.6_PY3.10_sst-elements
@ldalessa This PR fails to compile in our tests.
./decoder/vriscv64decoder.h: In member function 'void SST::Vanadis::VanadisRISCV64Decoder::decode(SST::Output*, uint64_t, uint32_t, SST::Vanadis::VanadisInstructionBundle*)':
./decoder/vriscv64decoder.h:1112:132: error: class template argument deduction failed:
bundle->addInstruction( new VanadisReadCounterInstruction( CYCLE, ins_address, hw_thr, options, rd ) );
^
./decoder/vriscv64decoder.h:1112:132: error: no matching function for call to 'VanadisReadCounterInstruction()'
In file included from ./inst/vinstall.h:125,
from ./decoder/vriscv64decoder.h:20,
from decoder/vriscv64decoder.cc:19:
./inst/vzicntr_readcounter.h:20:13: note: candidate: 'template<unsigned int id, long unsigned int XLEN, bool H> VanadisReadCounterInstruction(std::integral_constant<unsigned int, id>, uint64_t, uint32_t, const SST::Vanadis::VanadisDecoderOptions*, uint16_t)-> SST::Vanadis::Zicntr::VanadisReadCounterInstruction<id, XLEN, H>'
VanadisReadCounterInstruction(
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./inst/vzicntr_readcounter.h:20:13: note: template argument deduction/substitution failed:
In file included from decoder/vriscv64decoder.cc:19:
./decoder/vriscv64decoder.h:1112:132: note: candidate expects 5 arguments, 0 provided
bundle->addInstruction( new VanadisReadCounterInstruction( CYCLE, ins_address, hw_thr, options, rd ) );
^
./decoder/vriscv64decoder.h:1117:131: error: class template argument deduction failed:
bundle->addInstruction( new VanadisReadCounterInstruction( TIME, ins_address, hw_thr, options, rd ) );
^
./decoder/vriscv64decoder.h:1117:131: error: no matching function for call to 'VanadisReadCounterInstruction()'
In file included from ./inst/vinstall.h:125,
from ./decoder/vriscv64decoder.h:20,
from decoder/vriscv64decoder.cc:19:
./inst/vzicntr_readcounter.h:20:13: note: candidate: 'template<unsigned int id, long unsigned int XLEN, bool H> VanadisReadCounterInstruction(std::integral_constant<unsigned int, id>, uint64_t, uint32_t, const SST::Vanadis::VanadisDecoderOptions*, uint16_t)-> SST::Vanadis::Zicntr::VanadisReadCounterInstruction<id, XLEN, H>'
VanadisReadCounterInstruction(
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./inst/vzicntr_readcounter.h:20:13: note: template argument deduction/substitution failed:
In file included from decoder/vriscv64decoder.cc:19:
./decoder/vriscv64decoder.h:1117:131: note: candidate expects 5 arguments, 0 provided
bundle->addInstruction( new VanadisReadCounterInstruction( TIME, ins_address, hw_thr, options, rd ) );
^
./decoder/vriscv64decoder.h:1122:134: error: class template argument deduction failed:
bundle->addInstruction( new VanadisReadCounterInstruction( INSTRET, ins_address, hw_thr, options, rd ) );
^
./decoder/vriscv64decoder.h:1122:134: error: no matching function for call to 'VanadisReadCounterInstruction()'
In file included from ./inst/vinstall.h:125,
from ./decoder/vriscv64decoder.h:20,
from decoder/vriscv64decoder.cc:19:
./inst/vzicntr_readcounter.h:20:13: note: candidate: 'template<unsigned int id, long unsigned int XLEN, bool H> VanadisReadCounterInstruction(std::integral_constant<unsigned int, id>, uint64_t, uint32_t, const SST::Vanadis::VanadisDecoderOptions*, uint16_t)-> SST::Vanadis::Zicntr::VanadisReadCounterInstruction<id, XLEN, H>'
VanadisReadCounterInstruction(
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./inst/vzicntr_readcounter.h:20:13: note: template argument deduction/substitution failed:
In file included from decoder/vriscv64decoder.cc:19:
./decoder/vriscv64decoder.h:1122:134: note: candidate expects 5 arguments, 0 provided
bundle->addInstruction( new VanadisReadCounterInstruction( INSTRET, ins_address, hw_thr, options, rd ) );
^
Yeah. I’ll take a look at it. It’s likely to be a simple-but-annoying fix. I didn’t realize there was support for such old toolchains.
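For context on what such a fix could look like, here is a rough sketch of one option. It assumes the failure comes from the older compiler's handling of class template argument deduction (CTAD) in the `new` expression, which the log suggests but does not prove. The idea is to move the deduction into an ordinary function template, which needs no CTAD support. `ReadCounterSketch` and `makeReadCounter` are hypothetical stand-ins, not the real `VanadisReadCounterInstruction` or any existing Vanadis helper, and the constructor is reduced to three arguments for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for the instruction template named in the log.
template <unsigned id, std::size_t XLEN = 64, bool H = false>
struct ReadCounterSketch {
    ReadCounterSketch(std::integral_constant<unsigned, id>,
                      std::uint64_t address, std::uint16_t dest)
        : addr(address), rd(dest) {}

    std::uint64_t addr;  // instruction address
    std::uint16_t rd;    // destination register
};

// Factory: `id` is deduced from the tag argument by plain function template
// deduction; XLEN and H fall back to defaults unless given explicitly.
template <std::size_t XLEN = 64, bool H = false, unsigned id>
std::unique_ptr<ReadCounterSketch<id, XLEN, H>>
makeReadCounter(std::integral_constant<unsigned, id> tag,
                std::uint64_t address, std::uint16_t dest) {
    using Inst = ReadCounterSketch<id, XLEN, H>;
    return std::unique_ptr<Inst>(new Inst(tag, address, dest));
}
```

A call such as `makeReadCounter(std::integral_constant<unsigned, 0>{}, ins_address, rd)` then yields the XLEN=64, non-`[H]` instantiation without relying on CTAD. Spelling the template arguments out at each call site, as in `new ReadCounterSketch<0, 64, false>(...)`, would be another way to get the same effect.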