-
Notifications
You must be signed in to change notification settings - Fork 49
[rocprofiler_sdk] SPM Review PR. #922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces Streaming Performance Monitor (SPM) support to the rocprofiler-sdk. SPM enables real-time monitoring of GPU performance counters during kernel execution with minimal overhead, providing an alternative to traditional counter collection methods.
- Adds experimental SPM APIs for agent-based and dispatch-based counter collection
- Implements SPM data structures, callbacks, and validation tests
- Integrates SPM with the rocprofiler-sdk-tool for CSV and JSON output
Reviewed Changes
Copilot reviewed 53 out of 54 changed files in this pull request and generated 6 comments.
Show a summary per file
File | Description |
---|---|
projects/rocprofiler-sdk/tests/spm/ | Validation test framework for SPM dispatch collection |
projects/rocprofiler-sdk/tests/rocprofv3/spm/ | rocprofv3 tool integration tests comparing SPM vs PMC |
projects/rocprofiler-sdk/source/lib/rocprofiler-sdk/spm/ | Core SPM implementation and service configuration |
projects/rocprofiler-sdk/source/include/rocprofiler-sdk/experimental/spm/ | Public API headers for SPM functionality |
projects/rocprofiler-sdk/source/lib/rocprofiler-sdk-tool/ | Tool integration for SPM data collection and output |
projects/rocprofiler-sdk/samples/spm_agent_counter_collection/ | Sample application demonstrating SPM agent monitoring |
Comments suppressed due to low confidence (2)
projects/rocprofiler-sdk/source/lib/rocprofiler-sdk/spm/spm_core.cpp:1
- malloc can return nullptr on failure, but the function always returns HSA_STATUS_SUCCESS. Should check for allocation failure and return appropriate error status.
// MIT License
projects/rocprofiler-sdk/source/lib/output/generateStats.cpp:1
- Duplicate function declaration. This appears to be a copy-paste error from the rocjpeg function above.
// MIT License
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
||
for record in counter_data: | ||
|
||
if record["agent_id"]["handle"] in agent_counter_map.keys(): |
Copilot
AI
Sep 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using in agent_counter_map
is more efficient than in agent_counter_map.keys()
as it avoids creating a temporary view of keys.
Copilot uses AI. Check for mistakes.
assert 1.0 * get_counter_value(counters, "SQ_INSTS_SALU") > get_counter_value( | ||
counters, "SQ_WAVES" | ||
) | ||
assert 1.0 * get_counter_value(counters, "SQ_INSTS_VALU") > get_counter_value( | ||
counters, "SQ_WAVES" | ||
) |
Copilot
AI
Sep 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The assertion will fail if get_counter_value returns None for any counter. Add null checks or ensure get_counter_value returns a default value of 0.
Copilot uses AI. Check for mistakes.
pmc_csv = pmc_csv[pmc_csv["Kernel_Name"].str.contains("matrixTranspose")] | ||
spm_csv = spm_csv[spm_csv["Dispatch_Id"] == pmc_csv["Dispatch_Id"].values[0]] |
Copilot
AI
Sep 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code assumes pmc_csv is not empty after filtering. If no kernels match 'matrixTranspose', accessing values[0] will raise an IndexError.
Copilot uses AI. Check for mistakes.
{ | ||
auto& counters = *reinterpret_cast<counter_vec*>(userdata); | ||
|
||
// aqlprofile currently reports shadder_engine -1 as global counters |
Copilot
AI
Sep 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misspelled 'shader' as 'shadder' in the comment.
Copilot uses AI. Check for mistakes.
"type": "number", | ||
"description": "timestamp of sample." | ||
} |
Copilot
AI
Sep 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Missing closing period consistency. Other description fields end with periods, but this one has a trailing space.
Copilot uses AI. Check for mistakes.
"required": [ | ||
"counter_id", | ||
"value", | ||
"timestamp", | ||
"agent_id" | ||
] |
Copilot
AI
Sep 10, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The required field 'agent_id' is defined outside the record properties. It should be moved inside the properties section or the schema structure needs to be corrected.
Copilot uses AI. Check for mistakes.
## Associated JIRA ticket number/Github issue number <!-- For example: "Closes #1234" or "Fixes SWDEV-123456" --> ## What type of PR is this? (check all applicable) - [ ] Refactor - [ ] Feature - [ ] Bug Fix - [ ] Optimization - [ ] Documentation Update - [ ] Continuous Integration ## What were the changes? <!-- Please give a short summary of the change. --> ## Why are these changes needed? <!-- Please explain the motivation behind the change and why this solves the given problem. --> ## Updated CHANGELOG? <!-- Needed for Release updates for a ROCm release. --> - [ ] Yes - [ ] No, Does not apply to this PR. ## Added/Updated documentation? - [ ] Yes - [ ] No, Does not apply to this PR. ## Additional Checks - [ ] I have added tests relevant to the introduced functionality, and the unit tests are passing locally. - [ ] Any dependent changes have been merged.
bf81ba5
to
20d080c
Compare
40b1c2c
to
acb0ba3
Compare
- Add experimental SPM API headers under experimental/spm/ - Add main experimental/spm.h include file - Update experimental CMakeLists.txt to include SPM headers - Remove spm.h These headers define the public interface for the SPM (System Performance Monitoring) feature.
- Add SPM core service implementation (spm_core.cpp/hpp) - Add SPM data decoding functionality (spm_decode.cpp/hpp) - Add SPM service interface (spm_service.cpp) - Add SPM dynamic loading support (spm_dlsym.cpp/hpp) - Add SPM CMakeLists.txt build configuration This implements the core SPM functionality for system performance monitoring.
- Add SPM agent counter collection sample with GPU kernel examples - Add SPM test infrastructure with validation scripts - Update CMakeLists.txt to include SPM samples and tests Provides example usage and validation for the SPM functionality.
- Add SPM to main SDK CMakeLists.txt build - Update registration.cpp to support SPM services - Add SPM context support in context.cpp/hpp Integrates SPM into the existing SDK architecture and service registration.
- Add SPM domain type support for output generation - Add SPM counter information handling - Update CSV and stats generation for SPM data - Add SPM metadata support Enables SPM data to be output in standard rocprofiler formats.
- Add SPM configuration options to tool config - Add SPM service integration to main tool - Update tool to handle SPM data collection Enables rocprofv3 tool to utilize SPM functionality for profiling.
- Add SPM enum string conversion support - Add SPM serialization support - Add SPM metrics integration - Update counter metrics for SPM - Add SPM hsa integration - Add SPM counters integration - Add SPM packet construct support Integrates SPM with SDK utility and serialization systems.
- Update rocprofv3 schema to include SPM options - Update rocprofv3.py script with SPM support Documents SPM feature in user-facing tools and schemas.
acb0ba3
to
27b0edb
Compare
const char* expression; ///< Counter expression (derived counters only) | ||
uint8_t is_constant : 1; ///< If this counter is HW constant | ||
uint8_t is_derived : 1; ///< If this counter is a derived counter | ||
uint8_t is_spm : 1; ///< If this counter supports SPM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uint8_t is_spm : 1; ///< If this counter supports SPM | |
uint8_t supports_spm : 1; ///< If this counter supports SPM |
is_spm
suggests that a counter is either SPM or PMC.
projects/rocprofiler-sdk/source/include/rocprofiler-sdk/experimental/spm/CMakeLists.txt
Outdated
Show resolved
Hide resolved
using spm_counter_collection_buffered_output_t = | ||
buffered_output<tool_spm_counter_record_t, domain_type::STREAMING_PERFORMANCE_MONITOR>; | ||
using spm_counter_records_buffered_output_t = | ||
::rocprofiler::tool::buffered_output<rocprofiler::tool::tool_spm_counter_value_t, | ||
domain_type::SPM_COUNTER_VALUES>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using spm_counter_collection_buffered_output_t = | |
buffered_output<tool_spm_counter_record_t, domain_type::STREAMING_PERFORMANCE_MONITOR>; | |
using spm_counter_records_buffered_output_t = | |
::rocprofiler::tool::buffered_output<rocprofiler::tool::tool_spm_counter_value_t, | |
domain_type::SPM_COUNTER_VALUES>; | |
using spm_counter_collection_buffered_output_t = | |
buffered_output<tool_spm_counter_record_t, domain_type::STREAMING_PERFORMANCE_MONITOR>; | |
using spm_counter_records_buffered_output_t = | |
buffered_output<tool_spm_counter_value_t, domain_type::SPM_COUNTER_VALUES>; |
using pc_sampling_host_trap_csv_encoder = csv_encoder<6>; | ||
using kernel_trace_with_stream_csv_encoder = csv_encoder<22>; | ||
using memory_copy_with_stream_csv_encoder = csv_encoder<8>; | ||
using spm_csv_encoder = csv_encoder<3>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should replicate counter_collection_csv_encoder
Aggregate PR that breaks down the existing single commit to something a bit more reviewable.