Skip to content

Conversation

bwelton
Copy link
Contributor

@bwelton bwelton commented Sep 10, 2025

Aggregate PR that breaks down the existing single commit to something a bit more reviewable.

@Copilot Copilot AI review requested due to automatic review settings September 10, 2025 01:25
@bwelton bwelton requested a review from a team as a code owner September 10, 2025 01:25
@github-actions github-actions bot added documentation Improvements or additions to documentation project: rocprofiler-sdk labels Sep 10, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR introduces Streaming Performance Monitor (SPM) support to the rocprofiler-sdk. SPM enables real-time monitoring of GPU performance counters during kernel execution with minimal overhead, providing an alternative to traditional counter collection methods.

  • Adds experimental SPM APIs for agent-based and dispatch-based counter collection
  • Implements SPM data structures, callbacks, and validation tests
  • Integrates SPM with the rocprofiler-sdk-tool for CSV and JSON output

Reviewed Changes

Copilot reviewed 53 out of 54 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
projects/rocprofiler-sdk/tests/spm/ Validation test framework for SPM dispatch collection
projects/rocprofiler-sdk/tests/rocprofv3/spm/ rocprofv3 tool integration tests comparing SPM vs PMC
projects/rocprofiler-sdk/source/lib/rocprofiler-sdk/spm/ Core SPM implementation and service configuration
projects/rocprofiler-sdk/source/include/rocprofiler-sdk/experimental/spm/ Public API headers for SPM functionality
projects/rocprofiler-sdk/source/lib/rocprofiler-sdk-tool/ Tool integration for SPM data collection and output
projects/rocprofiler-sdk/samples/spm_agent_counter_collection/ Sample application demonstrating SPM agent monitoring
Comments suppressed due to low confidence (2)

projects/rocprofiler-sdk/source/lib/rocprofiler-sdk/spm/spm_core.cpp:1

  • malloc can return nullptr on failure, but the function always returns HSA_STATUS_SUCCESS. Should check for allocation failure and return appropriate error status.
// MIT License

projects/rocprofiler-sdk/source/lib/output/generateStats.cpp:1

  • Duplicate function declaration. This appears to be a copy-paste error from the rocjpeg function above.
// MIT License

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.


for record in counter_data:

if record["agent_id"]["handle"] in agent_counter_map.keys():
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using in agent_counter_map is more efficient than in agent_counter_map.keys() as it avoids creating a temporary view of keys.

Copilot uses AI. Check for mistakes.

Comment on lines +91 to +96
assert 1.0 * get_counter_value(counters, "SQ_INSTS_SALU") > get_counter_value(
counters, "SQ_WAVES"
)
assert 1.0 * get_counter_value(counters, "SQ_INSTS_VALU") > get_counter_value(
counters, "SQ_WAVES"
)
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The assertion will fail if get_counter_value returns None for any counter. Add null checks or ensure get_counter_value returns a default value of 0.

Copilot uses AI. Check for mistakes.

Comment on lines +52 to +53
pmc_csv = pmc_csv[pmc_csv["Kernel_Name"].str.contains("matrixTranspose")]
spm_csv = spm_csv[spm_csv["Dispatch_Id"] == pmc_csv["Dispatch_Id"].values[0]]
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code assumes pmc_csv is not empty after filtering. If no kernels match 'matrixTranspose', accessing values[0] will raise an IndexError.

Copilot uses AI. Check for mistakes.

{
auto& counters = *reinterpret_cast<counter_vec*>(userdata);

// aqlprofile currently reports shadder_engine -1 as global counters
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misspelled 'shader' as 'shadder' in the comment.

Copilot uses AI. Check for mistakes.

Comment on lines +1193 to +1195
"type": "number",
"description": "timestamp of sample."
}
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Missing closing period consistency. Other description fields end with periods, but this one has a trailing space.

Copilot uses AI. Check for mistakes.

Comment on lines +1200 to +1205
"required": [
"counter_id",
"value",
"timestamp",
"agent_id"
]
Copy link

Copilot AI Sep 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The required field 'agent_id' is defined outside the record properties. It should be moved inside the properties section or the schema structure needs to be corrected.

Copilot uses AI. Check for mistakes.

systems-assistant bot pushed a commit that referenced this pull request Sep 11, 2025
## Associated JIRA ticket number/Github issue number
<!-- For example: "Closes #1234" or "Fixes SWDEV-123456" -->

## What type of PR is this? (check all applicable)

- [ ] Refactor
- [ ] Feature
- [ ] Bug Fix
- [ ] Optimization
- [ ] Documentation Update
- [ ] Continuous Integration

## What were the changes?

<!-- Please give a short summary of the change. -->

## Why are these changes needed?

<!-- Please explain the motivation behind the change and why this solves
the given problem. -->

## Updated CHANGELOG?

<!-- Needed for Release updates for a ROCm release. -->

- [ ] Yes
- [ ] No, Does not apply to this PR.

## Added/Updated documentation?

- [ ] Yes
- [ ] No, Does not apply to this PR.

## Additional Checks

- [ ] I have added tests relevant to the introduced functionality, and
the unit tests are passing locally.
- [ ] Any dependent changes have been merged.
@SrirakshaNag SrirakshaNag force-pushed the bewelton/spm_review branch 3 times, most recently from bf81ba5 to 20d080c Compare September 11, 2025 01:21
@bgopesh bgopesh mentioned this pull request Sep 16, 2025
1 task
@SrirakshaNag SrirakshaNag force-pushed the bewelton/spm_review branch 4 times, most recently from 40b1c2c to acb0ba3 Compare September 16, 2025 23:24
Benjamin Welton and others added 9 commits September 17, 2025 11:44
- Add experimental SPM API headers under experimental/spm/
- Add main experimental/spm.h include file
- Update experimental CMakeLists.txt to include SPM headers
- Remove spm.h

These headers define the public interface for the SPM (System Performance Monitoring) feature.
- Add SPM core service implementation (spm_core.cpp/hpp)
- Add SPM data decoding functionality (spm_decode.cpp/hpp)
- Add SPM service interface (spm_service.cpp)
- Add SPM dynamic loading support (spm_dlsym.cpp/hpp)
- Add SPM CMakeLists.txt build configuration

This implements the core SPM functionality for system performance monitoring.
- Add SPM agent counter collection sample with GPU kernel examples
- Add SPM test infrastructure with validation scripts
- Update CMakeLists.txt to include SPM samples and tests

Provides example usage and validation for the SPM functionality.
- Add SPM to main SDK CMakeLists.txt build
- Update registration.cpp to support SPM services
- Add SPM context support in context.cpp/hpp

Integrates SPM into the existing SDK architecture and service registration.
- Add SPM domain type support for output generation
- Add SPM counter information handling
- Update CSV and stats generation for SPM data
- Add SPM metadata support

Enables SPM data to be output in standard rocprofiler formats.
- Add SPM configuration options to tool config
- Add SPM service integration to main tool
- Update tool to handle SPM data collection

Enables rocprofv3 tool to utilize SPM functionality for profiling.
- Add SPM enum string conversion support
- Add SPM serialization support
- Add SPM metrics integration
- Update counter metrics for SPM
- Add SPM hsa integration
- Add SPM counters integration
- Add SPM packet construct support

Integrates SPM with SDK utility and serialization systems.
- Update rocprofv3 schema to include SPM options
- Update rocprofv3.py script with SPM support

Documents SPM feature in user-facing tools and schemas.
const char* expression; ///< Counter expression (derived counters only)
uint8_t is_constant : 1; ///< If this counter is HW constant
uint8_t is_derived : 1; ///< If this counter is a derived counter
uint8_t is_spm : 1; ///< If this counter supports SPM
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
uint8_t is_spm : 1; ///< If this counter supports SPM
uint8_t supports_spm : 1; ///< If this counter supports SPM

is_spm suggests that a counter is either SPM or PMC.

Comment on lines 175 to 179
using spm_counter_collection_buffered_output_t =
buffered_output<tool_spm_counter_record_t, domain_type::STREAMING_PERFORMANCE_MONITOR>;
using spm_counter_records_buffered_output_t =
::rocprofiler::tool::buffered_output<rocprofiler::tool::tool_spm_counter_value_t,
domain_type::SPM_COUNTER_VALUES>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
using spm_counter_collection_buffered_output_t =
buffered_output<tool_spm_counter_record_t, domain_type::STREAMING_PERFORMANCE_MONITOR>;
using spm_counter_records_buffered_output_t =
::rocprofiler::tool::buffered_output<rocprofiler::tool::tool_spm_counter_value_t,
domain_type::SPM_COUNTER_VALUES>;
using spm_counter_collection_buffered_output_t =
buffered_output<tool_spm_counter_record_t, domain_type::STREAMING_PERFORMANCE_MONITOR>;
using spm_counter_records_buffered_output_t =
buffered_output<tool_spm_counter_value_t, domain_type::SPM_COUNTER_VALUES>;

using pc_sampling_host_trap_csv_encoder = csv_encoder<6>;
using kernel_trace_with_stream_csv_encoder = csv_encoder<22>;
using memory_copy_with_stream_csv_encoder = csv_encoder<8>;
using spm_csv_encoder = csv_encoder<3>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should replicate counter_collection_csv_encoder

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants