Improve baseline comparison logic #817

vedithal-amd · 2025-07-17T18:41:07Z

rocprofiler-compute Pull Request

Related Issue

Closes #

What type of PR is this? (check all that apply)

Technical Details

- Do not force unsupported metrics to be specified as None in older GPU architectures
- Remove metrics which are explicitly set to None
- Update CHANGELOG
- Fix analysis configuration to fix baseline comparisons across all GPU architectures
  - Add missing 1812 section for gfx908
  - Add missing 1812 section for gfx90a
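
As a rough illustration of the "remove metrics which are explicitly set to None" item, the sketch below prunes such entries from a metric table. The dictionary layout, the metric names, and the helper `drop_none_metrics` are assumptions made for this example; they are not the actual rocprofiler-compute configuration schema or code.

```python
from typing import Any, Optional


def drop_none_metrics(metrics: dict[str, Optional[Any]]) -> dict[str, Any]:
    """Return a copy of `metrics` without entries explicitly set to None."""
    return {name: spec for name, spec in metrics.items() if spec is not None}


# Hypothetical metric table for an older architecture (illustrative only).
panel_metrics = {
    "Metric1": {"value": "AVG(counter_a)", "unit": "Cycles"},
    "Metric2": None,  # previously forced to None because it is unsupported
}

pruned = drop_none_metrics(panel_metrics)
print(sorted(pruned))  # ['Metric1']
```

With entries like `Metric2` simply absent, an architecture's config lists only what it supports instead of carrying placeholders.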

We cannot guarantee a common Metric ID index when comparing metrics across different GPU models.
The Metric ID is not very useful anyway; this is how the report looks with and without Metric ID:

With Metric ID

Without Metric ID

Have you added or updated tests to validate functionality?

Yes
No - does not apply to this PR
Validated baseline comparison working across all gpu architectures

Added / Updated documentation?

Yes
No - does not apply to this PR

Have you updated CHANGELOG?

Yes
No - does not apply to this PR

feizheng10

Please separate the gfx906 deprecation out of this PR.
Keeping block IDs is important. When we talk about "baseline comparison", the 1st run is used as the baseline reference, which implies that all the info from the 1st run is the baseline, including block IDs.

feizheng10 · 2025-07-17T19:07:31Z

And just for internal maintenance/design purposes, we have meta info in the comments (temporarily) explaining why the metrics are not available. It would be better to provide a solution when removing those.

vedithal-amd · 2025-07-17T19:41:22Z

> 1st run is used as the baseline reference, which implies that all the info from the 1st run is the baseline, including block IDs.

My concern is: does it make sense to compare a metric when it's only available in one run?
Let's say the 1st run has Metric1 = 10 and Metric2 = 10, and the 2nd run does not have Metric1 but has Metric2 = 5.
In that case, does it make sense to compare only Metric2, which is common to both runs?
Why would the user want to see Metric1 in a baseline comparison when it is not present in run 2? Why can't the user just analyze run 1 to see Metric1?
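
The example above can be sketched in a few lines of Python. This is only an illustration of comparing the metrics common to both runs; the function name `compare_common_metrics` and the plain-dictionary representation of a run are assumptions, not the tool's actual implementation.

```python
def compare_common_metrics(baseline: dict[str, float], current: dict[str, float]):
    """Compare only the metrics present in both runs.

    Returns {metric name: (baseline value, current value, difference)}.
    Metrics that exist in just one run are skipped.
    """
    return {
        name: (baseline[name], current[name], current[name] - baseline[name])
        for name in baseline
        if name in current
    }


run1 = {"Metric1": 10, "Metric2": 10}  # 1st run (baseline)
run2 = {"Metric2": 5}                  # 2nd run, Metric1 not available

print(compare_common_metrics(run1, run2))  # {'Metric2': (10, 5, -5)}
```

Metric1 is left out of the comparison here, but it remains visible when run 1 is analyzed on its own.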

> Please separate the gfx906 deprecation out of this PR.

OK

> And just for internal maintenance/design purposes, we have meta info in the comments (temporarily) explaining why the metrics are not available. It would be better to provide a solution when removing those.

We can backport the metrics from the MI350 config to older GPUs in a separate PR when someone asks for it.
There is no point in keeping None metrics when you can look up the metric information in the newer MI350 GPU config.

vedithal-amd · 2025-07-17T19:49:37Z

> Please separate the gfx906 deprecation out of this PR.

This will be tracked in #819

vedithal-amd · 2025-07-17T19:54:55Z

> Keeping block IDs is important.

There is no such thing as block IDs; I think you are referring to Metric ID, and I don't think it adds any value to the analysis report. Metric ID will still be shown in non-baseline comparison mode. Only in baseline comparison mode can it not be shown, because the metrics differ between runs.

vedithal-amd · 2025-07-18T16:12:20Z

As discussed on the call, we can add the Metric ID from the first workload during baseline comparison.
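
A minimal sketch of that idea follows, assuming each workload is represented as a mapping from metric name to a `(metric_id, value)` pair. The IDs, the data layout, and the helper `baseline_rows` are hypothetical, and restricting the rows to metrics present in both workloads is also an assumption rather than something settled in this thread.

```python
def baseline_rows(first: dict, second: dict) -> list[tuple]:
    """Build comparison rows keyed by metric name.

    Each row is (metric ID from the first workload, name,
    first workload value, second workload value).
    """
    rows = []
    for name, (metric_id, base_value) in first.items():
        if name not in second:
            continue  # assumption: only metrics common to both are compared
        _, other_value = second[name]  # the second workload's own ID is ignored
        rows.append((metric_id, name, base_value, other_value))
    return rows


# Hypothetical IDs: the same metric may sit at a different index per GPU model.
first_workload = {"Metric1": ("1.0", 10), "Metric2": ("1.1", 10)}
second_workload = {"Metric2": ("3.4", 5)}

print(baseline_rows(first_workload, second_workload))
# [('1.1', 'Metric2', 10, 5)]
```

Matching on metric name and displaying the first workload's ID avoids relying on a common Metric ID index across GPU models.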

vedithal-amd requested review from prbasyal-amd, a team, coleramos425, feizheng10, xuchen-amd, cfallows-amd, ywang103-amd and jamessiddeley-amd as code owners July 17, 2025 18:41

vedithal-amd force-pushed the improve-baseline branch from 6b94a8f to d061c1c Compare July 17, 2025 18:41

feizheng10 requested changes Jul 17, 2025

vedithal-amd force-pushed the improve-baseline branch from d061c1c to 60b2879 Compare July 17, 2025 19:52

vedithal-amd requested a review from feizheng10 July 17, 2025 19:56
