This repo contains a test that demonstrates a problem where not all of the expected
waves show up in rocgdb. I am not sure whether this is a problem with rocgdb or a bug
in the synchronization function (wait_for_all_threads) in the HIP program.
The HIP program multi-wave.hip creates a dispatch with 8 blocks, each of which has
2 waves. The wait_for_all_threads function is intended to synchronize across all the
blocks in the dispatch, holding every thread until all the waves have been launched. Once all waves are launched, the threads are released and hit a breakpoint in the GPU kernel.
When the breakpoint is hit, we get the current set of threads, each of which represents one wave, and exit with an error if the number of threads is not equal to 16.
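
The expected count of 16 follows from the launch configuration: 8 blocks of 120 threads with a wave size of 64 give 2 waves per block, the second one partially filled. The sketch below only reproduces that arithmetic on the host. The helper names `threads_per_wave` and `expected_waves` are hypothetical and are not taken from multi-wave.hip.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not from multi-wave.hip): split the threads of one
// block into waves. Every wave is full except possibly the last one.
std::vector<int> threads_per_wave(int threads_per_block, int wave_size) {
    assert(threads_per_block > 0 && wave_size > 0);
    std::vector<int> waves;
    for (int remaining = threads_per_block; remaining > 0; remaining -= wave_size) {
        waves.push_back(remaining < wave_size ? remaining : wave_size);
    }
    return waves;
}

// Hypothetical helper: number of waves, and therefore debugger threads,
// expected for the whole dispatch.
int expected_waves(int blocks, int threads_per_block, int wave_size) {
    return blocks * static_cast<int>(threads_per_wave(threads_per_block, wave_size).size());
}
```

With the values from the test configuration, `threads_per_wave(120, 64)` yields the waves `[64, 56]` and `expected_waves(8, 120, 64)` yields 16, which is the thread count the test checks for at the breakpoint.
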
- Adjust the `hipcc` variable in `build.ninja` to point to your hipcc binary.
- Adjust the `ROCGDB` variable in `stress.sh` to point to your rocgdb binary.
$ ninja
$ ./stress.sh 100
Run #1:
===================TEST CONFIGURATION===================
Running on device: AMD Instinct MI300X
Wave size: 64
Total number of threads: 960
Total number of blocks: 8
Total number of waves: 16
Threads per block: 120
Waves per block: 2
Threads per wave: [64, 56]
===================TEST CONFIGURATION===================
Found 16 threads
...
Run #12:
...
Found 2 threads
Command failed with exit code 1. Stopping.