Description
Describe the bug
I tried to profile a script that contains a main method like this:
if __name__ == '__main__':
    num_processes = 8
    torch.multiprocessing.spawn(test_loop, args=(num_processes,), nprocs=num_processes)
Command:
/opt/rocm/bin/rocprof-compute profile -n perf_data -- python3 ./test.py
Error:
The profiler crashes after one iteration with the message "An instance of rocprof is already running".
Workaround (single-rank hack) for now:
# Launch following on RANK 0
/opt/rocm/bin/rocprof-compute profile -n perf_data -- python3 ./test.py
# Keep closing and relaunching the following for the remaining launches, once per iteration of /opt/rocm/bin/rocprof-compute
python3 ./test.py
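For reference, the single-rank hack can be sketched as a small launcher helper that wraps only rank 0 in the profiler and leaves every other launch as plain python3. This is a rough sketch, not something rocprof-compute provides: the helper name build_command and the RANK environment variable are assumptions for illustration, and the script only prints the command it would run. It does not automate relaunching the non-profiled launches for each profiler iteration.

```python
import os
import shlex

# Profiler path and workload are taken from the commands in this report.
PROFILER = "/opt/rocm/bin/rocprof-compute"
WORKLOAD = ["python3", "./test.py"]


def build_command(rank, run_name="perf_data"):
    """Return the argv for one launch: profiled on rank 0, plain otherwise.

    build_command is a hypothetical helper, not part of rocprof-compute.
    """
    if rank == 0:
        return [PROFILER, "profile", "-n", run_name, "--", *WORKLOAD]
    return list(WORKLOAD)


if __name__ == "__main__":
    # Assumption: whatever starts the launches exports RANK; default to 0.
    rank = int(os.environ.get("RANK", "0"))
    print(shlex.join(build_command(rank)))
```

With RANK unset or 0 this prints the profiled command from above; with any other value it prints the plain python3 ./test.py launch.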
Linux Distribution
Ubuntu 22.04
ROCm Compute Profiler Version
3.0.0
GPU
MI300X
ROCm Version
No response
Cluster name (if applicable)
No response
Reproducer
A script that contains a main method like this:
if __name__ == '__main__':
    num_processes = 8
    torch.multiprocessing.spawn(test_loop, args=(num_processes,), nprocs=num_processes)
Command:
/opt/rocm/bin/rocprof-compute profile -n perf_data -- python3 ./test.py
Expected behavior
No response
Relevant log output
Screenshots
No response
Additional Context
No response