@wangye805

Motivation

The MHA API refactoring reintroduced AITER_ASM_DIR, which had been removed in #1597. This can bring back the issues tracked in https://ontrack-internal.amd.com/browse/SWDEV-565755 for the future TE + aiter MHA backend.

Technical Details

Add the compile-time envs AITER_USE_AITER_ASM_DIR and AITER_REL_PATH_LIB_TO_HSA so that hsaco files are looked up via a pre-agreed path relative to the compiled lib*.so. Downstream libraries like TE/jax-aiter then no longer need to set AITER_ASM_DIR.

AITER_USE_AITER_ASM_DIR: defaults to 1, which keeps the original flow, where the user still needs to set the AITER_ASM_DIR env. If set to 0, compilation follows the proposed flow.
AITER_REL_PATH_LIB_TO_HSA: defaults to "./", which means the aiter lib*.so files (libmha_fwd/bwd.so) must be in the same directory as the hsa dir.
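
For reviewers, here is a rough Python sketch of the lookup the two envs are meant to select between. It is not the code in this PR: the real switch happens at compile time in the C++ sources. The function name `resolve_hsaco_path`, its parameters, and the example library path `./libmha_bwd.so` are hypothetical, and how the library finds its own location is deliberately left out. The directory layout (`hsa/<arch>/<kernel file>`) is taken from the hipModuleLoad paths in the test logs.

```python
import os
import posixpath


def resolve_hsaco_path(lib_path, arch, kernel_rel_path,
                       use_asm_dir=True, rel_path_lib_to_hsa="./", env=None):
    """Return the path a kernel .co file would be loaded from.

    Hypothetical helper, for illustration only.
    use_asm_dir mirrors AITER_USE_AITER_ASM_DIR (1 -> True, 0 -> False).
    rel_path_lib_to_hsa mirrors AITER_REL_PATH_LIB_TO_HSA.
    """
    env = os.environ if env is None else env
    if use_asm_dir:
        # Original flow: the user points AITER_ASM_DIR at hsa/<arch>/.
        base = env.get("AITER_ASM_DIR")
        if base is None:
            raise RuntimeError("AITER_ASM_DIR must be set in the original flow")
        return posixpath.join(base, kernel_rel_path)
    # Proposed flow: start from the directory holding the compiled lib*.so
    # and follow the pre-agreed relative path to the directory containing hsa/.
    lib_dir = posixpath.dirname(lib_path) or "."
    return posixpath.join(lib_dir, rel_path_lib_to_hsa, "hsa", arch, kernel_rel_path)


if __name__ == "__main__":
    print(resolve_hsaco_path("./libmha_bwd.so", "gfx942",
                             "fmha_v3_bwd/bwd_hd128_odo_fp16.co",
                             use_asm_dir=False,
                             rel_path_lib_to_hsa="../../../"))
```

With the default "./", the sketch resolves to `<lib dir>/./hsa/<arch>/...`, which is why the lib*.so files have to sit next to the hsa dir unless AITER_REL_PATH_LIB_TO_HSA is overridden at build time.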

Test Plan

Run the aiter op test without setting AITER_ASM_DIR.

Test Result

Passed

Old flow not changed:

root@ctr-cx64-mi300x-13:/workspace/aiter_main/op_tests/cpp/mha# bash build_mha.sh bwd_v3
######## building mha kernel bwd_v3
aiter is not installed.
######## linking mha bwd
root@ctr-cx64-mi300x-13:/workspace/aiter_main/op_tests/cpp/mha# AITER_ASM_DIR=/workspace/aiter_main/hsa/gfx942/ LD_LIBRARY_PATH=./ ./bwd.exe -bwd_v3=1 -v=0
[fp16|batch|bhsd] b:2, h:8/8, s:3328/3328, d:128/128, scale:0.0883883, bias:n, dbias:0, p_drop:0, s_randval:0, deterministic:0, mask:n
[aiter] hipModuleLoad: /workspace/aiter_main/hsa/gfx942/fmha_v3_bwd/bwd_hd128_odo_fp16.co GetFunction: _ZN5aiter23fmha_bwd_hd128_odo_fp16E Success
[aiter] hipModuleLoad: /workspace/aiter_main/hsa/gfx942/fmha_v3_bwd/bwd_hd128_fp16_a32_psskddv.co GetFunction: _ZN5aiter31fmha_bwd_hd128_fp16_a32_psskddvE Success
[aiter] hipModuleLoad: /workspace/aiter_main/hsa/gfx942/fmha_v3_bwd/bwd_hd128_dq_convert_fp16.co GetFunction: _ZN5aiter30fmha_bwd_hd128_dq_convert_fp16E Success
, 0.541 ms, 419.34 TFlops, 202.00 GB/s

New flow works fine:

root@ctr-cx64-mi300x-13:/workspace/aiter_main/op_tests/cpp/mha# AITER_USE_AITER_ASM_DIR=0 AITER_REL_PATH_LIB_TO_HSA="../../../" bash build_mha.sh bwd_v3
######## building mha kernel bwd_v3
aiter is not installed.
######## linking mha bwd
root@ctr-cx64-mi300x-13:/workspace/aiter_main/op_tests/cpp/mha# LD_LIBRARY_PATH=./ ./bwd.exe -bwd_v3=1 -v=0
[fp16|batch|bhsd] b:2, h:8/8, s:3328/3328, d:128/128, scale:0.0883883, bias:n, dbias:0, p_drop:0, s_randval:0, deterministic:0, mask:n
[aiter] hipModuleLoad: ./../../../hsa/gfx942/fmha_v3_bwd/bwd_hd128_odo_fp16.co GetFunction: _ZN5aiter23fmha_bwd_hd128_odo_fp16E Success
[aiter] hipModuleLoad: ./../../../hsa/gfx942/fmha_v3_bwd/bwd_hd128_fp16_a32_psskddv.co GetFunction: _ZN5aiter31fmha_bwd_hd128_fp16_a32_psskddvE Success
[aiter] hipModuleLoad: ./../../../hsa/gfx942/fmha_v3_bwd/bwd_hd128_dq_convert_fp16.co GetFunction: _ZN5aiter30fmha_bwd_hd128_dq_convert_fp16E Success
, 0.534 ms, 424.71 TFlops, 204.58 GB/s
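
If this check were to be automated, one option is to scan the `[aiter] hipModuleLoad:` lines and confirm that every load succeeded and came from a relative path. The sketch below assumes the log line format shown above stays as is. The helper names `parse_module_loads` and `all_loaded_from_relative_path` are hypothetical and not part of aiter.

```python
import re

# Matches lines of the form seen in the logs above:
# [aiter] hipModuleLoad: <path> GetFunction: <symbol> <status>
LOAD_RE = re.compile(
    r"^\[aiter\] hipModuleLoad: (?P<path>\S+) "
    r"GetFunction: (?P<func>\S+) (?P<status>\S+)$"
)


def parse_module_loads(log_text):
    """Return (path, symbol, status) for each hipModuleLoad line in the log."""
    loads = []
    for line in log_text.splitlines():
        match = LOAD_RE.match(line.strip())
        if match:
            loads.append((match.group("path"), match.group("func"),
                          match.group("status")))
    return loads


def all_loaded_from_relative_path(log_text):
    """True if at least one kernel was loaded, all loads report Success,
    and none of the paths is absolute."""
    loads = parse_module_loads(log_text)
    return bool(loads) and all(
        status == "Success" and not path.startswith("/")
        for path, _, status in loads
    )
```

Applied to the two logs in this description, the old-flow output would fail the relative-path check (its paths start with /workspace/...), while the new-flow output would pass.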

Submission Checklist

wangye805 requested review from a team, minmengdie, slippedJim and valarLip on January 12, 2026 04:55
wangye805 force-pushed the yewang12/rel_path_lib_to_hsa branch from d4c8701 to 4a40109 on January 12, 2026 05:01
@yuguo68
Contributor

yuguo68 commented Jan 17, 2026

ah, ye ge? great to see you here.

@yuguo68
Contributor

yuguo68 commented Jan 17, 2026

I am updating aiter in PyTorch and also need a way to avoid specifying the env var AITER_ASM_DIR. I made a proposal in #1862.
